Commit Graph

485 Commits

Author SHA1 Message Date
yuhui
8eb4e48521 !1608 Add ST and UT for the gemma model series
Merge pull request !1608 from yuhui/gemma2_ut
2024-09-11 08:08:23 +00:00
shengjy
23798b49ca !1638 update Mixtral 8x22b evaluation on mmlu
Merge pull request !1638 from shengjy/master
2024-09-10 12:28:14 +00:00
glhyy
1f4ab545cc !1597 Add weight-conversion UT template and mixtral test cases; support conversion between legacy and mcore
Merge pull request !1597 from glhyy/master
2024-09-10 07:40:15 +00:00
LeiZhenzhen
4c1123e5fb !1642 mod partial_rope to glm_rope
Merge pull request !1642 from LeiZhenzhen/master
2024-09-10 06:14:35 +00:00
RuanZhiXiang
b3142dd5e5 !1639 switch mindspeed dependency into e6ea2117
Merge pull request !1639 from RuanZhiXiang/solid-commid-id-e6ea2117
2024-09-10 03:50:35 +00:00
sunjunjie
58e8133311 !1622 Reorganize weight-conversion code location & fix reverse dependency
Merge pull request !1622 from sunjunjie/ckpt_position
2024-09-09 06:37:36 +00:00
丁子叉
de65e81113 !1636 [mcore-llm] Optimize training performance for deepseek2-like models
Merge pull request !1636 from 丁子叉/deepseek2-fa
2024-09-09 06:31:52 +00:00
丁子叉
579b4ed2e5 !1632 [mcore-llm] Add README for deepseek2-like models
Merge pull request !1632 from 丁子叉/master
2024-09-09 06:29:34 +00:00
shenjiarun
a4b59e34e4 !1623 Add chatglm3 8B mcore full-parameter fine-tuning loss-alignment script and modify Llama2 7B inference and evaluation scripts
Merge pull request !1623 from shenjiarun/master
2024-09-09 02:32:00 +00:00
RuanZhiXiang
959a537602 !1635 fix: resolve chatglm3 64K precision issue
Merge pull request !1635 from RuanZhiXiang/fix-chatglm3
2024-09-09 00:57:14 +00:00
yuhui
587eaf005b !1631 Correct glm4 parameters
Merge pull request !1631 from yuhui/glm4_fix
2024-09-07 01:25:24 +00:00
闻江
edc8ed5181 !1628 Adapt MiniCPM-2B/MiniCPM-MoE-8x2B
Merge pull request !1628 from 闻江/master
2024-09-06 13:35:28 +00:00
yuhui
9c6a1b52b5 !1626 Support mcore-to-hf conversion for the gemma2 model
Merge pull request !1626 from yuhui/weight_convert
2024-09-06 03:11:43 +00:00
丁子叉
7857e68009 !1609 [mcore-llm] Add multi-node pretraining launch script for deepseek2-like models
Merge pull request !1609 from 丁子叉/deepseek2-sh
2024-09-06 01:02:15 +00:00
yuhui
6827ff2bb1 !1619 Add glm4 model adaptation
Merge pull request !1619 from yuhui/glm4
2024-09-05 13:05:30 +00:00
Peihan Liu
a8186f30f3 !1592 Support moe-GEMM fused operator
Merge pull request !1592 from Peihan Liu/master
2024-09-05 09:32:24 +00:00
guozhihua
9d773d2a3d !1612 Change grok to a 4-node configuration
Merge pull request !1612 from guozhihua/master
2024-09-05 03:24:20 +00:00
shengjy
8d88f91af8 !1613 Fix weight loading in Mixtral pretraining
Merge pull request !1613 from shengjy/master
2024-09-05 02:09:37 +00:00
xiongliangcheng
0843e925a2 !1611 [Mcore] Adapt yi-34B and codellama34B to Mcore
Merge pull request !1611 from xiongliangcheng/q3
2024-09-05 01:11:02 +00:00
丁子叉
deea5209a9 !1605 [mcore-llm] Add hf2mg and mg2hf weight conversion and pretraining dataset processing for deepseekv2-like models
Merge pull request !1605 from 丁子叉/deepseek2-ckpt_convert
2024-09-05 01:00:05 +00:00
AresLzk
8d0cb96404 !1590 Adapt new model CodeQwen1.5-7B to the mcore branch
Merge pull request !1590 from AresLzk/master
2024-09-04 10:50:44 +00:00
shenjiarun
2f2e086708 !1586 Add Llama2 7B full-parameter fine-tuning loss-alignment script
Merge pull request !1586 from shenjiarun/master
2024-09-04 06:45:50 +00:00
shengjy
c305fe4e85 !1601 Add gpt4 moe dropless
Merge pull request !1601 from shengjy/master
2024-09-04 01:13:13 +00:00
songyuan-knighto7o4
dd917d652a !1607 [Tianyi Cloud requirement] Adapt new model internlm2 to mcore
Merge pull request !1607 from songyuan-knighto7o4/master
2024-09-04 01:10:11 +00:00
guozhihua
30022c5c2a !1604 Add mixtral_8*7b inference and evaluation
Merge pull request !1604 from guozhihua/master
2024-09-04 01:05:21 +00:00
wucong
bf6f398764 !1594 Add Llama2 fine-tuning on Mcore
Merge pull request !1594 from wucong/addMCorella2
2024-09-03 07:53:34 +00:00
LeiZhenzhen
01b71a2a2e !1580 add gpt4 moe drop
Merge pull request !1580 from LeiZhenzhen/master
2024-09-03 06:15:24 +00:00
yuhui
705158ab41 !1568 Add adaptation for the gemma2 model series
Merge pull request !1568 from yuhui/gemma2
2024-09-03 01:57:47 +00:00
guozhihua
a647db6449 !1589 Add full-parameter llama31 200b and 405b
Merge pull request !1589 from guozhihua/master
2024-09-03 01:04:17 +00:00
shengjy
cf19b46add !1555 Add Mixtral 8x22B pretraining, inference, and evaluation
Merge pull request !1555 from shengjy/master
2024-09-02 03:43:11 +00:00
商元义
9f8c96894d !1579 Add Qwen2-7B adaptation
Merge pull request !1579 from 商元义/qwen2-7b
2024-08-31 06:13:56 +00:00
changlei
4d00313fbd !1582 Add Qwen2-0.5B model
Merge pull request !1582 from changlei/master
2024-08-31 01:58:39 +00:00
丁子叉
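cee50d023e !1578 [mcore-llm] Performance optimization for deepseekv2-like models: local splitting of the large-vocabulary mm in the no-tp scenario, and support for specifying softmax_scale in the mla scenario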
Merge pull request !1578 from 丁子叉/master
2024-08-31 01:13:32 +00:00
RuanZhiXiang
143128b9e3 !1569 feat: upgrade mindspeed to core060
Merge pull request !1569 from RuanZhiXiang/core060-update
2024-08-30 06:24:06 +00:00
RuanZhiXiang
86ff5d73ed !1571 Del: remove moe legacy and its system tests
Merge pull request !1571 from RuanZhiXiang/moe-legacy-fix
2024-08-30 06:21:07 +00:00
changlei
cd878c44cb !1474 Adapt Qwen2-1.5B model
Merge pull request !1474 from changlei/master
2024-08-30 06:14:52 +00:00
sunjunjie
d8caa66dd1 !1573 New weight-conversion framework supports conversion between legacy and huggingface for qwen and gemma; remove old qwen and gemma code
Merge pull request !1573 from sunjunjie/master_ckpt
2024-08-30 03:07:41 +00:00
glhyy
476ded1c12 !1566 New weight-conversion framework supports bloom; remove old bloom, chatglm3, and mixtral code
Merge pull request !1566 from glhyy/master
2024-08-29 11:51:32 +00:00
RuanZhiXiang
11a7ccabbd !1576 fix: chatglm3 arguments
Merge pull request !1576 from RuanZhiXiang/load-chatglm
2024-08-29 11:24:31 +00:00
商元义
431f3aa0f4 !1543 Add Qwen1.5-110B adaptation
Merge pull request !1543 from 商元义/qwen15-110b
2024-08-29 00:55:47 +00:00
yuhui
696d907f35 !1570 gemma-2b mcore adaptation
Merge pull request !1570 from yuhui/gemma_core
2024-08-28 07:58:32 +00:00
yuhui
7bb74c79c2 !1539 gemma-7b mcore adaptation
Merge pull request !1539 from yuhui/gemma_core
2024-08-28 01:38:03 +00:00
AresLzk
238d120a6f !1477 Adapt new model Qwen2_72B to the mcore branch
Merge pull request !1477 from AresLzk/master
2024-08-27 12:19:55 +00:00
DONGHAORAN
a08bb1cd12 !1553 Improve example readme
Merge pull request !1553 from DONGHAORAN/dev
2024-08-23 07:39:24 +00:00
yuhui
216b3ceeac !1532 Add llama3.1-70b model
Merge pull request !1532 from yuhui/llama31
2024-08-23 02:33:55 +00:00
LeiZhenzhen
6f6b00ad74 !1525 chatglm3 ckpt hf2mcore
Merge pull request !1525 from LeiZhenzhen/master
2024-08-22 10:07:56 +00:00
sunjunjie
603b1a84ec !1536 Improve example/README; consistently use llama2-7b as the example
Merge pull request !1536 from sunjunjie/master
2024-08-22 03:51:34 +00:00
wucong
14851ae631 !1542 Add logic to filter out malformed data
Merge pull request !1542 from wucong/fixData
2024-08-22 03:25:25 +00:00
丁子叉
416a473c83 !1541 [mcore-llm] Add pretraining for deepseekv2-lite-like models
Merge pull request !1541 from 丁子叉/master
2024-08-22 02:21:33 +00:00
wucong
99f66458a2 !1478 Fix fine-tuning differences from Llamafactory
Merge pull request !1478 from wucong/fituneFix
2024-08-20 02:43:08 +00:00
zhangjianxiang
ccbe74f1d2 !1517 Add bk_origin_23 version to the maintained-versions section
Merge pull request !1517 from zhangjianxiang/master
2024-08-19 11:34:30 +00:00
glhyy
03f23d1ce3 !1487 Speed up data preprocessing
Merge pull request !1487 from glhyy/master
2024-08-16 08:13:07 +00:00
wucong
6fad9bd24b !1507 Update fine-tuning readme
Merge pull request !1507 from wucong/modifyread
2024-08-14 04:51:57 +00:00
wucong
652574b6c4 !1455 Adapt inference for prompt-type
Merge pull request !1455 from wucong/addgen
2024-08-13 06:53:26 +00:00
sunjunjie
e060f047d3 !1489 Add script to merge llama2-7b legacy lora weights with the original weights
Merge pull request !1489 from sunjunjie/master
2024-08-12 07:00:05 +00:00
wucong
bad362b7cc !1488 Add readme for fine-tuning data preprocessing
Merge pull request !1488 from wucong/addreadme
2024-08-09 11:27:59 +00:00
zhangjianxiang
afe8196a55 !1444 Adapt online inference to FA and KV_Cache
Merge pull request !1444 from zhangjianxiang/KV_CACHE_IFA_PFA
2024-08-08 01:20:29 +00:00
ningbenzhe1
187ba09058 !1480 Upload llama3.1 405b scripts
Merge pull request !1480 from ningbenzhe1/master
2024-08-07 12:20:29 +00:00
fengliangjun
7537c0ed5e
update examples/README.md.
bugfix

Signed-off-by: fengliangjun <fengliangjun@huawei.com>
2024-08-07 12:13:29 +00:00
glhyy
b144a73859 !1473 Add dataset merging feature (supports pretraining and fine-tuning)
Merge pull request !1473 from glhyy/master
2024-08-07 12:12:23 +00:00
丁子叉
2531205c0f !1476 [mcore-LLM large model] Add deepseekv2-like model: support MLA, YaRN, and DeepSeekMoE model structures
Merge pull request !1476 from 丁子叉/master
2024-08-06 09:35:36 +00:00
代维华
8be9648acf !1459 Add llama3.1 8B model
Merge pull request !1459 from 代维华/master
2024-08-05 09:34:31 +00:00
shengjy
e79fc1081f !1452 adapt llama2 to mcore
Merge pull request !1452 from shengjy/master
2024-08-02 01:09:46 +00:00
ningbenzhe1
a780cb5953 !1451 Support Mistral-7B using the mcore r0.6.0 structure
Merge pull request !1451 from ningbenzhe1/master
2024-08-02 01:05:38 +00:00
WangYu
34d2bbb412 !1460 Support chatglm3-6B using the mcore r0.6.0 structure
Merge pull request !1460 from WangYu/chatglm3
2024-08-01 10:52:21 +00:00
fengliangjun
e1550bf4e9
rename examples/llama3/ckpt_convert_llama3_8b_legacy2hf.sh
Rename

Signed-off-by: fengliangjun <fengliangjun@huawei.com>
2024-07-29 12:17:49 +00:00
fengliangjun
7ef7176884
rename examples/llama3/ckpt_convert_llama3_8b_hf2legacy.sh
Remove parameters

Signed-off-by: fengliangjun <fengliangjun@huawei.com>
2024-07-29 12:17:22 +00:00
fengliangjun
9252dc9136 !1462 Delete model readmes and update the unified readme
* update readme
2024-07-29 12:13:08 +00:00
guozhihua
137becb3b7 !1453 Modify grok1 two-node configuration
Merge pull request !1453 from guozhihua/gzh_grok1
2024-07-25 07:02:28 +00:00
fengliangjun
d5046f3262 !1448 Remove redundant content
* remove redundancy
2024-07-23 09:54:53 +00:00
guozhihua
917598bd9c !1447 Add mixtral_mcore, add mc2 for mcore, and optimize the dispatcher's affinity-oriented argsort operation
Merge pull request !1447 from guozhihua/gzh_mixtral_q3
2024-07-22 14:01:20 +00:00
guoxinjie
a27b45f001 !1438 Correct homepage readme information
Merge pull request !1438 from guoxinjie/master
2024-07-20 09:45:35 +00:00
guoxinjie
fac5389415 !1415 Specify tokenizer in GPT3 scripts
Merge pull request !1415 from guoxinjie/master
2024-07-20 06:14:43 +00:00
sunjunjie
380bc9f50c !1426 Add environment installation and pretraining readme; adjust grok script paths
Merge pull request !1426 from sunjunjie/master
2024-07-20 06:11:34 +00:00
LeiZhenzhen
6115150a6c !1432 Update chatglm3 readme
Merge pull request !1432 from LeiZhenzhen/master
2024-07-20 02:40:55 +00:00
商元义
6e09b17776 !1418 Fix weights not being loaded for inference after lora
Merge pull request !1418 from 商元义/master
2024-07-17 02:02:11 +00:00
代维华
1d6bb88d82 !1405 moe framework development & grok1 model development
Merge pull request !1405 from 代维华/master
2024-07-15 01:23:47 +00:00
liuyanghan
9fed432e08 !1406 Merge untie & dynamic PP & lora fusion features
Merge pull request !1406 from liuyanghan/master
2024-07-12 06:01:51 +00:00
yuhui
0762f87e9a !1401 Correct gemma model readme
Merge pull request !1401 from yuhui/master
2024-07-09 02:58:17 +00:00
商元义
a58a2b1447 !1383 Fix issues reported against the Qwen1.5 readme
Merge pull request !1383 from 商元义/master
2024-06-27 09:43:53 +00:00
fengliangjun
f80514ad86 !1373 Fix typos and issues
Merge pull request !1373 from fengliangjun/master
2024-06-24 13:00:04 +00:00
商元义
c90c9b107f !1361 Fix Qwen1.5 issues
Merge pull request !1361 from 商元义/master
2024-06-24 02:34:45 +00:00
guoxinjie
e83af7c2bd !1364 Fix bloom precision issue
Merge pull request !1364 from guoxinjie/bloom_fix
2024-06-22 07:39:37 +00:00
zhangjianxiang
6f28e4e589 !1363 Modify the global-batch-size parameter in the mixtral pretraining script
Merge pull request !1363 from zhangjianxiang/mixtral
2024-06-22 06:34:16 +00:00
商元义
baf8f2237f !1349 Fix Qwen1.5 errors
Merge pull request !1349 from 商元义/master
2024-06-20 07:29:12 +00:00
liuyanghan
2fcaaacf87 !1356 Add regression coverage for the weight-conversion feature: megatron format to megatron format
Merge pull request !1356 from liuyanghan/master
2024-06-20 04:18:47 +00:00
LeiZhenzhen
a47f94f2a9 !1345 chatglm3 performance optimization / add fine-tuning support
Merge pull request !1345 from LeiZhenzhen/master
2024-06-18 04:03:01 +00:00
商元义
39dea7f9e8 !1331 Add Qwen1.5-0.5B adaptation
Merge pull request !1331 from 商元义/master
2024-06-17 06:04:05 +00:00
wucong
e5b5121c17 !1334 Fix download paths for LLAMA3-8B pretrained weights and vocabulary
Merge pull request !1334 from wucong/fixUrl
2024-06-17 02:50:37 +00:00
DONGHAORAN
27ad511d3c !1342 Remove overlap-param-gather from scripts; remove invalid URLs and invalid markdown
Merge pull request !1342 from DONGHAORAN/master
2024-06-12 12:42:28 +00:00
guhangsong
d5a1d0dd13 !1329 Upgrade ModelLink to match megatron core 0.6.0
Merge pull request !1329 from guhangsong/upversion
2024-06-11 07:53:57 +00:00
glhyy
f78de57a6e !1332 Update environment and hardware information in readme; correct known issues
Merge pull request !1332 from glhyy/master
2024-06-06 12:34:03 +00:00
商元义
2d7482b887 !1321 Add Qwen1.5-1.8B adaptation
Merge pull request !1321 from 商元义/1.8b
2024-06-04 01:23:52 +00:00
liujianxing
d13e6a4f93 !1292 Standardize argument format (set noisy_gate_policy to noisy-gate-policy, consistent with ascenspeed)
Merge pull request !1292 from liujianxing/format_arguments
2024-06-03 11:00:01 +00:00
yuhui
df13d04012 !1320 Optimize gemma model parameters
Merge pull request !1320 from yuhui/master
2024-06-03 08:14:09 +00:00
changlei
dc91af86c0 !1313 Fix Qwen1.5 model evaluation result being 0
Merge pull request !1313 from changlei/master
2024-06-03 08:12:16 +00:00
sunjunjie
474ef61511 !1271 Rename AscendSpeed to MindSpeed
Merge pull request !1271 from sunjunjie/dev
2024-06-03 07:39:44 +00:00
商元义
379f6f2836 !1274 Add Qwen1.5-72B adaptation
Merge pull request !1274 from 商元义/master
2024-06-03 00:58:16 +00:00
guoyiwei111
deb8b2ebce !1235 Implement merging Lora weights into HuggingFace weights
Merge pull request !1235 from guoyiwei111/master
2024-05-31 07:51:23 +00:00
商元义
9d2b66e89e !1288 Add Qwen1.5-32B adaptation
Merge pull request !1288 from 商元义/32B
2024-05-30 07:06:31 +00:00
glhyy
17b97452f7 !1308 Change image links in readme to absolute paths; fix other readme typos
Merge pull request !1308 from glhyy/master
2024-05-29 07:18:53 +00:00
wucong
0bfc31c528 !1303 Modify llama2-34b script parameters; performance optimization
Merge pull request !1303 from wucong/llama2_34b
2024-05-29 03:40:18 +00:00
wucong
27ac11d642 !1290 Llama2-70b parameter optimization + weight-conversion bug fix
Merge pull request !1290 from wucong/llama270b
2024-05-29 01:58:38 +00:00
taobohao
f31514a83a !1298 Add Qwen1.5-4B adaptation
Merge pull request !1298 from taobohao/qwen15-4b
2024-05-28 01:10:54 +00:00
guoxinjie
62c40eef76 !1240 Clean up GPT3-175B and check it into the repository
Merge pull request !1240 from guoxinjie/gelu
2024-05-27 02:34:20 +00:00
zhangbin
371a159c4d !1310 Update llama3 readme
Merge pull request !1310 from zhangbin/master
2024-05-25 07:52:58 +00:00
yuhui
bf2342dad8 !1302 Add Gemma-2B model adaptation
Merge pull request !1302 from yuhui/master
2024-05-24 06:17:28 +00:00
黄宇豪
f19ce463a8 !1299 feat: add Aquila2-34B model adaptation
Merge pull request !1299 from 黄宇豪/master
2024-05-23 09:31:32 +00:00
商元义
19d3b157ff !1291 Add Qwen1.5-14B adaptation
Merge pull request !1291 from 商元义/14B
2024-05-23 02:57:38 +00:00
黄宇豪
4151c53e20 !1296 fix: fix Aquila2-7B UT failure
Merge pull request !1296 from 黄宇豪/master
2024-05-21 13:07:03 +00:00
changlei
e16f08ae1e !1293 Add inference image for Qwen1.5-7B and fix UT error
Merge pull request !1293 from changlei/master
2024-05-21 13:04:19 +00:00
yuhui
60b3b077db !1269 Add Gemma-7B model adaptation
Merge pull request !1269 from yuhui/master
2024-05-21 12:59:27 +00:00
shishaoyu
78c4397f6c !1257 Support Mistral 7B 32K long-sequence model
Merge pull request !1257 from shishaoyu/master
2024-05-17 08:09:23 +00:00
changlei
0141fec762 !1281 Add Qwen1.5-7B adaptation
Merge pull request !1281 from changlei/master
2024-05-17 07:48:44 +00:00
glhyy
b6d946d835 !1287 Fix readme typos
Merge pull request !1287 from glhyy/master
2024-05-17 06:47:52 +00:00
fengliangjun
4a683f8dbe !1285 Update mixtral-moe model to 32K
Merge pull request !1285 from fengliangjun/master
2024-05-17 01:32:15 +00:00
wwzhuo
cf6e8f4a9c !1283 Correct some llama2 7b parameters
Merge pull request !1283 from wwzhuo/master
2024-05-16 11:53:22 +00:00
glhyy
dd86f13dc0 !1280 Add data-cache detection and generation on non-master nodes when storage is not shared
Merge pull request !1280 from glhyy/master
2024-05-16 07:40:36 +00:00
黄宇豪
a1f7e94b22 !1282 fix: correct wrong fine-tuning dataset input path
Merge pull request !1282 from 黄宇豪/master
2024-05-15 11:30:31 +00:00
黄宇豪
ce19d7e3a2 !1273 feat: add Aquila2-7B adaptation
Merge pull request !1273 from 黄宇豪/master
2024-05-14 09:17:09 +00:00
leiguodong
85943f047c !1261 Add codellama-34B adaptation
Merge pull request !1261 from leiguodong/master
2024-05-14 08:19:47 +00:00
xiongliangcheng
6ac3959ae0 !1268 Add yi-34B model adaptation
Merge pull request !1268 from xiongliangcheng/master
2024-05-14 03:54:20 +00:00
zhangbin
1d6d2d354d !1278 Update llama3 readme
Merge pull request !1278 from zhangbin/master
2024-05-14 03:26:27 +00:00
wucong
03211525c0 !1264 Unify readme format (llama2)
Merge pull request !1264 from wucong/dev8
2024-05-07 02:20:30 +00:00
wucong
a8bf1c55c5 !1263 Unify readme format (llama_en + qwen)
Merge pull request !1263 from wucong/dev7
2024-05-07 02:20:14 +00:00
wucong
3fd657ad6e !1262 Unify readme format (llama)
Merge pull request !1262 from wucong/dev6
2024-05-07 02:20:00 +00:00
wucong
dc6db1f858 !1260 Unify readme format (chatglm3 + intern)
Merge pull request !1260 from wucong/dev5
2024-05-07 02:19:45 +00:00
wucong
297fe8b01b !1265 Unify readme format (llama3 + mixtral)
Merge pull request !1265 from wucong/dev9
2024-05-07 02:16:49 +00:00
guoxinjie
2ae8749f4a !1252 Unify readme format (aquila)
Merge pull request !1252 from guoxinjie/readme
2024-04-30 07:50:27 +00:00
wucong
ae21a622b8 !1254 Unify readme format (baichuan2 + bloom)
Merge pull request !1254 from wucong/dev2
2024-04-30 07:39:33 +00:00
Liuchang
9a3d5641f5 !1255 Improve chat feature; add Llama3 chat script and instructions
Merge pull request !1255 from Liuchang/master
2024-04-30 02:58:39 +00:00
wucong
4e62972ecd !1253 Unify readme format (baichuan)
Merge pull request !1253 from wucong/dev1
2024-04-29 03:52:08 +00:00
wwzhuo
b2915bd2ab !1238 Update best-performance configuration for llama2 7b/13b
Merge pull request !1238 from wwzhuo/master
2024-04-26 08:43:14 +00:00
Liuchang
a9f905b63f !1251 Update Llama3 readme
Merge pull request !1251 from Liuchang/master
2024-04-26 07:27:08 +00:00
fengliangjun
791677c135 !1246 Update baichuan2-13B performance to 1668
Merge pull request !1246 from fengliangjun/master
2024-04-26 01:47:52 +00:00
Liuchang
4109f95dfd !1242 Add Llama3-8B and 70B models
Merge pull request !1242 from Liuchang/master
2024-04-25 01:24:31 +00:00
guhangsong
39d6fd7336 !1218 Migrate megatron patch
Merge pull request !1218 from guhangsong/patch
2024-04-23 01:57:03 +00:00
fengliangjun
464131283f !1239 Remove some redundant shape transformations in the FA adaptation to improve performance
Merge pull request !1239 from fengliangjun/master
2024-04-18 01:42:50 +00:00
glhyy
75a81f58f9 !1233 Update known issues in README
Merge pull request !1233 from glhyy/master
2024-04-16 02:22:33 +00:00
LeiZhenzhen
5ad4ceddd4 !1231 Add partial_rope support for chatglm3
Merge pull request !1231 from LeiZhenzhen/master
2024-04-15 13:11:56 +00:00
LeiZhenzhen
ab22271e13 !1227 Add chatglm3 pretraining, inference, and evaluation baselines
Merge pull request !1227 from LeiZhenzhen/master
2024-04-11 03:23:33 +00:00
guoxinjie
2f32c76be2 !1224 Remove megatron from ModelLink and add supplementary notes in the readme
Merge pull request !1224 from guoxinjie/remove_megatron
2024-04-09 07:44:00 +00:00
LeiZhenzhen
8524ea2735 !1225 Add chatglm3 weight conversion
Merge pull request !1225 from LeiZhenzhen/master
2024-04-09 06:05:25 +00:00
黄宇豪
e23e1e354b !1215 fix: align Mixtral-README with the pretraining template
Merge pull request !1215 from 黄宇豪/master
2024-04-03 02:08:55 +00:00
fengliangjun
0df09cd187 !1202 Add profiling feature
* add profiling
2024-03-30 08:58:41 +00:00
黄宇豪
62c39ddb9b !1201 fix: correct weight save path and dataset path; format README
Merge pull request !1201 from 黄宇豪/master
2024-03-30 06:32:18 +00:00
shishaoyu
ce01706c93 !1199 [DTS2024032814829] Temporary workaround for loss becoming NaN when the job is repeatedly killed and restarted during stress testing
Merge pull request !1199 from shishaoyu/master
2024-03-29 06:09:28 +00:00
黄宇豪
e8ae798db4 !1186 Unify weight paths and README style
Merge pull request !1186 from 黄宇豪/master
2024-03-28 03:43:17 +00:00
liuyanghan
aa6d2662cc !1177 Document data-loading issue in multi-node training
Merge pull request !1177 from liuyanghan/master
2024-03-28 01:07:00 +00:00
wwzhuo
24c423201b !1152 Update llama2 readme; correct tokenizer description
* Update the note on fine-tuning tokenizer changes in the llama2 readme
2024-03-27 08:25:03 +00:00