yuhui
|
8eb4e48521
|
!1608 Add ST and UT for the gemma model series
Merge pull request !1608 from yuhui/gemma2_ut
|
2024-09-11 08:08:23 +00:00 |
|
shengjy
|
23798b49ca
|
!1638 update Mixtral 8x22b evaluation on mmlu
Merge pull request !1638 from shengjy/master
|
2024-09-10 12:28:14 +00:00 |
|
glhyy
|
1f4ab545cc
|
!1597 Add weight-conversion UT template and mixtral test cases; support conversion between legacy and mcore
Merge pull request !1597 from glhyy/master
|
2024-09-10 07:40:15 +00:00 |
|
LeiZhenzhen
|
4c1123e5fb
|
!1642 mod partial_rope to glm_rope
Merge pull request !1642 from LeiZhenzhen/master
|
2024-09-10 06:14:35 +00:00 |
|
RuanZhiXiang
|
b3142dd5e5
|
!1639 switch mindspeed dependency into e6ea2117
Merge pull request !1639 from RuanZhiXiang/solid-commid-id-e6ea2117
|
2024-09-10 03:50:35 +00:00 |
|
sunjunjie
|
58e8133311
|
!1622 Relocate weight-conversion code & fix reverse dependency
Merge pull request !1622 from sunjunjie/ckpt_position
|
2024-09-09 06:37:36 +00:00 |
|
丁子叉
|
de65e81113
|
!1636 [mcore-llm] Optimize training performance for deepseek2-like models
Merge pull request !1636 from 丁子叉/deepseek2-fa
|
2024-09-09 06:31:52 +00:00 |
|
丁子叉
|
579b4ed2e5
|
!1632 [mcore-llm] Add README for deepseek2-like models
Merge pull request !1632 from 丁子叉/master
|
2024-09-09 06:29:34 +00:00 |
|
shenjiarun
|
a4b59e34e4
|
!1623 Add chatglm3 8B mcore full-parameter fine-tuning loss-alignment script and modify Llama2 7B inference and evaluation scripts
Merge pull request !1623 from shenjiarun/master
|
2024-09-09 02:32:00 +00:00 |
|
RuanZhiXiang
|
959a537602
|
!1635 fix: resolve chatglm3 64K precision issue
Merge pull request !1635 from RuanZhiXiang/fix-chatglm3
|
2024-09-09 00:57:14 +00:00 |
|
yuhui
|
587eaf005b
|
!1631 Correct glm4 parameters
Merge pull request !1631 from yuhui/glm4_fix
|
2024-09-07 01:25:24 +00:00 |
|
闻江
|
edc8ed5181
|
!1628 Adapt MiniCPM-2B/MiniCPM-MoE-8x2B
Merge pull request !1628 from 闻江/master
|
2024-09-06 13:35:28 +00:00 |
|
yuhui
|
9c6a1b52b5
|
!1626 Support mcore-to-hf conversion for gemma2 models
Merge pull request !1626 from yuhui/weight_convert
|
2024-09-06 03:11:43 +00:00 |
|
丁子叉
|
7857e68009
|
!1609 [mcore-llm] Add multi-node pretraining launch script for deepseek2-like models
Merge pull request !1609 from 丁子叉/deepseek2-sh
|
2024-09-06 01:02:15 +00:00 |
|
yuhui
|
6827ff2bb1
|
!1619 Add glm4 model adaptation
Merge pull request !1619 from yuhui/glm4
|
2024-09-05 13:05:30 +00:00 |
|
Peihan Liu
|
a8186f30f3
|
!1592 Support moe-GEMM fused operator
Merge pull request !1592 from Peihan Liu/master
|
2024-09-05 09:32:24 +00:00 |
|
guozhihua
|
9d773d2a3d
|
!1612 Change grok to a 4-node configuration
Merge pull request !1612 from guozhihua/master
|
2024-09-05 03:24:20 +00:00 |
|
shengjy
|
8d88f91af8
|
!1613 Fix weight loading in Mixtral pretraining
Merge pull request !1613 from shengjy/master
|
2024-09-05 02:09:37 +00:00 |
|
xiongliangcheng
|
0843e925a2
|
!1611 [Mcore] Adapt yi-34B and codellama34B to Mcore
Merge pull request !1611 from xiongliangcheng/q3
|
2024-09-05 01:11:02 +00:00 |
|
丁子叉
|
deea5209a9
|
!1605 [mcore-llm] Add hf2mg and mg2hf weight conversion and pretraining dataset processing for deepseekv2-like models
Merge pull request !1605 from 丁子叉/deepseek2-ckpt_convert
|
2024-09-05 01:00:05 +00:00 |
|
AresLzk
|
8d0cb96404
|
!1590 Adapt new model CodeQwen1.5-7B to the mcore branch
Merge pull request !1590 from AresLzk/master
|
2024-09-04 10:50:44 +00:00 |
|
shenjiarun
|
2f2e086708
|
!1586 Add Llama2 7B full-parameter fine-tuning loss-alignment script
Merge pull request !1586 from shenjiarun/master
|
2024-09-04 06:45:50 +00:00 |
|
shengjy
|
c305fe4e85
|
!1601 Add gpt4 moe dropless
Merge pull request !1601 from shengjy/master
|
2024-09-04 01:13:13 +00:00 |
|
songyuan-knighto7o4
|
dd917d652a
|
!1607 [Tianyi Cloud requirement] Adapt new model internlm2 to mcore
Merge pull request !1607 from songyuan-knighto7o4/master
|
2024-09-04 01:10:11 +00:00 |
|
guozhihua
|
30022c5c2a
|
!1604 Add mixtral_8*7b inference and evaluation
Merge pull request !1604 from guozhihua/master
|
2024-09-04 01:05:21 +00:00 |
|
wucong
|
bf6f398764
|
!1594 Add Llama2 fine-tuning on Mcore
Merge pull request !1594 from wucong/addMCorella2
|
2024-09-03 07:53:34 +00:00 |
|
LeiZhenzhen
|
01b71a2a2e
|
!1580 add gpt4 moe drop
Merge pull request !1580 from LeiZhenzhen/master
|
2024-09-03 06:15:24 +00:00 |
|
yuhui
|
705158ab41
|
!1568 Add gemma2 model series adaptation
Merge pull request !1568 from yuhui/gemma2
|
2024-09-03 01:57:47 +00:00 |
|
guozhihua
|
a647db6449
|
!1589 Add full-parameter llama31 200b and 405b
Merge pull request !1589 from guozhihua/master
|
2024-09-03 01:04:17 +00:00 |
|
shengjy
|
cf19b46add
|
!1555 Add Mixtral 8x22B pretraining, inference, and evaluation
Merge pull request !1555 from shengjy/master
|
2024-09-02 03:43:11 +00:00 |
|
商元义
|
9f8c96894d
|
!1579 Add Qwen2-7B adaptation
Merge pull request !1579 from 商元义/qwen2-7b
|
2024-08-31 06:13:56 +00:00 |
|
changlei
|
4d00313fbd
|
!1582 Add Qwen2-0.5B model
Merge pull request !1582 from changlei/master
|
2024-08-31 01:58:39 +00:00 |
|
丁子叉
|
cee50d023e
|
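!1578 [mcore-llm] Performance optimization for deepseekv2-like models: local splitting of the large-vocabulary mm when tp is not used, and support for specifying softmax_scale in mla scenarios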
Merge pull request !1578 from 丁子叉/master
|
2024-08-31 01:13:32 +00:00 |
|
RuanZhiXiang
|
143128b9e3
|
!1569 feat: upgrade mindspeed to core060
Merge pull request !1569 from RuanZhiXiang/core060-update
|
2024-08-30 06:24:06 +00:00 |
|
RuanZhiXiang
|
86ff5d73ed
|
!1571 Del: remove moe legacy and its system tests
Merge pull request !1571 from RuanZhiXiang/moe-legacy-fix
|
2024-08-30 06:21:07 +00:00 |
|
changlei
|
cd878c44cb
|
!1474 Adapt Qwen2-1.5B model
Merge pull request !1474 from changlei/master
|
2024-08-30 06:14:52 +00:00 |
|
sunjunjie
|
d8caa66dd1
|
!1573 New weight-conversion framework supports conversion between legacy and huggingface for qwen and gemma; remove old qwen and gemma code
Merge pull request !1573 from sunjunjie/master_ckpt
|
2024-08-30 03:07:41 +00:00 |
|
glhyy
|
476ded1c12
|
!1566 New weight-conversion framework supports bloom; remove old bloom, chatglm3, and mixtral code
Merge pull request !1566 from glhyy/master
|
2024-08-29 11:51:32 +00:00 |
|
RuanZhiXiang
|
11a7ccabbd
|
!1576 fix: chatglm3 arguments
Merge pull request !1576 from RuanZhiXiang/load-chatglm
|
2024-08-29 11:24:31 +00:00 |
|
商元义
|
431f3aa0f4
|
!1543 Add Qwen1.5-110B adaptation
Merge pull request !1543 from 商元义/qwen15-110b
|
2024-08-29 00:55:47 +00:00 |
|
yuhui
|
696d907f35
|
!1570 gemma-2b mcore adaptation
Merge pull request !1570 from yuhui/gemma_core
|
2024-08-28 07:58:32 +00:00 |
|
yuhui
|
7bb74c79c2
|
!1539 gemma-7b mcore adaptation
Merge pull request !1539 from yuhui/gemma_core
|
2024-08-28 01:38:03 +00:00 |
|
AresLzk
|
238d120a6f
|
!1477 Adapt new model Qwen2_72B to the mcore branch
Merge pull request !1477 from AresLzk/master
|
2024-08-27 12:19:55 +00:00 |
|
DONGHAORAN
|
a08bb1cd12
|
!1553 Improve example readme
Merge pull request !1553 from DONGHAORAN/dev
|
2024-08-23 07:39:24 +00:00 |
|
yuhui
|
216b3ceeac
|
!1532 Add llama3.1-70b model
Merge pull request !1532 from yuhui/llama31
|
2024-08-23 02:33:55 +00:00 |
|
LeiZhenzhen
|
6f6b00ad74
|
!1525 chatglm3 ckpt hf2mcore
Merge pull request !1525 from LeiZhenzhen/master
|
2024-08-22 10:07:56 +00:00 |
|
sunjunjie
|
603b1a84ec
|
!1536 Improve example/README, consistently using llama2-7b as the example
Merge pull request !1536 from sunjunjie/master
|
2024-08-22 03:51:34 +00:00 |
|
wucong
|
14851ae631
|
!1542 Add logic to filter out malformed data
Merge pull request !1542 from wucong/fixData
|
2024-08-22 03:25:25 +00:00 |
|
丁子叉
|
416a473c83
|
!1541 [mcore-llm] Add pretraining for deepseekv2-lite-like models
Merge pull request !1541 from 丁子叉/master
|
2024-08-22 02:21:33 +00:00 |
|
wucong
|
99f66458a2
|
!1478 Fix fine-tuning differences from Llamafactory
Merge pull request !1478 from wucong/fituneFix
|
2024-08-20 02:43:08 +00:00 |
|
zhangjianxiang
|
ccbe74f1d2
|
!1517 Add bk_origin_23 version to the maintained versions section
Merge pull request !1517 from zhangjianxiang/master
|
2024-08-19 11:34:30 +00:00 |
|
glhyy
|
03f23d1ce3
|
!1487 Speed up data preprocessing
Merge pull request !1487 from glhyy/master
|
2024-08-16 08:13:07 +00:00 |
|
wucong
|
6fad9bd24b
|
!1507 Update fine-tuning readme
Merge pull request !1507 from wucong/modifyread
|
2024-08-14 04:51:57 +00:00 |
|
wucong
|
652574b6c4
|
!1455 Adapt inference for prompt-type
Merge pull request !1455 from wucong/addgen
|
2024-08-13 06:53:26 +00:00 |
|
sunjunjie
|
e060f047d3
|
!1489 Add script to merge lora weights with original weights for llama2-7b legacy
Merge pull request !1489 from sunjunjie/master
|
2024-08-12 07:00:05 +00:00 |
|
wucong
|
bad362b7cc
|
!1488 Add fine-tuning data preprocessing readme
Merge pull request !1488 from wucong/addreadme
|
2024-08-09 11:27:59 +00:00 |
|
zhangjianxiang
|
afe8196a55
|
!1444 Adapt online inference to FA and KV_Cache
Merge pull request !1444 from zhangjianxiang/KV_CACHE_IFA_PFA
|
2024-08-08 01:20:29 +00:00 |
|
ningbenzhe1
|
187ba09058
|
!1480 Upload llama3.1 405b scripts
Merge pull request !1480 from ningbenzhe1/master
|
2024-08-07 12:20:29 +00:00 |
|
fengliangjun
|
7537c0ed5e
|
update examples/README.md.
bugfix
Signed-off-by: fengliangjun <fengliangjun@huawei.com>
|
2024-08-07 12:13:29 +00:00 |
|
glhyy
|
b144a73859
|
!1473 Add dataset merging (supports pretraining and fine-tuning)
Merge pull request !1473 from glhyy/master
|
2024-08-07 12:12:23 +00:00 |
|
丁子叉
|
2531205c0f
|
!1476 [mcore-LLM large models] Add deepseekv2-like models: support MLA, YaRN, and DeepSeekMoE model structures
Merge pull request !1476 from 丁子叉/master
|
2024-08-06 09:35:36 +00:00 |
|
代维华
|
8be9648acf
|
!1459 Add llama3.1 8B model
Merge pull request !1459 from 代维华/master
|
2024-08-05 09:34:31 +00:00 |
|
shengjy
|
e79fc1081f
|
!1452 adapt llama2 to mcore
Merge pull request !1452 from shengjy/master
|
2024-08-02 01:09:46 +00:00 |
|
ningbenzhe1
|
a780cb5953
|
!1451 Support Mistral-7B with the mcore r0.6.0 structure
Merge pull request !1451 from ningbenzhe1/master
|
2024-08-02 01:05:38 +00:00 |
|
WangYu
|
34d2bbb412
|
!1460 Support chatglm3-6B with the mcore r0.6.0 structure
Merge pull request !1460 from WangYu/chatglm3
|
2024-08-01 10:52:21 +00:00 |
|
fengliangjun
|
e1550bf4e9
|
rename examples/llama3/ckpt_convert_llama3_8b_legacy2hf.sh
Rename
Signed-off-by: fengliangjun <fengliangjun@huawei.com>
|
2024-07-29 12:17:49 +00:00 |
|
fengliangjun
|
7ef7176884
|
rename examples/llama3/ckpt_convert_llama3_8b_hf2legacy.sh
Remove parameters
Signed-off-by: fengliangjun <fengliangjun@huawei.com>
|
2024-07-29 12:17:22 +00:00 |
|
fengliangjun
|
9252dc9136
|
!1462 Remove per-model readmes and update the unified readme
* update readme
|
2024-07-29 12:13:08 +00:00 |
|
guozhihua
|
137becb3b7
|
!1453 Modify grok1 two-node configuration
Merge pull request !1453 from guozhihua/gzh_grok1
|
2024-07-25 07:02:28 +00:00 |
|
fengliangjun
|
d5046f3262
|
!1448 Remove redundant content
* remove redundancy
|
2024-07-23 09:54:53 +00:00 |
|
guozhihua
|
917598bd9c
|
!1447 Add mixtral_mcore, add mc2 for mcore, and optimize the dispatcher's affinity operation argsort
Merge pull request !1447 from guozhihua/gzh_mixtral_q3
|
2024-07-22 14:01:20 +00:00 |
|
guoxinjie
|
a27b45f001
|
!1438 Correct homepage readme information
Merge pull request !1438 from guoxinjie/master
|
2024-07-20 09:45:35 +00:00 |
|
guoxinjie
|
fac5389415
|
!1415 Specify tokenizer in GPT3 scripts
Merge pull request !1415 from guoxinjie/master
|
2024-07-20 06:14:43 +00:00 |
|
sunjunjie
|
380bc9f50c
|
!1426 Add environment installation and pretraining readme; adjust grok script paths
Merge pull request !1426 from sunjunjie/master
|
2024-07-20 06:11:34 +00:00 |
|
LeiZhenzhen
|
6115150a6c
|
!1432 Update chatglm3 readme
Merge pull request !1432 from LeiZhenzhen/master
|
2024-07-20 02:40:55 +00:00 |
|
商元义
|
6e09b17776
|
!1418 Fix inference not loading weights after lora
Merge pull request !1418 from 商元义/master
|
2024-07-17 02:02:11 +00:00 |
|
代维华
|
1d6bb88d82
|
!1405 moe framework development & grok1 model development
Merge pull request !1405 from 代维华/master
|
2024-07-15 01:23:47 +00:00 |
|
liuyanghan
|
9fed432e08
|
!1406 Merge untie, dynamic PP, and lora fusion features
Merge pull request !1406 from liuyanghan/master
|
2024-07-12 06:01:51 +00:00 |
|
yuhui
|
0762f87e9a
|
!1401 Correct gemma model readme
Merge pull request !1401 from yuhui/master
|
2024-07-09 02:58:17 +00:00 |
|
商元义
|
a58a2b1447
|
!1383 Fix Qwen1.5 readme per issue ticket
Merge pull request !1383 from 商元义/master
|
2024-06-27 09:43:53 +00:00 |
|
fengliangjun
|
f80514ad86
|
!1373 Fix typos and issues
Merge pull request !1373 from fengliangjun/master
|
2024-06-24 13:00:04 +00:00 |
|
商元义
|
c90c9b107f
|
!1361 Fix Qwen1.5 issues
Merge pull request !1361 from 商元义/master
|
2024-06-24 02:34:45 +00:00 |
|
guoxinjie
|
e83af7c2bd
|
!1364 Fix bloom precision issue
Merge pull request !1364 from guoxinjie/bloom_fix
|
2024-06-22 07:39:37 +00:00 |
|
zhangjianxiang
|
6f28e4e589
|
!1363 Modify the global-batch-size parameter in the mixtral pretraining script
Merge pull request !1363 from zhangjianxiang/mixtral
|
2024-06-22 06:34:16 +00:00 |
|
商元义
|
baf8f2237f
|
!1349 Fix Qwen1.5 errors
Merge pull request !1349 from 商元义/master
|
2024-06-20 07:29:12 +00:00 |
|
liuyanghan
|
2fcaaacf87
|
!1356 Weight-conversion feature guarding: megatron format to megatron format
Merge pull request !1356 from liuyanghan/master
|
2024-06-20 04:18:47 +00:00 |
|
LeiZhenzhen
|
a47f94f2a9
|
!1345 chatglm3 performance optimization / add fine-tuning
Merge pull request !1345 from LeiZhenzhen/master
|
2024-06-18 04:03:01 +00:00 |
|
商元义
|
39dea7f9e8
|
!1331 Add Qwen1.5-0.5B adaptation
Merge pull request !1331 from 商元义/master
|
2024-06-17 06:04:05 +00:00 |
|
wucong
|
e5b5121c17
|
!1334 Fix download paths for LLAMA3-8B pretrained weights and vocabulary
Merge pull request !1334 from wucong/fixUrl
|
2024-06-17 02:50:37 +00:00 |
|
DONGHAORAN
|
27ad511d3c
|
!1342 Remove overlap-param-gather from scripts; remove invalid URLs and invalid markdown
Merge pull request !1342 from DONGHAORAN/master
|
2024-06-12 12:42:28 +00:00 |
|
guhangsong
|
d5a1d0dd13
|
!1329 Upgrade ModelLink to match megatron core 0.6.0
Merge pull request !1329 from guhangsong/upversion
|
2024-06-11 07:53:57 +00:00 |
|
glhyy
|
f78de57a6e
|
!1332 Update readme environment and hardware information; correct known issues
Merge pull request !1332 from glhyy/master
|
2024-06-06 12:34:03 +00:00 |
|
商元义
|
2d7482b887
|
!1321 Add Qwen1.5-1.8B adaptation
Merge pull request !1321 from 商元义/1.8b
|
2024-06-04 01:23:52 +00:00 |
|
liujianxing
|
d13e6a4f93
|
!1292 Standardize argument format (change noisy_gate_policy to noisy-gate-policy, consistent with ascenspeed)
Merge pull request !1292 from liujianxing/format_arguments
|
2024-06-03 11:00:01 +00:00 |
|
yuhui
|
df13d04012
|
!1320 Optimize gemma model parameters
Merge pull request !1320 from yuhui/master
|
2024-06-03 08:14:09 +00:00 |
|
changlei
|
dc91af86c0
|
!1313 Fix Qwen1.5 model evaluation result being 0
Merge pull request !1313 from changlei/master
|
2024-06-03 08:12:16 +00:00 |
|
sunjunjie
|
474ef61511
|
!1271 Rename AscendSpeed to MindSpeed
Merge pull request !1271 from sunjunjie/dev
|
2024-06-03 07:39:44 +00:00 |
|
商元义
|
379f6f2836
|
!1274 Add Qwen1.5-72B adaptation
Merge pull request !1274 from 商元义/master
|
2024-06-03 00:58:16 +00:00 |
|
guoyiwei111
|
deb8b2ebce
|
!1235 Implement merging Lora weights into HuggingFace weights
Merge pull request !1235 from guoyiwei111/master
|
2024-05-31 07:51:23 +00:00 |
|
商元义
|
9d2b66e89e
|
!1288 Add Qwen1.5-32B adaptation
Merge pull request !1288 from 商元义/32B
|
2024-05-30 07:06:31 +00:00 |
|
glhyy
|
17b97452f7
|
!1308 Change image links in readme to absolute paths; fix other readme typos
Merge pull request !1308 from glhyy/master
|
2024-05-29 07:18:53 +00:00 |
|
wucong
|
0bfc31c528
|
!1303 Modify llama2-34b script parameters; performance optimization
Merge pull request !1303 from wucong/llama2_34b
|
2024-05-29 03:40:18 +00:00 |
|
wucong
|
27ac11d642
|
!1290 Llama2-70b parameter optimization + weight-conversion bug fix
Merge pull request !1290 from wucong/llama270b
|
2024-05-29 01:58:38 +00:00 |
|
taobohao
|
f31514a83a
|
!1298 Add Qwen1.5-4B adaptation
Merge pull request !1298 from taobohao/qwen15-4b
|
2024-05-28 01:10:54 +00:00 |
|
guoxinjie
|
62c40eef76
|
!1240 Clean up GPT3-175B and commit it to the repository
Merge pull request !1240 from guoxinjie/gelu
|
2024-05-27 02:34:20 +00:00 |
|
zhangbin
|
371a159c4d
|
!1310 Update llama3 readme
Merge pull request !1310 from zhangbin/master
|
2024-05-25 07:52:58 +00:00 |
|
yuhui
|
bf2342dad8
|
!1302 Add Gemma-2B model adaptation
Merge pull request !1302 from yuhui/master
|
2024-05-24 06:17:28 +00:00 |
|
黄宇豪
|
f19ce463a8
|
!1299 feat: add Aquila2-34B model adaptation
Merge pull request !1299 from 黄宇豪/master
|
2024-05-23 09:31:32 +00:00 |
|
商元义
|
19d3b157ff
|
!1291 Add Qwen1.5-14B adaptation
Merge pull request !1291 from 商元义/14B
|
2024-05-23 02:57:38 +00:00 |
|
黄宇豪
|
4151c53e20
|
!1296 fix: fix Aquila2-7B UT failure
Merge pull request !1296 from 黄宇豪/master
|
2024-05-21 13:07:03 +00:00 |
|
changlei
|
e16f08ae1e
|
!1293 Add inference images for Qwen1.5-7B and fix UT errors
Merge pull request !1293 from changlei/master
|
2024-05-21 13:04:19 +00:00 |
|
yuhui
|
60b3b077db
|
!1269 Add Gemma-7B model adaptation
Merge pull request !1269 from yuhui/master
|
2024-05-21 12:59:27 +00:00 |
|
shishaoyu
|
78c4397f6c
|
!1257 Support Mistral 7B 32K long-sequence model
Merge pull request !1257 from shishaoyu/master
|
2024-05-17 08:09:23 +00:00 |
|
changlei
|
0141fec762
|
!1281 Add Qwen1.5-7B adaptation
Merge pull request !1281 from changlei/master
|
2024-05-17 07:48:44 +00:00 |
|
glhyy
|
b6d946d835
|
!1287 Fix readme typos
Merge pull request !1287 from glhyy/master
|
2024-05-17 06:47:52 +00:00 |
|
fengliangjun
|
4a683f8dbe
|
!1285 Update mixtral-moe model to 32K
Merge pull request !1285 from fengliangjun/master
|
2024-05-17 01:32:15 +00:00 |
|
wwzhuo
|
cf6e8f4a9c
|
!1283 Correct some llama2 7b parameters
Merge pull request !1283 from wwzhuo/master
|
2024-05-16 11:53:22 +00:00 |
|
glhyy
|
dd86f13dc0
|
!1280 Add detection and generation of the data cache on non-master nodes when storage is not shared
Merge pull request !1280 from glhyy/master
|
2024-05-16 07:40:36 +00:00 |
|
黄宇豪
|
a1f7e94b22
|
!1282 fix: correct the wrong fine-tuning dataset input path
Merge pull request !1282 from 黄宇豪/master
|
2024-05-15 11:30:31 +00:00 |
|
黄宇豪
|
ce19d7e3a2
|
!1273 feat: add Aquila2-7B adaptation
Merge pull request !1273 from 黄宇豪/master
|
2024-05-14 09:17:09 +00:00 |
|
leiguodong
|
85943f047c
|
!1261 Add codellama-34B adaptation
Merge pull request !1261 from leiguodong/master
|
2024-05-14 08:19:47 +00:00 |
|
xiongliangcheng
|
6ac3959ae0
|
!1268 Add yi-34B model adaptation
Merge pull request !1268 from xiongliangcheng/master
|
2024-05-14 03:54:20 +00:00 |
|
zhangbin
|
1d6d2d354d
|
!1278 Update llama3 readme
Merge pull request !1278 from zhangbin/master
|
2024-05-14 03:26:27 +00:00 |
|
wucong
|
03211525c0
|
!1264 Unify readme format (llama2)
Merge pull request !1264 from wucong/dev8
|
2024-05-07 02:20:30 +00:00 |
|
wucong
|
a8bf1c55c5
|
!1263 Unify readme format (llama_en + qwen)
Merge pull request !1263 from wucong/dev7
|
2024-05-07 02:20:14 +00:00 |
|
wucong
|
3fd657ad6e
|
!1262 Unify readme format (llama)
Merge pull request !1262 from wucong/dev6
|
2024-05-07 02:20:00 +00:00 |
|
wucong
|
dc6db1f858
|
!1260 Unify readme format (chatglm3 + intern)
Merge pull request !1260 from wucong/dev5
|
2024-05-07 02:19:45 +00:00 |
|
wucong
|
297fe8b01b
|
!1265 Unify readme format (llama3 + mixtral)
Merge pull request !1265 from wucong/dev9
|
2024-05-07 02:16:49 +00:00 |
|
guoxinjie
|
2ae8749f4a
|
!1252 Unify readme format (aquila)
Merge pull request !1252 from guoxinjie/readme
|
2024-04-30 07:50:27 +00:00 |
|
wucong
|
ae21a622b8
|
!1254 Unify readme format (baichuan2 + bloom)
Merge pull request !1254 from wucong/dev2
|
2024-04-30 07:39:33 +00:00 |
|
Liuchang
|
9a3d5641f5
|
!1255 Improve chat functionality; add Llama3 chat script and instructions
Merge pull request !1255 from Liuchang/master
|
2024-04-30 02:58:39 +00:00 |
|
wucong
|
4e62972ecd
|
!1253 Unify readme format (baichuan)
Merge pull request !1253 from wucong/dev1
|
2024-04-29 03:52:08 +00:00 |
|
wwzhuo
|
b2915bd2ab
|
!1238 Update best-performance configuration for llama2 7b/13b
Merge pull request !1238 from wwzhuo/master
|
2024-04-26 08:43:14 +00:00 |
|
Liuchang
|
a9f905b63f
|
!1251 Update Llama3 readme
Merge pull request !1251 from Liuchang/master
|
2024-04-26 07:27:08 +00:00 |
|
fengliangjun
|
791677c135
|
!1246 Update baichuan2-13B performance to 1668
Merge pull request !1246 from fengliangjun/master
|
2024-04-26 01:47:52 +00:00 |
|
Liuchang
|
4109f95dfd
|
!1242 Add Llama3-8B and 70B models
Merge pull request !1242 from Liuchang/master
|
2024-04-25 01:24:31 +00:00 |
|
guhangsong
|
39d6fd7336
|
!1218 Migrate megatron patch
Merge pull request !1218 from guhangsong/patch
|
2024-04-23 01:57:03 +00:00 |
|
fengliangjun
|
464131283f
|
!1239 Remove some redundant shape-transformation operations in FA adaptation to improve performance
Merge pull request !1239 from fengliangjun/master
|
2024-04-18 01:42:50 +00:00 |
|
glhyy
|
75a81f58f9
|
!1233 Update known issues in README
Merge pull request !1233 from glhyy/master
|
2024-04-16 02:22:33 +00:00 |
|
LeiZhenzhen
|
5ad4ceddd4
|
!1231 Add partial_rope support for chatglm3
Merge pull request !1231 from LeiZhenzhen/master
|
2024-04-15 13:11:56 +00:00 |
|
LeiZhenzhen
|
ab22271e13
|
!1227 Add chatglm3 pretraining, inference, and evaluation baselines
Merge pull request !1227 from LeiZhenzhen/master
|
2024-04-11 03:23:33 +00:00 |
|
guoxinjie
|
2f32c76be2
|
!1224 Remove megatron from ModelLink and add notes in the readme
Merge pull request !1224 from guoxinjie/remove_megatron
|
2024-04-09 07:44:00 +00:00 |
|
LeiZhenzhen
|
8524ea2735
|
!1225 Add chatglm3 weight conversion
Merge pull request !1225 from LeiZhenzhen/master
|
2024-04-09 06:05:25 +00:00 |
|
黄宇豪
|
e23e1e354b
|
!1215 fix: align Mixtral-README with the pretraining template
Merge pull request !1215 from 黄宇豪/master
|
2024-04-03 02:08:55 +00:00 |
|
fengliangjun
|
0df09cd187
|
!1202 Add profiling
* add profiling
|
2024-03-30 08:58:41 +00:00 |
|
黄宇豪
|
62c39ddb9b
|
!1201 fix: correct weight save path and dataset path; format README
Merge pull request !1201 from 黄宇豪/master
|
2024-03-30 06:32:18 +00:00 |
|
shishaoyu
|
ce01706c93
|
!1199 [DTS2024032814829] Temporary workaround for loss becoming NaN when repeatedly killing and restarting during stress tests
Merge pull request !1199 from shishaoyu/master
|
2024-03-29 06:09:28 +00:00 |
|
黄宇豪
|
e8ae798db4
|
!1186 Unify weight paths and README style
Merge pull request !1186 from 黄宇豪/master
|
2024-03-28 03:43:17 +00:00 |
|
liuyanghan
|
aa6d2662cc
|
!1177 Document data-loading issues in multi-node training
Merge pull request !1177 from liuyanghan/master
|
2024-03-28 01:07:00 +00:00 |
|
wwzhuo
|
24c423201b
|
!1152 Update llama2 readme; correct tokenizer description
* Update the description of the fine-tuning tokenizer change in the llama2 readme
|
2024-03-27 08:25:03 +00:00 |
|