ModelLink2/examples/mcore
wucong b7dce4d1e0 !1917 Add branch and tag documentation
Merge pull request !1917 from wucong/addBranchReadme
2024-11-25 10:37:48 +00:00
baichuan2 !1893 optim: improve SFT fine-tuning performance 2024-11-15 04:34:08 +00:00
chatglm3 !1893 optim: improve SFT fine-tuning performance 2024-11-15 04:34:08 +00:00
codellama !1814 refactor trainer 2024-11-06 10:53:02 +00:00
deepseek2 !1844 [mcore-llm] Adapt the MLA structure and group_limited_greedy to the standard CP workflow 2024-11-05 07:18:18 +00:00
deepseek2_coder !1511 refactor: support Deepseek Specification 2024-10-21 07:57:37 +00:00
deepseek2_lite !1814 refactor trainer 2024-11-06 10:53:02 +00:00
gemma !1745 Add baichuan2 full-parameter fine-tuning script and corresponding template 2024-11-04 13:03:12 +00:00
gemma2 !1814 refactor trainer 2024-11-06 10:53:02 +00:00
glm4 !1814 refactor trainer 2024-11-06 10:53:02 +00:00
gpt4 !1601 Add gpt4 MoE dropless 2024-09-04 01:13:13 +00:00
grok1 !1619 Add glm4 model adaptation 2024-09-05 13:05:30 +00:00
internlm2 !1917 Add branch and tag documentation 2024-11-25 10:37:48 +00:00
internlm25 !1915 Fix issues in the InternLM pre-training script and user guide 2024-11-22 06:29:09 +00:00
llama2 !1911 Add LoRA weight conversion scripts for llama2 and mixtral 2024-11-25 01:23:41 +00:00
llama3 !1858 Feature support for DPO and SimPO: VPP, DPP, EP, CP, resumable training, etc. 2024-11-21 03:31:39 +00:00
llama31 !1858 Feature support for DPO and SimPO: VPP, DPP, EP, CP, resumable training, etc. 2024-11-21 03:31:39 +00:00
llama32 !1744 Add llama3.2-1b model adaptation 2024-10-08 06:40:24 +00:00
minicpm !1814 refactor trainer 2024-11-06 10:53:02 +00:00
mistral !1814 refactor trainer 2024-11-06 10:53:02 +00:00
mixtral !1911 Add LoRA weight conversion scripts for llama2 and mixtral 2024-11-25 01:23:41 +00:00
qwen2 !1904 Add Qwen2.5-72B model 2024-11-18 08:59:00 +00:00
qwen2_moe !1707 Add new model Qwen2-57B-A14B 2024-09-24 14:40:52 +00:00
qwen15 !1806 Optim: improve pre-training performance for llama3 and qwen series models 2024-11-18 08:29:28 +00:00
qwen25 !1904 Add Qwen2.5-72B model 2024-11-18 08:59:00 +00:00
qwen25_coder !1830 Adapt the Qwen2.5 code model 2024-11-19 03:42:05 +00:00
qwen25_math !1913 Adapt the Qwen2.5-Math series models 2024-11-22 06:13:09 +00:00
yi !1676 Adapt legacy model Qwen1.5-32B to mcore 2024-09-14 01:01:27 +00:00
yi15 !1916 Update yi1.5-6b best performance 2024-11-22 02:39:13 +00:00