LlamaFactory Fine-Tuning Commands
Published: 2026/5/24 13:20:50
We provide a variety of example scripts for fine-tuning large models.

Make sure to run the commands below from the LLaMA-Factory directory.

Contents

- LoRA Fine-Tuning
- QLoRA Fine-Tuning
- Full-Parameter Fine-Tuning
- Merging LoRA Adapters and Quantizing Models
- Inference with LoRA Models
- Extras

Use CUDA_VISIBLE_DEVICES (GPU) or ASCEND_RT_VISIBLE_DEVICES (NPU) to select compute devices. By default, LLaMA-Factory uses all visible compute devices.

Basic usage:

    llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

Advanced usage:

    CUDA_VISIBLE_DEVICES=0,1 llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml \
        learning_rate=1e-5 \
        logging_steps=1

    bash examples/train_lora/qwen3_lora_sft.sh

Examples

LoRA Fine-Tuning

(Continued) pre-training:

    llamafactory-cli train examples/train_lora/qwen3_lora_pretrain.yaml

Supervised fine-tuning:

    llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

Multimodal supervised fine-tuning:

    llamafactory-cli train examples/train_lora/qwen3vl_lora_sft.yaml

DPO/ORPO/SimPO training:

    llamafactory-cli train examples/train_lora/qwen3_lora_dpo.yaml

Multimodal DPO/ORPO/SimPO training:

    llamafactory-cli train examples/train_lora/qwen3vl_lora_dpo.yaml

Reward model training:

    llamafactory-cli train examples/train_lora/qwen3_lora_reward.yaml

KTO training:

    llamafactory-cli train examples/train_lora/qwen3_lora_kto.yaml

Preprocessing the dataset:

This is helpful for large datasets. Set tokenized_path in the config to load the preprocessed dataset.

    llamafactory-cli train examples/train_lora/qwen3_preprocess.yaml

Supervised fine-tuning on multiple nodes:

    FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml
    FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

Supervised fine-tuning on multiple nodes with elasticity and fault tolerance:

To launch multi-node supervised fine-tuning with elastic nodes and fault tolerance, run the following command on every node. The number of elastic nodes may range over MIN_NNODES:MAX_NNODES, and each node may restart at most MAX_RESTARTS times after a failure. RDZV_ID should be set to a unique job ID shared by all nodes participating in the job. For more information, see the official torchrun documentation.

    FORCE_TORCHRUN=1 MIN_NNODES=1 MAX_NNODES=3 MAX_RESTARTS=3 RDZV_ID=llamafactory MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/qwen3_full_sft.yaml

Distributing GPU memory evenly with DeepSpeed ZeRO-3:

    FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/qwen3_lora_sft_ds3.yaml

Fine-tuning on 4 GPUs with Ray:

    USE_RAY=1 llamafactory-cli train examples/train_lora/qwen3_lora_sft_ray.yaml

QLoRA Fine-Tuning

Supervised fine-tuning with 4/8-bit Bitsandbytes/HQQ/EETQ quantization (recommended):

    llamafactory-cli train examples/train_qlora/qwen3_lora_sft_otfq.yaml

Supervised fine-tuning with 4-bit Bitsandbytes quantization on NPU:

    llamafactory-cli train examples/train_qlora/qwen3_lora_sft_bnb_npu.yaml

Supervised fine-tuning with 4/8-bit GPTQ quantization:

    llamafactory-cli train examples/train_qlora/llama3_lora_sft_gptq.yaml

Supervised fine-tuning with 4-bit AWQ quantization:

    llamafactory-cli train examples/train_qlora/llama3_lora_sft_awq.yaml

Supervised fine-tuning with 2-bit AQLM quantization:

    llamafactory-cli train examples/train_qlora/llama3_lora_sft_aqlm.yaml

Full-Parameter Fine-Tuning

Supervised fine-tuning on a single node:

    FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen3_full_sft.yaml

Supervised fine-tuning on multiple nodes:

    FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/qwen3_full_sft.yaml
    FORCE_TORCHRUN=1 NNODES=2 NODE_RANK=1 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 llamafactory-cli train examples/train_full/qwen3_full_sft.yaml

Multimodal supervised fine-tuning:

    FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/qwen3vl_full_sft.yaml

Merging LoRA Adapters and Quantizing Models

Merging LoRA adapters:

Note: do not use a quantized model or the quantization_bit parameter when merging LoRA adapters.

    llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml

Quantizing a model with AutoGPTQ:

    llamafactory-cli export examples/merge_lora/qwen3_gptq.yaml

Saving an Ollama configuration file:

    llamafactory-cli export examples/merge_lora/qwen3_full_sft.yaml

Inference with LoRA Models

Multi-GPU inference and evaluation with vLLM:

    python scripts/vllm_infer.py --model_name_or_path Qwen/Qwen3-4B-Instruct-2507 --template qwen3_nothink --dataset alpaca_en_demo
    python scripts/eval_bleu_rouge.py generated_predictions.jsonl

Chatting in the command line:

    llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml

Chatting in the browser:

    llamafactory-cli webchat examples/inference/qwen3_lora_sft.yaml

Launching an OpenAI-style API:

    llamafactory-cli api examples/inference/qwen3_lora_sft.yaml

Extras

Full-parameter training with GaLore:

    llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml

Full-parameter training with APOLLO:

    llamafactory-cli train examples/extras/apollo/llama3_full_sft.yaml

Full-parameter training with BAdam:

    llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml

Full-parameter training with Adam-mini:

    llamafactory-cli train examples/extras/adam_mini/qwen2_full_sft.yaml

Full-parameter training with Muon:

    llamafactory-cli train examples/extras/muon/qwen2_full_sft.yaml

LoRA+ fine-tuning:

    llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml

PiSSA fine-tuning:

    llamafactory-cli train examples/extras/pissa/llama3_lora_sft.yaml

Mixture-of-Depths fine-tuning:

    llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml

LLaMA-Pro fine-tuning:

    bash examples/extras/llama_pro/expand.sh
    llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml

FSDP+QLoRA fine-tuning:

    bash examples/extras/fsdp_qlora/train.sh

OFT fine-tuning:

    llamafactory-cli train examples/extras/oft/llama3_oft_sft.yaml

QOFT fine-tuning:

    llamafactory-cli train examples/extras/qoft/llama3_oft_sft_bnb_npu.yaml
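
A note on the multi-node supervised fine-tuning commands in the LoRA and full-parameter sections: the command run on each node differs only in NODE_RANK. The sketch below is a dry-run helper that prints the command for a given node instead of executing it, which can help when checking the environment variables before a real launch. The function name build_launch_cmd and its argument order are hypothetical and not part of LLaMA-Factory; the environment variable names, address, port, and config path are taken from the examples in this post.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of LLaMA-Factory): prints the
# multi-node launch command for one node rather than running it.
# Arguments: node_rank [nnodes] [master_addr] [master_port] [config]
build_launch_cmd() {
    local node_rank="$1"
    local nnodes="${2:-2}"
    local master_addr="${3:-192.168.0.1}"
    local master_port="${4:-29500}"
    local config="${5:-examples/train_lora/qwen3_lora_sft.yaml}"
    printf 'FORCE_TORCHRUN=1 NNODES=%s NODE_RANK=%s MASTER_ADDR=%s MASTER_PORT=%s llamafactory-cli train %s\n' \
        "$nnodes" "$node_rank" "$master_addr" "$master_port" "$config"
}

# Print the command for each node of a 2-node job (ranks 0 and 1).
for rank in 0 1; do
    build_launch_cmd "$rank"
done
```

Copy each printed line to the matching node and run it there. MASTER_ADDR and MASTER_PORT must be the same on every node, and only NODE_RANK changes.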