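| Approach               | Full-tuning | Freeze-tuning | LoRA | QLoRA |
| ---------------------- | ----------- | ------------- | ---- | ----- |
| Pre-Training           | ✅          | ✅            | ✅   | ✅    |
| Supervised Fine-Tuning | ✅          | ✅            | ✅   | ✅    |
| Reward Modeling        | ✅          | ✅            | ✅   | ✅    |
| PPO Training           | ✅          | ✅            | ✅   | ✅    |
| DPO Training           | ✅          | ✅            | ✅   | ✅    |
| KTO Training           | ✅          | ✅            | ✅   | ✅    |
| ORPO Training          | ✅          | ✅            | ✅   | ✅    |
| SimPO Training         | ✅          | ✅            | ✅   | ✅    |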
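
Each pairing of a training approach with a tuning method listed above corresponds to a small set of training options. The sketch below shows one way to express that pairing as a plain Python dictionary. The helper `build_config` and the two lookup tables are hypothetical and not part of LLaMA-Factory; the key names (`stage`, `finetuning_type`, `quantization_bit`, `pref_loss`) and their values are assumptions based on commonly used LLaMA-Factory options and should be checked against the version you have installed.

```python
# Hypothetical lookup tables; key names and values are assumptions,
# not taken from this document. Verify them against your LLaMA-Factory version.
STAGES = {
    "Pre-Training": {"stage": "pt"},
    "Supervised Fine-Tuning": {"stage": "sft"},
    "Reward Modeling": {"stage": "rm"},
    "PPO Training": {"stage": "ppo"},
    "DPO Training": {"stage": "dpo"},
    "KTO Training": {"stage": "kto"},
    # Assumed: ORPO and SimPO are selected as preference losses of the DPO stage.
    "ORPO Training": {"stage": "dpo", "pref_loss": "orpo"},
    "SimPO Training": {"stage": "dpo", "pref_loss": "simpo"},
}

METHODS = {
    "Full-tuning": {"finetuning_type": "full"},
    "Freeze-tuning": {"finetuning_type": "freeze"},
    "LoRA": {"finetuning_type": "lora"},
    # Assumed: QLoRA is LoRA on top of a quantized base model (4-bit here).
    "QLoRA": {"finetuning_type": "lora", "quantization_bit": 4},
}


def build_config(approach: str, method: str) -> dict:
    """Merge the options for one table row (approach) and one column (method)."""
    if approach not in STAGES:
        raise ValueError(f"unknown approach: {approach!r}")
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return {**STAGES[approach], **METHODS[method]}


if __name__ == "__main__":
    print(build_config("Supervised Fine-Tuning", "LoRA"))
```

A real run also needs a model, a dataset, and an output directory; those are left out here because this section does not specify them.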