Last active 1728786012

Revision 5aaa9c49f4b3130408b5d86e4e5def4c91fb3588

LLaMA-Factory-2.md
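| Approach               | Full-tuning | Freeze-tuning | LoRA | QLoRA |
| ---------------------- | ----------- | ------------- | ---- | ----- |
| Pre-Training           | ✅          | ✅            | ✅   | ✅    |
| Supervised Fine-Tuning | ✅          | ✅            | ✅   | ✅    |
| Reward Modeling        | ✅          | ✅            | ✅   | ✅    |
| PPO Training           | ✅          | ✅            | ✅   | ✅    |
| DPO Training           | ✅          | ✅            | ✅   | ✅    |
| KTO Training           | ✅          | ✅            | ✅   | ✅    |
| ORPO Training          | ✅          | ✅            | ✅   | ✅    |
| SimPO Training         | ✅          | ✅            | ✅   | ✅    |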