LLaMA-Factory-2.md
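| Approach               | Full-tuning | Freeze-tuning | LoRA | QLoRA |
| ---------------------- | ----------- | ------------- | ---- | ----- |
| Pre-Training           |             |               |      |       |
| Supervised Fine-Tuning |             |               |      |       |
| Reward Modeling        |             |               |      |       |
| PPO Training           |             |               |      |       |
| DPO Training           |             |               |      |       |
| KTO Training           |             |               |      |       |
| ORPO Training          |             |               |      |       |
| SimPO Training         |             |               |      |       |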