Mitigation
AML.M0022: Generative AI Model Alignment
What it is
When training or fine-tuning a generative AI model, use techniques that improve the model's alignment with safety, security, and content policies.
Fine-tuning can remove safety mechanisms built into a generative AI model. Techniques such as Supervised Fine-Tuning, Reinforcement Learning from Human Feedback (RLHF) or from AI Feedback, and Targeted Safety Context Distillation can improve the model's safety and alignment.
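As an illustration of one of these techniques, the following is a minimal sketch of how training pairs for targeted safety context distillation might be assembled. It assumes the commonly described recipe: generate a response with a safety preprompt prepended, then keep that response as a fine-tuning target for the bare prompt only when a scoring model rates it above the unguided response. The names `build_distillation_pairs`, `generate`, `score`, and the preprompt wording are hypothetical and stand in for whatever model and safety scorer are actually in use.

```python
from typing import Callable, Dict, List

# Hypothetical preprompt wording; a real one would reflect the applicable policy.
SAFETY_PREPROMPT = "You are a safe and responsible assistant. "


def build_distillation_pairs(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    preprompt: str = SAFETY_PREPROMPT,
) -> List[Dict[str, str]]:
    """Build supervised fine-tuning pairs via safety context distillation.

    generate(text) -> model response (assumed interface)
    score(prompt, response) -> higher means safer (assumed interface)
    """
    pairs: List[Dict[str, str]] = []
    for prompt in prompts:
        baseline = generate(prompt)            # response without safety context
        guided = generate(preprompt + prompt)  # response with safety context
        # "Targeted": keep the guided response only where it actually improves
        # the safety score, so already-good answers are not degraded.
        if score(prompt, guided) > score(prompt, baseline):
            # The stored prompt omits the preprompt, so fine-tuning teaches the
            # model to produce the safer response without the extra context.
            pairs.append({"prompt": prompt, "response": guided})
    return pairs
```

The resulting pairs would then be mixed into an ordinary supervised fine-tuning dataset. The comparison step matters because an unconditional preprompt can make responses to benign prompts needlessly evasive.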