kyungeun/gemma-2-9b-it-mathinstruct-dpo was trained on the UltraFeedback dataset, starting from kyungeun/gemma-2-9b-it-mathinstruct, which was itself trained on the MathInstruct dataset.
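
The "-dpo" suffix in the model name suggests that the UltraFeedback stage used Direct Preference Optimization (DPO), although the training code and hyperparameters are not stated here. Assuming that is the case, the sketch below shows the standard per-example DPO loss in plain Python. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, not values taken from this model's training run.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (illustrative sketch, not this model's training code).

    Each *_logp argument is the summed log-probability of a full response:
    "policy" is the model being trained, "ref" is the frozen starting model.
    beta=0.1 is an assumed default, not a documented hyperparameter.
    """
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between the chosen and rejected responses.
    margin = beta * (chosen_reward - rejected_reward)

    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

In this reading, the reference model would be kyungeun/gemma-2-9b-it-mathinstruct, and each UltraFeedback preference pair supplies one chosen and one rejected response. The loss falls below log 2 once the policy favors the chosen response more strongly than the reference does.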