[2408.10141] Instruction Finetuning for Leaderboard Generation from Empirical AI Research