[2403.16851] Can tweets predict article retractions? A comparison between human and LLM labelling