Automatic Assessment of Absolute Sentence Complexity

Sanja Stajner, Simone Paolo Ponzetto, Heiner Stuckenschmidt

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 4096-4102. https://doi.org/10.24963/ijcai.2017/572

Lexically and syntactically simpler sentences lead to shorter reading times and better comprehension for many readers. However, no reliable systems for the automatic assessment of absolute sentence complexity have been proposed so far. Instead, assessment is usually done manually and requires expert human annotators. To address this problem, we first define sentence complexity assessment as a five-level classification task and build a ‘gold standard’ dataset. Next, we propose robust systems for sentence complexity assessment that use a novel set of features derived from lexical properties of freely available corpora, and we investigate the impact of feature type and corpus size on classification performance.
Keywords:
Natural Language Processing: Resources and Evaluation
Natural Language Processing: Text Classification