Preventing RNN from Using Sequence Length as a Feature

Baillargeon, Jean-Thomas; Cossette, Hélène; Lamontagne, Luc

arXiv:2212.08276 (cs)

[Submitted on 16 Dec 2022]

Abstract: Recurrent neural networks (RNNs) are deep learning architectures that can be trained to classify long documents. However, in our recent work, we found a critical problem with these networks: they can use the length differences between texts of different classes as a prominent classification feature. The resulting models are brittle under concept drift, can report misleading performance, and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution using weight decay regularization.
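
The length shortcut described in the abstract can be illustrated with a small sketch. This is not the authors' experiment and involves no RNN: it assumes a hypothetical synthetic corpus in which tokens are pure noise and only the document length depends on the class, then scores a classifier that looks at length alone. The names `make_docs`, `fit_length_threshold`, and `accuracy`, as well as the length means and standard deviation, are illustrative assumptions.

```python
import random

def make_docs(n, mean_len_by_class, rng, vocab_size=50):
    """Build (tokens, label) pairs; tokens are noise, only length depends on the label."""
    docs = []
    for _ in range(n):
        label = rng.randint(0, 1)
        length = max(1, int(rng.gauss(mean_len_by_class[label], 3)))
        tokens = [rng.randrange(vocab_size) for _ in range(length)]
        docs.append((tokens, label))
    return docs

def fit_length_threshold(docs):
    """Use the midpoint between the two classes' mean lengths as the decision threshold."""
    means = []
    for c in (0, 1):
        lengths = [len(tokens) for tokens, y in docs if y == c]
        means.append(sum(lengths) / len(lengths))
    return (means[0] + means[1]) / 2

def accuracy(docs, threshold):
    """Predict class 1 whenever a document is longer than the threshold."""
    correct = sum((len(tokens) > threshold) == bool(y) for tokens, y in docs)
    return correct / len(docs)

rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_docs(1000, {0: 20, 1: 40}, rng)      # class 1 documents are longer
same_dist = make_docs(1000, {0: 20, 1: 40}, rng)  # test set with the same length gap
drifted = make_docs(1000, {0: 30, 1: 30}, rng)    # assumed drift: length gap disappears

threshold = fit_length_threshold(train)
acc_same = accuracy(same_dist, threshold)
acc_drifted = accuracy(drifted, threshold)
print(f"same length distribution: {acc_same:.2f}")
print(f"equal lengths (drifted):  {acc_drifted:.2f}")
```

Under these assumptions, the length-only rule scores very well while the length gap holds and falls to roughly chance once it disappears, even though the tokens never carried any class information. This is the kind of brittleness and misleading performance the abstract warns about.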

Comments:	6 pages, but my Overleaf generates 5 pages. I have no errors; the font size seems different
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2212.08276 [cs.LG]
	(or arXiv:2212.08276v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2212.08276

Submission history

From: Jean-Thomas Baillargeon
[v1] Fri, 16 Dec 2022 04:23:36 UTC (1,516 KB)
