BiBERT-AV: Enhancing Authorship Verification Through Siamese Networks with Pre-trained BERT and Bi-LSTM

  • Conference paper
Ubiquitous Security (UbiSec 2023)

Abstract

Authorship verification, the task of determining whether two texts were written by the same author, is a challenging problem in natural language processing (NLP). It is crucial in security and forensics, where it helps identify authors and combat fake news. Recent neural network models have shown promising results in improving verification accuracy. This paper presents BiBERT-AV, a novel authorship verification model that employs a Siamese network with pre-trained BERT and Bi-LSTM layers, and compares it with existing methods that rely on domain knowledge and feature engineering. The results demonstrate that BiBERT-AV verifies authorship effectively based solely on the author's writing style and outperforms the baselines and state-of-the-art methods. The model also offers a viable alternative to methods that depend on laborious feature engineering, which often demands significant time and expertise. Notably, BiBERT-AV consistently achieves a notable level of accuracy even when the number of authors is expanded to a larger group, in contrast to the limitations exhibited by the baseline model used in existing research studies. Overall, the study provides valuable insights into applying Siamese networks with pre-trained BERT and Bi-LSTM layers to authorship verification, contributes to the advancement of NLP research, and has implications for several real-world applications.
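
The general shape of a Siamese verifier of the kind the abstract describes can be sketched in PyTorch as follows. This is only an illustrative sketch, not the authors' implementation: the class name `SiameseBiLSTM`, the layer sizes, the mean pooling, and the absolute-difference comparison are all assumptions, and a small `nn.Embedding` layer stands in for pre-trained BERT so the example runs without downloading weights.

```python
import torch
import torch.nn as nn


class SiameseBiLSTM(nn.Module):
    """Hypothetical Siamese verifier: shared encoder -> Bi-LSTM -> pair score."""

    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=16):
        super().__init__()
        # Assumption: a plain embedding layer stands in for pre-trained BERT
        # token representations to keep the sketch self-contained.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # The Bi-LSTM output concatenates both directions: 2 * hidden_dim.
        self.classifier = nn.Linear(2 * hidden_dim, 1)

    def encode(self, token_ids):
        # Both texts pass through the same weights (the "Siamese" property).
        outputs, _ = self.bilstm(self.embed(token_ids))
        # Mean-pool over the sequence dimension to get one vector per text.
        return outputs.mean(dim=1)

    def forward(self, text_a, text_b):
        vec_a = self.encode(text_a)
        vec_b = self.encode(text_b)
        # Assumption: compare the two style vectors by absolute difference,
        # then map to a same-author probability in [0, 1].
        logits = self.classifier(torch.abs(vec_a - vec_b))
        return torch.sigmoid(logits).squeeze(-1)


torch.manual_seed(0)
model = SiameseBiLSTM()
model.eval()

# Two batches of 4 token-id sequences of length 12 (random placeholder data).
batch_a = torch.randint(1, 1000, (4, 12))
batch_b = torch.randint(1, 1000, (4, 12))

with torch.no_grad():
    scores = model(batch_a, batch_b)          # one score per text pair
    scores_swapped = model(batch_b, batch_a)  # same pairs, reversed order
```

Because both branches share weights and the comparison is an absolute difference, the score does not depend on the order in which the two texts are given. An untrained model such as this one produces arbitrary scores; it only illustrates the data flow.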

Acknowledgements

The authors would like to thank the Deanship of Scientific Research at Shaqra University and the Saudi Arabian Cultural Bureau in London (SACB) for allowing the research to be undertaken.

Author information

Corresponding author

Correspondence to Amirah Almutairi.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Almutairi, A., Kang, B., Al Hashimy, N. (2024). BiBERT-AV: Enhancing Authorship Verification Through Siamese Networks with Pre-trained BERT and Bi-LSTM. In: Wang, G., Wang, H., Min, G., Georgalas, N., Meng, W. (eds) Ubiquitous Security. UbiSec 2023. Communications in Computer and Information Science, vol 2034. Springer, Singapore. https://doi.org/10.1007/978-981-97-1274-8_2

  • DOI: https://doi.org/10.1007/978-981-97-1274-8_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-1273-1

  • Online ISBN: 978-981-97-1274-8

  • eBook Packages: Computer Science, Computer Science (R0)
