Abstract
Predicting movie genres is an intriguing problem with several applications, including designing recommendation systems for audiences, analyzing box office performance, and understanding a movie's theme. It is a classic multi-label classification problem. This paper proposes a movie genre detection algorithm built on a previously unused source: the movie's subtitles, which are a documented account of its visual content and dialogue. The basic idea is to identify words that occur with high frequency in a particular genre and use them as features for training machine learning classification models. The algorithm was tested on the English subtitles of 964 movies spanning six genres: Action, Fantasy, Horror, Romance, Sports and War. Experiments were conducted with varying numbers of features and six machine learning models. The best result was obtained with K-Nearest Neighbour (kNN), which achieved an average precision of 77.7% across all genres using 200 features. Another noteworthy result was an average precision of 75.2% using kNN with only 50 features. The algorithm performed particularly well for the Sports and War genres, with precision above 90% in some cases.
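The basic idea in the abstract (pick words that are frequent within a genre, use them as features, classify with kNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tiny stop-word list, the relative-frequency weighting, the Euclidean distance and the majority-vote rule for assigning multiple labels are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOPWORDS = frozenset({"the", "to", "in", "a", "of", "and"})


def tokenize(text):
    """Lowercase the text and return its word tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_words_per_genre(docs, n_per_genre):
    """Collect the n most frequent words of each genre into one vocabulary.

    docs is a list of (subtitle_text, set_of_genres) pairs. A movie with
    several genres contributes its words to every one of them.
    """
    counts = {}
    for text, genres in docs:
        tokens = tokenize(text)
        for genre in genres:
            counts.setdefault(genre, Counter()).update(tokens)
    vocab = set()
    for counter in counts.values():
        vocab.update(word for word, _ in counter.most_common(n_per_genre))
    return sorted(vocab)


def vectorize(text, vocab):
    """Represent a text by the relative frequency of each vocabulary word."""
    counter = Counter(tokenize(text))
    total = sum(counter.values()) or 1
    return [counter[word] / total for word in vocab]


def knn_predict(train_vecs, train_labels, query_vec, k=3):
    """Multi-label kNN: keep every genre carried by more than half of the k nearest movies."""
    nearest = sorted(
        (math.dist(vec, query_vec), i) for i, vec in enumerate(train_vecs)
    )[:k]
    votes = Counter(g for _, i in nearest for g in train_labels[i])
    return {genre for genre, n in votes.items() if n > k / 2}


# Hypothetical toy data standing in for real subtitle files.
TOY_DOCS = [
    ("the army fought the war soldiers battle", {"War"}),
    ("soldiers marched to battle in the war", {"War"}),
    ("the team won the match goal score", {"Sports"}),
    ("coach team score goal in the match", {"Sports"}),
]

vocab = top_words_per_genre(TOY_DOCS, n_per_genre=3)
train_vecs = [vectorize(text, vocab) for text, _ in TOY_DOCS]
train_labels = [genres for _, genres in TOY_DOCS]
predicted = knn_predict(train_vecs, train_labels, vectorize("soldiers in battle war", vocab))
```

In this sketch the vocabulary size grows with `n_per_genre` and the number of genres, which loosely mirrors the abstract's experiments with varying numbers of features. How the paper actually scores and selects words is not stated in the abstract.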
Code Availability
The code can be shared if required.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Availability of data and material
All data has been taken from yifysubtitles.com.
Appendix
About this article
Cite this article
Rajput, N.K., Grover, B.A. A multi-label movie genre classification scheme based on the movie’s subtitles. Multimed Tools Appl 81, 32469–32490 (2022). https://doi.org/10.1007/s11042-022-12961-6
DOI: https://doi.org/10.1007/s11042-022-12961-6