Explainable recommendations with nonnegative matrix factorization

Abstract

Explainable recommendation systems have been shown to improve the persuasiveness of recommendations, enabling users to trust the system more and make better-informed decisions. Nonnegative Matrix Factorization (NMF) produces interpretable solutions for many applications, including collaborative filtering, because of its nonnegativity. However, the latent features make it difficult to explain recommendation results to users, since we do not know the specific meaning of the features that users are interested in, nor the extent to which the items or users belong to these features. To overcome this difficulty, we develop a novel method called Partially Explainable Nonnegative Matrix Factorization (PE-NMF), which employs explicit data to replace part of the latent variables of the item-feature matrix, so that users can learn more about the features of the items and then make better decisions and recommendations. The objective function of PE-NMF is composed of two parts: one corresponding to explicit features and the other to implicit features. We develop an iterative method to minimize the objective function and derive the iterative update rules, under which the objective function is proved to be non-increasing. Finally, experiments are carried out on the Yelp, Amazon and Dianping datasets, and the results demonstrate that PE-NMF maintains high prediction performance on both rating prediction and top-N recommendation compared with fully explainable nonnegative matrix factorization (FE-NMF), which is obtained by using explicit opinions in place of the item-feature matrix. PE-NMF also retains almost the same recommendation ability as NMF.

Notes

  1. Amazon data set can be found at http://jmcauley.ucsd.edu/data/amazon/.

  2. Yelp data set can be found at https://www.yelp.com/dataset.

  3. Dianping data set can be found at https://doi.org/10.18170/DVN/GCIUN4.

References

  • Abdollahi B, Nasraoui O (2016) Explainable matrix factorization for collaborative filtering, Proceedings of the 25th international conference companion on world wide web. International world wide web conferences steering committee. WWW’16 Companion, April 11–15, Montreal, Quebec, Canada

  • Abdollahi B, Nasraoui O (2017) Using explainability for constrained matrix factorization. In: Proceedings of the eleventh ACM conference on recommender systems, pp 79–83

  • Adeel A, Khalid S, Osman K (2021) On deep neural network for trust aware cross domain recommendations in E-commerce. Expert Syst Appl 174:114757

  • Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17:734–749

  • Aghdam MH (2022) A novel constrained non-negative matrix factorization method based on users and items pairwise relationship for recommender systems. Expert Syst Appl 195:116593

  • Behera G, Nain N (2022) DeepNNMF: deep nonlinear non-negative matrix factorization to address sparsity problem of collaborative recommender system. Int J Inf Technol 14:3637–3645

  • Belkin M, Niyogi P, Sindhwani V (2004) Manifold regularization: a geometric framework for learning from examples. J Mach Learn Res 7:2399–2434

  • Bobadilla J, Serradilla F, Hernando A (2009) Collaborative filtering adapted to recommender systems of e-learning. Knowl-Based Syst 22:261–265

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the Em algorithm. J R Stat Soc 39:1–38

  • Fatemeh R, Chitra D (2021) A survey of attack detection approaches in collaborative filtering recommender systems. Artif Intell Rev 54:2011–2066

  • Hernando A, Bobadilla J, Ortega F (2016) A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. Knowl-Based Syst 97:188–202

  • Hongyan C (2020) Personalized recommendation of film and television culture based on an intelligent classification algorithm. Pers Ubiquitous Comput 24:165–176

  • Hua LP, Jing BW, Zhi JZ (2021) A movie recommendation model combining time information and probability matrix factorisation. Int J Embed Syst 14:239–247

  • Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, pp 135–142

  • Ji Z, Pi H, Wei W (2019) Recommendation based on review texts and social communities: a hybrid model. IEEE Access 103:40416–40427

  • Jiang M, Cui PR (2012) Social contextual recommendation. In: 21st ACM international conference on information and knowledge management, pp 45–54

  • Khan Z, Iltaf N, Afzal H, Abbas H (2020) Enriching non-negative matrix factorization with contextual embeddings for recommender systems. Neurocomputing 380:246–258

  • Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42:30–37

  • Lee DD, Seung HS (1999) Learning the parts of objects by nonnegative matrix factorization. Nature 401:788–791

  • Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. Nips 13:556–562

  • Lee SK, Cho YH, Kim SH (2020) Collaborative filtering with ordinal scale-based implicit ratings for mobile music recommendations. Inf Sci 180:2142–2155

  • Li L, Zhang YJ (2009) FastNMF: highly efficient monotonic fixed-point nonnegative matrix factorization algorithm with good applicability. J Electron Imaging 18:273–288

  • Li H, Li K, An J et al (2019) An efficient manifold regularized sparse non-negative matrix factorization model for large-scale recommender systems on GPUs. Inf Sci 496:464–484

  • Lu Y, Castellanos M, Dayal U, Zhai CX (2011) Automatic construction of a context-aware sentiment lexicon: an optimization approach. In: International conference on world wide web, pp 347–356

  • Luo X, Zhou M, Xia Y, Zhu Q (2014) An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems. IEEE Trans Ind Inf 10:1273–1284

  • Mao Y, Lawrence KS (2004) Modeling distances in large-scale networks by matrix factorization. In: Proceedings of the 4th ACM SIGCOMM conference on Internet measurement, pp 278–287

  • Núñez-Valdez ER, Cueva-Lovelle JM, Sanjuán-Martínez O (2012) Implicit feedback techniques on recommender systems applied to electronic books. Comput Hum Behav 28:1186–1193

  • Paatero P, Tapper U (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5:111–126

  • Peng S, Ser W, Chen B, Sun L, Lin Z (2020) Robust nonnegative matrix factorization with local coordinate constraint for image clustering. Eng Appl Artif Intell 88:1–12

  • Qian XM, Feng H, Zhao GS, Mei T (2014) Personalized recommendation combining user interest and social circle. IEEE Trans Knowl Data Eng 26:1763–1777

  • Salakhutdinov R, Mnih A (2008) Probabilistic matrix factorization. Adv Neural Inf Process Syst 20:1257–1264

  • Saul L, Pereira F (1997) Aggregate and mixed-order Markov models for statistical language processing. In: Proceedings of the second conference on empirical methods in natural language processing, ACL Press, pp 81–89

  • Song W, Li X (2019) A non-negative matrix factorization for recommender systems based on dynamic bias. In: International conference on modeling decisions for artificial intelligence, pp 151–163

  • Tao YY, Jia YL, Wang N (2019) The FacT: taming latent factor models for explainability with factorization trees. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 295–304

  • Vaccari I, Carlevaro A, Narteni S, Cambiaso E, Mongelli M (2022) eXplainable and reliable against adversarial machine learning in data analytics. Proc IEEE Access 10:83949–83970

  • Vig J, Sen S, Riedl J (2009) Tagsplanations: explaining recommendations using tags. In: Proceedings of the 14th international conference on intelligent user interfaces, pp 47–56

  • Wang CP, Song XL, Zhang JS (2018a) Graph regularized nonnegative matrix factorization with sample diversity for image representation. Eng Appl Artif Intell 68:32–39

  • Wang N, Wang H, Jia Y, Yin Y (2018b) Explainable recommendation via multi-task learning in opinionated text data. In: The 41st international ACM SIGIR conference on research & development in information retrieval, ACM, pp 165–174

  • Wang J, Zhu L, Dai T, Xu QN, Gao TY (2021) Low-rank and sparse matrix factorization with prior relations for recommender systems. Appl Intell 51:3435–3449

  • Wu Q, Tan M, Li X, Min H, Sun N (2015) NMFE-SSCC: non-negative matrix factorization ensemble for semi-supervised collective classification. Knowl-Based Syst 89:160–172

  • Yang XW, Steck H, Liu Y (2012) Circle-based recommendation in online social networks. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 1267–1275

  • Zhang Y, Lai G, Zhang M, Zhang Y, Liu Y, Ma S (2014) Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In: Proceedings of the 37th international ACM SIGIR conference on research & development in information retrieval, pp 83–92

  • Zhang H, Ganchev I, Nikolov NS (2020) Featuremf: an item feature enriched matrix factorization model for item recommendation. IEEE Access 23:1–11

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61936001, 62136002, 62233018, 62221005, 12201089), the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2020jcyj-msxmX0737, cstc2021ycjh-bgzxm0013, cstb2022nscq-msx0226), the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008), and the Science and Technology Research Program of Chongqing Education Commission of China (KJQN201900638, KJQN202200513).

Author information

Corresponding author

Correspondence to Yanjun Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The objective function \(O_{EF}\) in Eq. (3) is formulated with the Frobenius norm, and the proof of Theorem 1 follows Lee and Seung (1999). In fact, \(O_{EF}\) is obtained from \(O_{1}\) by fixing the item-feature matrix, so that \(O_{EF}\) depends on \({\textbf {U}}\) alone. Therefore, we can minimize \(O_{EF}\) by taking its derivative with respect to \({\textbf {U}}\).

Proof of Theorem 1

As

$$\begin{aligned} O_{EF}&=\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}-2r){\mathcal {P}}_{\Omega ^{T}}({\textbf {X}}^{T}))+2r\text{Tr}({\textbf {I}}{\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T}))+2r^{2}\text{Tr}({\textbf {I}}{\textbf {I}}^{T}) \\ &\quad -2\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T}))+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}{\textbf {V}}_{F}^{T}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T})), \end{aligned}$$

let \(\psi _{ik}\) be the Lagrange multiplier for the constraint \(u_{ik}\ge 0\), and \(\Psi =[\psi _{ik}]\). Then the Lagrangian function \({\mathcal {L}}_{O_{EF}}\) is

$$\begin{aligned} {\mathcal {L}}_{O_{EF}}&=\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}-2r){\mathcal {P}}_{\Omega ^{T}}({\textbf {X}}^{T}))+2r\text{Tr}({\textbf {I}}{\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T}))+2r^{2}\text{Tr}({\textbf {I}}{\textbf {I}}^{T}) \\ &\quad -2\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T}))+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}{\textbf {V}}_{F}^{T}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{F}{\textbf {U}}^{T}))+\text{Tr}(\Psi {\textbf {U}}^{T}). \end{aligned}$$

The derivative of \({\mathcal {L}}_{O_{EF}}\) with respect to \({\textbf {U}}\) is:

$$\begin{aligned} \frac{\partial {\mathcal {L}}_{O_{EF}}}{\partial {\textbf {U}}}=-2{\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{F}+2r{\textbf {I}}{\textbf {V}}_{F}+2{\mathcal {P}}_{\Omega }({\textbf {U}}{\textbf {V}}_{F}^{T}){\textbf {V}}_{F}+\Psi . \end{aligned}$$

Using the KKT condition \(\psi _{ik}u_{ik}=0\), we obtain the following equality for \(u_{ik}\):

$$\begin{aligned} -\big ({\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{F}\big )_{ik}u_{ik}+r\big ({\textbf {I}}{\textbf {V}}_{F}\big )_{ik}u_{ik}+\big ({\mathcal {P}}_{\Omega }({\textbf {U}}{\textbf {V}}_{F}^{T}){\textbf {V}}_{F}\big )_{ik}u_{ik}=0, \end{aligned}$$

which leads to the following updating formula:

$$\begin{aligned} u_{ik}\leftarrow u_{ik}\frac{\Big ({\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{F}\Big )_{ik}}{\Big ({\mathcal {P}}_{\Omega }({\textbf {U}}{\textbf {V}}_{F}^{T}+r){\textbf {V}}_{F}\Big )_{ik}}. \end{aligned}$$

Thus, Theorem 1 is proved. \(\square\)
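
For intuition, the update of Theorem 1 translates directly into a few lines of array code. The following NumPy sketch applies the multiplicative rule above with the explicit item-feature matrix \({\textbf {V}}_{F}\) held fixed; the function name, the binary mask `M` standing in for \({\mathcal {P}}_{\Omega }\), the scalar treatment of \(r\), and the constant `eps` guarding against division by zero are illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def fe_nmf_update_U(X, M, U, V_F, r=0.0, n_iter=100, eps=1e-12):
    """Sketch of the Theorem 1 multiplicative update for U (V_F fixed).

    X   : (m, n) rating matrix; only entries with M == 1 are observed
    M   : (m, n) binary mask playing the role of P_Omega
    U   : (m, k) nonnegative user-feature matrix, updated in place
    V_F : (n, k) fixed explicit item-feature matrix
    r   : offset constant from the derivation (assumed scalar here)
    """
    MX = M * X                                     # P_Omega(X)
    for _ in range(n_iter):
        numer = MX @ V_F                           # (P_Omega(X) V_F)_{ik}
        denom = (M * (U @ V_F.T + r)) @ V_F + eps  # (P_Omega(U V_F^T + r) V_F)_{ik}
        U *= numer / denom                         # element-wise multiplicative update
    return U
```

Because the rule is multiplicative, a nonnegative initialization of \({\textbf {U}}\) remains nonnegative throughout the iterations.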

The objective function \(O_{PE}\) in Eq. (5) is formulated with the Frobenius norm, so \(O_{PE}\) is certainly bounded from below by zero. To finish the proof of Theorem 3, we need to show that \(O_{PE}\) is non-increasing under the update steps in Eq. (6). Similar to the procedure described in Lee and Seung (1999), the auxiliary function used in the Expectation-Maximization algorithm (Dempster et al. 1977; Saul and Pereira 1997) can also be used in our proof. The notion of an auxiliary function is given as follows:

Definition 1

(Dempster et al. 1977) \(T(v, v^{'})\) is an auxiliary function for E(v) if the following conditions

$$\begin{aligned} T(v, v^{'})\ge E(v), \quad T(v, v)=E(v) \end{aligned}$$

are satisfied.

The auxiliary function is useful because of the following lemma.

Lemma 1

(Dempster et al. 1977; Saul and Pereira 1997) If T is an auxiliary function of E, then E is non-increasing under the update

$$\begin{aligned} v^{t+1}=\arg \min _{v}T(v, v^{t}). \end{aligned}$$
(7)

Proof

Since \(v^{t+1}=\arg \min \limits _{v}T(v, v^{t})\), we have

$$\begin{aligned} T(v^{t+1}, v^{t})\le T(v^{t}, v^{t}). \end{aligned}$$

Moreover, by Definition 1 we have \(E(v^{t+1})\le T(v^{t+1}, v^{t})\) and \(T(v^{t}, v^{t})=E(v^{t})\), so we get

$$\begin{aligned} E(v^{t+1})\le T(v^{t+1}, v^{t})\le T(v^{t}, v^{t})=E(v^{t}). \end{aligned}$$

In what follows, we show that the update formulas for \({\textbf {U}}_{1}, {\textbf {U}}_{2}, {\textbf {V}}_{2}\) in Eq. (6) are exactly the update in Eq. (7), by defining suitable auxiliary functions \(T(v, v^{t})\) for Eq. (5).

According to the objective function \(O_{PE}\) in Eq. (5), we have

$$\begin{aligned} O_{PE}&=\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}){\mathcal {P}}_{\Omega ^{T}}({\textbf {X}}^{T}))-2r\text{Tr}({\textbf {I}}{\textbf {X}}^{T})+2r\text{Tr}({\textbf {I}}{\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{2}{\textbf {U}}_{2}^{T}))\\&\quad +2r\text{Tr}({\textbf {I}}{\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{1}{\textbf {U}}_{1}^{T}))+2r^{2}\text{Tr}({\textbf {I}}{\textbf {I}}^{T})-2\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{1}{\textbf {U}}_{1}^{T}))\\&\quad -2\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {X}}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{2}{\textbf {U}}_{2}^{T}))+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{1}{\textbf {U}}_{1}^{T}))\\&\quad +\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{2}{\textbf {V}}_{2}^{T}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{2}{\textbf {U}}_{2}^{T}))+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}){\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{2}{\textbf {U}}_{2}^{T})). \end{aligned}$$

Taking

$$\begin{aligned} E_{{\textbf {U}}_{1}}&=2\text{Tr}({\mathcal {P}}_{\Omega }(r{\textbf {I}}-{\textbf {X}}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{1}{} {\textbf {U}}_{1}^{T}))+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}) {\mathcal {P}}_{\Omega ^{T}}({\textbf {V}}_{1}{} {\textbf {U}}_{1}^{T})) \\ {}&\quad+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})),\\ E_{{\textbf {U}}_{2}}&=2\text{Tr}({\mathcal {P}}_{\Omega }(r{\textbf {I}}-{\textbf {X}}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})) +\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{2}{} {\textbf {V}}_{2}^{T}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})) \\ {}&\quad+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})),\\ E_{{\textbf {V}}_{2}}&=2\text{Tr}({\mathcal {P}}_{\Omega }(r{\textbf {I}}-{\textbf {X}}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})) +\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{2}{} {\textbf {V}}_{2}^{T}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})) \\ {}&\quad+\text{Tr}({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}){\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T})), \end{aligned}$$

as the parts of \(O_{PE}\) that depend only on \({\textbf {U}}_{1}\), \({\textbf {U}}_{2}\) and \({\textbf {V}}_{2}\), respectively, where \(k\in K_{1}\) and \(l\in K-K_{1}\). Considering any element \(u_{ik}\) in \({\textbf {U}}_{1}\), \(u_{il}\) in \({\textbf {U}}_{2}\) and \(v_{jl}\) in \({\textbf {V}}_{2}\), we use \(E_{{\textbf {U}}_{1}}(u_{ik})\), \(E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\) to denote the parts of these expressions that involve \(u_{ik}\), \(u_{il}\) and \(v_{jl}\), respectively. It is easy to check that the first-order derivatives of \(E_{{\textbf {U}}_{1}}(u_{ik})\), \(E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\) are as follows:

$$\begin{aligned} E_{{\textbf {U}}_{1}}^{'}(u_{ik})&=\big (-2{\mathcal {P}}_{\Omega } ({\textbf {X}}){\textbf {V}}_{1}+2r{\textbf {IV}}_{1}+2{\mathcal {P}}_{\Omega } ({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}){\textbf {V}}_{1}+2{\mathcal {P}}_{\Omega } ({\textbf {U}}_{2}{} {\textbf {V}}_{2}^{T}){\textbf {V}}_{1}\big )_{ik},\\ E_{{\textbf {U}}_{2}}^{'}(u_{il})&=\big (-2{\mathcal {P}}_{\Omega } ({\textbf {X}}){\textbf {V}}_{2}+2r{\textbf {IV}}_{2}+2{\mathcal {P}}_{\Omega } ({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}){\textbf {V}}_{2}+2{\mathcal {P}}_{\Omega } ({\textbf {U}}_{2}{} {\textbf {V}}_{2}^{T}){\textbf {V}}_{2}\big )_{il},\\ E_{{\textbf {V}}_{2}}^{'}(v_{jl})&=\big (-2{\mathcal {P}}_{\Omega ^{T}} ({\textbf {X}}^{T}){\textbf {U}}_{2}+2r{\textbf {I}}^{T}{} {\textbf {U}}_{2}+2{\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{2}{} {\textbf {U}}_{2}^{T}){\textbf {U}}_{2}+2{\mathcal {P}}_{\Omega ^{T}} ({\textbf {V}}_{1}{} {\textbf {U}}_{1}^{T}){\textbf {U}}_{2}\big )_{jl}. \end{aligned}$$

Similarly, the second-order derivatives of \(E_{{\textbf {U}}_{1}}(u_{ik})\), \(E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\) are given as

$$\begin{aligned} E_{{\textbf {U}}_{1}}^{''}(u_{ik})=2\big ({\textbf {V}}_{1}^{T}{} {\textbf {V}}_{1}\big )_{kk}, \quad E_{{\textbf {U}}_{2}}^{''}(u_{il})=2\big ({\textbf {V}}_{2}^{T}{} {\textbf {V}}_{2}\big )_{ll}, \quad E_{{\textbf {V}}_{2}}^{''}(v_{jl})=2\big ({\textbf {U}}_{2}^{T}{} {\textbf {U}}_{2}\big )_{ll}. \end{aligned}$$

\(\square\)

Since the update rule proposed in this paper is essentially element-wise, it suffices to show that each of \(E_{{\textbf {U}}_{1}}(u_{ik}), E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\) is non-increasing under the update formula of Eq. (7).

Lemma 2

Function

$$\begin{aligned} T(u_{ik}, u_{ik}^{t})&=E_{{\textbf {U}}_{1}}(u_{ik}^{t})+E_{{\textbf {U}}_{1}}^{'}(u_{ik}^{t})(u_{ik}-u_{ik}^{t})+\frac{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{1}\Big )_{ik}}{u_{ik}^{t}}(u_{ik}-u_{ik}^{t})^{2},\\ T(u_{il}, u_{il}^{t})&=E_{{\textbf {U}}_{2}}(u_{il}^{t})+E_{{\textbf {U}}_{2}}^{'}(u_{il}^{t})(u_{il}-u_{il}^{t})+\frac{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{2}\Big )_{il}}{u_{il}^{t}}(u_{il}-u_{il}^{t})^{2},\\ T(v_{jl}, v_{jl}^{t})&=E_{{\textbf {V}}_{2}}(v_{jl}^{t})+E_{{\textbf {V}}_{2}}^{'}(v_{jl}^{t})(v_{jl}-v_{jl}^{t})+\frac{\Big ({\mathcal {P}}_{\Omega ^{T}}\big ({\textbf {V}}_{1}{\textbf {U}}_{1}^{T}+{\textbf {V}}_{2}{\textbf {U}}_{2}^{T}+r\big ){\textbf {U}}_{2}\Big )_{jl}}{v_{jl}^{t}}(v_{jl}-v_{jl}^{t})^{2}, \end{aligned}$$
(8)

are auxiliary functions for \(E_{{\textbf {U}}_{1}}(u_{ik}), E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\), respectively, which are the parts of \(O_{PE}\) associated only with \(u_{ik}, u_{il}\) and \(v_{jl}\).

Proof

Since \(T(u_{ik}^{t}, u_{ik}^{t})=E_{{\textbf {U}}_{1}}(u_{ik}^{t}), T(u_{il}^{t}, u_{il}^{t})=E_{{\textbf {U}}_{2}}(u_{il}^{t})\) and \(T(v_{jl}^{t}, v_{jl}^{t})=E_{{\textbf {V}}_{2}}(v_{jl}^{t})\) are obvious, we only need to show that \(T(u_{ik}, u_{ik}^{t})\ge E_{{\textbf {U}}_{1}}(u_{ik}), T(u_{il}, u_{il}^{t})\ge E_{{\textbf {U}}_{2}}(u_{il})\) and \(T(v_{jl}, v_{jl}^{t})\ge E_{{\textbf {V}}_{2}}(v_{jl})\). To this end, we compare them with the Taylor expansions of \(E_{{\textbf {U}}_{1}}(u_{ik}), E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\) at \(u_{ik}^{t}, u_{il}^{t}\) and \(v_{jl}^{t}\), respectively; these expansions are exact because each function is quadratic in the corresponding variable:

$$\begin{aligned} E_{{\textbf {U}}_{1}}(u_{ik})&=E_{{\textbf {U}}_{1}}(u_{ik}^{t})+E_{{\textbf {U}}_{1}}^{'}(u_{ik}^{t})(u_{ik}-u_{ik}^{t}) +\frac{E_{{\textbf {U}}_{1}}^{''}(u_{ik}^{t})}{2}(u_{ik}-u_{ik}^{t})^{2}\\ E_{{\textbf {U}}_{2}}(u_{il})&=E_{{\textbf {U}}_{2}}(u_{il}^{t})+E_{{\textbf {U}}_{2}}^{'}(u_{il}^{t})(u_{il}-u_{il}^{t}) +\frac{E_{{\textbf {U}}_{2}}^{''}(u_{il}^{t})}{2}(u_{il}-u_{il}^{t})^{2}\\ E_{{\textbf {V}}_{2}}(v_{jl})&=E_{{\textbf {V}}_{2}}(v_{jl}^{t})+E_{{\textbf {V}}_{2}}^{'}(v_{jl}^{t})(v_{jl}-v_{jl}^{t}) +\frac{E_{{\textbf {V}}_{2}}^{''}(v_{jl}^{t})}{2}(v_{jl}-v_{jl}^{t})^{2} \end{aligned}$$

Comparing these with the expression of \(T(u_{ik}, u_{ik}^{t})\), we find that \(T(u_{ik}, u_{ik}^{t})\ge E_{{\textbf {U}}_{1}}(u_{ik})\) is equivalent to

$$\begin{aligned} \frac{\Big ({\mathcal {P}}_{\Omega }({\textbf {U}}_{1}{} {\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{} {\textbf {V}}_{2}^{T} +r\big ){\textbf {V}}_{1}\Big )_{ik}}{u_{ik}^{t}}\ge \frac{E_{{\textbf {U}}_{1}}^{''}(u_{ik}^{t})}{2} =\big ({\textbf {V}}_{1}^{T}{} {\textbf {V}}_{1}\big )_{kk}, \end{aligned}$$

which is equivalent to \(\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{1}\Big )_{ik}\ge u_{ik}^{t}\big ({\textbf {V}}_{1}^{T}{\textbf {V}}_{1}\big )_{kk}.\) On the other hand,

$$\begin{aligned} \Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}\big ){\textbf {V}}_{1}\Big )_{ik}=\sum _{p=1}^{K_{1}}u_{ip}^{t}({\textbf {V}}_{1}^{T}{\textbf {V}}_{1})_{pk}\ge u_{ik}^{t}({\textbf {V}}_{1}^{T}{\textbf {V}}_{1})_{kk}. \end{aligned}$$

Therefore, since the remaining terms on the left-hand side are nonnegative, \(T(u_{ik}, u_{ik}^{t})\ge E_{{\textbf {U}}_{1}}(u_{ik})\); similar arguments show \(T(u_{il}, u_{il}^{t})\ge E_{{\textbf {U}}_{2}}(u_{il})\) and \(T(v_{jl}, v_{jl}^{t})\ge E_{{\textbf {V}}_{2}}(v_{jl})\). Thus, the results are obtained.

Based on the preceding analysis, the proof of Theorem 3 follows easily. \(\square\)

Proof of Theorem 3

Substituting the auxiliary functions \(T(u_{ik}, u_{ik}^{t}), T(u_{il}, u_{il}^{t})\) and \(T(v_{jl}, v_{jl}^{t})\) of Eq. (8) into the update of Eq. (7) leads to the following update formulas

$$\begin{aligned} u_{ik}^{t+1}=\arg \min _{u_{ik}}T(u_{ik},u_{ik}^{t}), u_{il}^{t+1}=\arg \min _{u_{il}}T(u_{il},u_{il}^{t}), v_{jl}^{t+1}=\arg \min _{v_{jl}}T(v_{jl}, v_{jl}^{t}). \end{aligned}$$

To find the \(u_{ik}\) at which \(T(u_{ik}, u_{ik}^{t})\) attains its minimum, set

$$\begin{aligned} \frac{\partial T(u_{ik}, u_{ik}^{t})}{\partial u_{ik}}=E_{{\textbf {U}}_{1}}^{'}(u_{ik}^{t})+2\frac{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{1}\Big )_{ik}}{u_{ik}^{t}}(u_{ik}-u_{ik}^{t})=0, \end{aligned}$$

which leads to \(u_{ik}=u_{ik}^{t}\frac{\big ({\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{1}\big )_{ik}}{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{1}\Big )_{ik}}.\) Thus, \(u_{ik}^{t+1}=u_{ik}^{t}\frac{\big ({\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{1}\big )_{ik}}{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{1}\Big )_{ik}}.\) Similarly, we can prove

$$\begin{aligned} u_{il}^{t+1}&=\arg \min _{u_{il}}T(u_{il}, u_{il}^{t})=u_{il}^{t}\frac{\big ({\mathcal {P}}_{\Omega }({\textbf {X}}){\textbf {V}}_{2}\big )_{il}}{\Big ({\mathcal {P}}_{\Omega }\big ({\textbf {U}}_{1}{\textbf {V}}_{1}^{T}+{\textbf {U}}_{2}{\textbf {V}}_{2}^{T}+r\big ){\textbf {V}}_{2}\Big )_{il}},\\ v_{jl}^{t+1}&=\arg \min _{v_{jl}}T(v_{jl}, v_{jl}^{t})=v_{jl}^{t}\frac{\Big ({\mathcal {P}}_{\Omega ^{T}}({\textbf {X}}^{T}){\textbf {U}}_{2}\Big )_{jl}}{\Big ({\mathcal {P}}_{\Omega ^{T}}\big ({\textbf {V}}_{1}{\textbf {U}}_{1}^{T}+{\textbf {V}}_{2}{\textbf {U}}_{2}^{T}+r\big ){\textbf {U}}_{2}\Big )_{jl}}. \end{aligned}$$

Since the functions in Eq. (8) are auxiliary functions for \(E_{{\textbf {U}}_{1}}(u_{ik}), E_{{\textbf {U}}_{2}}(u_{il})\) and \(E_{{\textbf {V}}_{2}}(v_{jl})\), these are non-increasing under the updates \(u_{ik}^{t+1}, u_{il}^{t+1}\) and \(v_{jl}^{t+1}\). Therefore, Eq. (5) is non-increasing under the update rules in Eq. (6). \(\square\)
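
For completeness, the three updates derived above can be combined into one sweep of the PE-NMF iteration. The NumPy sketch below follows the same conventions as the earlier sketch (binary mask `M` for \({\mathcal {P}}_{\Omega }\), scalar \(r\), small constant `eps`); whether the reconstruction is refreshed after each factor update or computed once per sweep is an implementation choice not fixed by the derivation, and all names here are illustrative.

```python
import numpy as np

def pe_nmf_step(X, M, U1, V1, U2, V2, r=0.0, eps=1e-12):
    """One sweep of the multiplicative updates in Eq. (6) (sketch).

    U1 (m, k1), V1 (n, k1): explicit part, V1 held fixed (explicit item features)
    U2 (m, k2), V2 (n, k2): implicit (latent) part, both updated
    M : (m, n) binary mask of observed ratings, playing the role of P_Omega
    """
    MX = M * X                                # P_Omega(X)

    R = M * (U1 @ V1.T + U2 @ V2.T + r)       # P_Omega(U1 V1^T + U2 V2^T + r)
    U1 *= (MX @ V1) / (R @ V1 + eps)          # update of u_{ik}, k in K_1

    R = M * (U1 @ V1.T + U2 @ V2.T + r)       # refresh with the new U1
    U2 *= (MX @ V2) / (R @ V2 + eps)          # update of u_{il}, l in K - K_1

    R = M * (U1 @ V1.T + U2 @ V2.T + r)       # refresh with the new U2
    V2 *= (MX.T @ U2) / (R.T @ U2 + eps)      # update of v_{jl}

    return U1, U2, V2
```

Iterating this sweep keeps all factors nonnegative, and under the analysis above each individual factor update does not increase the objective of Eq. (5).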

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, X., Zhou, X., Chen, L. et al. Explainable recommendations with nonnegative matrix factorization. Artif Intell Rev 56 (Suppl 3), 3927–3955 (2023). https://doi.org/10.1007/s10462-023-10619-9

  • DOI: https://doi.org/10.1007/s10462-023-10619-9
