Low-Rank Subspace Override for Unsupervised Domain Adaptation

  • Conference paper
  • First Online:
KI 2020: Advances in Artificial Intelligence (KI 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12325)


Abstract

Current supervised learning models cannot generalize well across domain boundaries, which is a known problem in many applications, such as robotics or visual classification. Domain adaptation methods are used to improve these generalization properties. However, these techniques either are restricted to a particular task, such as visual adaptation, require large amounts of computational time and data, which are not always available, have complex parameterization, or rely on expensive optimization procedures. In this work, we present an approach that requires only a well-chosen snapshot of data to find a single domain-invariant subspace. The subspace is calculated in closed form and overrides domain structures, which makes it fast and stable to parameterize. By employing low-rank techniques, we emphasize the descriptive characteristics of the data. The presented idea is evaluated on various domain adaptation tasks, such as text and image classification, against state-of-the-art domain adaptation approaches and achieves remarkable performance across all tasks.
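
The closed-form construction behind this override can be illustrated in a few lines of NumPy. The sketch below is only our reading of the abstract combined with the notation of Appendix A; the function name subspace_override, the use of a pseudo-inverse for \((\mathbf {U}_s^l)^{-1}\), and the toy data are our assumptions, and the full NSO method additionally employs a Nyström approximation with class-wise sampling (see Appendix B).

```python
import numpy as np

def subspace_override(Xs, Xt, l):
    """Illustrative sketch: align source data to the target subspace.

    Xs, Xt : (n, d) source and target data matrices, as in Appendix A.
    l      : dimension of the shared subspace.
    """
    # Truncated SVDs X = U Sigma V^T, keeping the l leading components.
    Us, ss, _ = np.linalg.svd(Xs, full_matrices=False)
    Ut, st, _ = np.linalg.svd(Xt, full_matrices=False)
    Us_l, Ss_l = Us[:, :l], np.diag(ss[:l])
    Ut_l, St_l = Ut[:, :l], np.diag(st[:l])

    # Subspace override projector M = U_t^l (U_s^l)^+ (Appendix A).
    # Since U_s^l has orthonormal columns, its pseudo-inverse is the
    # transpose, so M U_s^l Sigma_s^l = U_t^l Sigma_s^l.
    M = Ut_l @ np.linalg.pinv(Us_l)
    Xs_l = M @ Us_l @ Ss_l   # source samples expressed in the target basis
    Xt_l = Ut_l @ St_l       # target samples in their own truncated basis
    return Xs_l, Xt_l

# Toy usage: random data standing in for a shifted target domain.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(200, 50))
Xt = rng.normal(size=(200, 50)) + 1.0
Xs_l, Xt_l = subspace_override(Xs, Xt, l=10)
```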


References

  1. Aljundi, R., Emonet, R., Muselet, D., Sebban, M.: Landmarks-based kernelized subspace alignment for unsupervised domain adaptation. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 07–12 June, pp. 56–63. IEEE, June 2015

  2. Blitzer, J., Foster, D., Kakade, S.: Domain adaptation with coupled subspaces. J. Mach. Learn. Res. 15, 173–181 (2011)

  3. Dai, W., Yang, Q., Xue, G.R., Yu, Y.: Boosting for transfer learning. In: Proceedings of the 24th International Conference on Machine Learning - ICML 2007, pp. 193–200. ACM Press, New York (2007)

  4. Donahue, J., et al.: DeCAF: a deep convolutional activation feature for generic visual recognition. In: 31st International Conference on Machine Learning, ICML 2014, vol. 2, pp. 988–996 (2014)

  5. Elhadji-Ille-Gado, N., Grall-Maes, E., Kharouf, M.: Transfer learning for large scale data using subspace alignment. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), vol. 2018-January, pp. 1006–1010. IEEE, December 2017

  6. Fernando, B., Habrard, A., Sebban, M., Tuytelaars, T.: Unsupervised visual domain adaptation using subspace alignment. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2960–2967 (2013)

  7. Ghifary, M., Balduzzi, D., Kleijn, W.B., Zhang, M.: Scatter component analysis: a unified framework for domain adaptation and domain generalization. IEEE Trans. Pattern Anal. Mach. Intell. 39(7), 1414–1430 (2017)

  8. Gong, B., Shi, Y., Sha, F., Grauman, K.: Geodesic flow kernel for unsupervised domain adaptation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2066–2073 (2012)

  9. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (2012)

  10. Kierzkowski, J., Smoktunowicz, A.: Block normal matrices and Gershgorin-type discs. Electron. J. Linear Algebra 22(October), 1059–1069 (2011)

  11. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc. (2012)

  12. Liu, P., Yang, P., Huang, K., Tan, T.: Uniform low-rank representation for unsupervised visual domain adaptation. In: 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pp. 216–220, November 2015

  13. Long, M., Cao, Y., Cao, Z., Wang, J., Jordan, M.I.: Transferable representation learning with deep adaptation networks. IEEE Trans. Pattern Anal. Mach. Intell. PP(c), 1 (2018)

  14. Long, M., Wang, J., Ding, G., Pan, S.J., Yu, P.S.: Adaptation regularization: a general framework for transfer learning. IEEE Trans. Knowl. Data Eng. 26(5), 1076–1089 (2014)

  15. Long, M., Wang, J., Ding, G., Sun, J., Yu, P.S.: Transfer feature learning with joint distribution adaptation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2200–2207 (2013)

  16. Long, M., Wang, J., Sun, J., Yu, P.S.: Domain invariant transfer kernel learning. IEEE Trans. Knowl. Data Eng. 27(6), 1519–1532 (2015)

  17. Long, M., Zhu, H., Wang, J., Jordan, M.I.: Deep transfer learning with joint adaptation networks. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML 2017, pp. 2208–2217. JMLR.org (2017)

  18. Mahadevan, S., Mishra, B., Ghosh, S.: A unified framework for domain adaptation using metric learning on manifolds. In: Berlingerio, M., Bonchi, F., Gärtner, T., Hurley, N., Ifrim, G. (eds.) ECML PKDD 2018. LNCS (LNAI), vol. 11052, pp. 843–860. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-10928-8_50

  19. Pan, S.J., Tsang, I.W., Kwok, J.T., Yang, Q.: Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 22(2), 199–210 (2011)

  20. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)

  21. Raab, C., Schleif, F.-M.: Sparse transfer classification for text documents. In: Trollmann, F., Turhan, A.-Y. (eds.) KI 2018. LNCS (LNAI), vol. 11117, pp. 169–181. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00111-7_15

  22. Raab, C., Schleif, F.M.: Low-Rank Subspace Override for Unsupervised Domain Adaptation. arXiv:1907.01343 (2019)

  23. Schleif, F., Gisbrecht, A., Tiño, P.: Supervised low rank indefinite kernel approximation using minimum enclosing balls. Neurocomputing 318, 213–226 (2018)

  24. Shao, J., Huang, F., Yang, Q., Luo, G.: Robust prototype-based learning on data streams. IEEE Trans. Knowl. Data Eng. 30(5), 978–991 (2018)

  25. Shao, M., Kit, D., Fu, Y.: Generalized transfer subspace learning through low-rank constraint. Int. J. Comput. Vis. 109(1–2), 74–93 (2014). https://doi.org/10.1007/s11263-014-0696-6

  26. Sun, B., Feng, J., Saenko, K.: Return of frustratingly easy domain adaptation. In: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, Arizona, USA, 12–17 February 2016, pp. 2058–2065 (2016)

  27. Sun, B., Saenko, K.: Deep CORAL: correlation alignment for deep domain adaptation. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 443–450. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_35

  28. Talwalkar, A., Kumar, S., Mohri, M.: Sampling methods for the Nyström method. J. Mach. Learn. Res. 13, 981–1006 (2012)

  29. Tzeng, E., Hoffman, J., Zhang, N., Saenko, K., Darrell, T.: Deep Domain Confusion: Maximizing for Domain Invariance. CoRR abs/1412.3 (2014)

  30. Wang, J., Chen, Y., Yu, H., Huang, M., Yang, Q.: Easy transfer learning by exploiting intra-domain structures. In: 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 1210–1215. IEEE, July 2019

  31. Wang, J., Feng, W., Chen, Y., Yu, H., Huang, M., Yu, P.S.: Visual domain adaptation with manifold embedded distribution alignment. In: 2018 ACM Multimedia Conference on Multimedia Conference - MM 2018, pp. 402–410. ACM Press, New York (2018)

  32. Weiss, K., Khoshgoftaar, T.M., Wang, D.D.: A survey of transfer learning. J. Big Data 3(1), 1–40 (2016). https://doi.org/10.1186/s40537-016-0043-6

  33. Williams, C., Seeger, M.W.: Using the Nystrom method to speed up Kernel machines. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) NIPS Proceedings, vol. 13, pp. 682–688. MIT Press, Cambridge (2001)

  34. Zhang, J., Li, W., Ogunbona, P.: Joint geometrical and statistical alignment for visual domain adaptation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5150–5158. IEEE, July 2017


Acknowledgment

We are grateful for support within the FuE program Informations- und Kommunikationstechnik of the StMWi, project OBerA, grant number IUK-1709-0011// IUK530/010.

Author information

Correspondence to Christoph Raab or Frank-Michael Schleif.


Appendices

Appendix A: Proof of Subspace Override Bound

Theorem 1

Given two rectangular matrices \(\mathbf {X}_t,\mathbf {X}_s \in \mathbb {R}^{n \times d}\) with \(n,d > 1\) and \(\mathrm {rank}(\mathbf {X}_t), \mathrm {rank}(\mathbf {X}_s) > 1\), the norm \(\left\Vert \mathbf {X}_s^l-\mathbf {X}_t^l\right\Vert _F^2\) in the subspace \(\mathbb {R}^l\) induced by the normalized subspace projector \(\mathbf {M}\in \mathbb {R}^{n \times l}\) with \(\mathbf {M}^T\mathbf {M} = \mathbf {I}\) is bounded by

$$\begin{aligned} E_{SO} = \left\Vert \mathbf {X}_s^l-\mathbf {X}_t^l\right\Vert _F^2 < \sum _{i=1}^{l+1} (\sigma _i(\mathbf {X}_s) - \sigma _i(\mathbf {X}_t))^2 \le \left\Vert \mathbf {X}_s-\mathbf {X}_t\right\Vert _F^2. \end{aligned}$$
(13)

Following [9], the squared Frobenius norm of the difference of two matrices satisfies

$$\begin{aligned} \sum _{i=1}^q (\sigma _i(\mathbf {X}_s) - \sigma _i(\mathbf {X}_t))^2 \le \left\Vert \mathbf {X}_s-\mathbf {X}_t\right\Vert _F^2, \end{aligned}$$
(14)

where \(q = \min (n,d)\) and \(\sigma _i(\cdot )\) is the i-th singular value of the respective matrix in descending order. However, the subspace matrices \(\mathbf {X}_s^l\) and \(\mathbf {X}_t^l\) are a special case due to the subspace override by the projector \(\mathbf {M} =\mathbf {U}_t^l{\mathbf {U}_s^l}^{-1}\), because

$$\begin{aligned} \left\Vert \mathbf {X}_s^l-\mathbf {X}_t^l\right\Vert _F^2&= \left\Vert \mathbf {M}\mathbf {U}_s^l \varvec{\varSigma }_s^l - \mathbf {U}_t^l \varvec{\varSigma }_t^l\right\Vert _F^2 = \left\Vert \mathbf {U}_t^l\varvec{\varSigma }_s^l - \mathbf {U}_t^l\varvec{\varSigma }_t^l \right\Vert _F^2 \end{aligned}$$
(15)
$$\begin{aligned}&= \left\Vert \mathbf {U}_t^l\varvec{\varSigma }_s^l\right\Vert _F^2 + \left\Vert \mathbf {U}_t^l\varvec{\varSigma }_t^l\right\Vert _F^2 - 2Tr({\varvec{\varSigma }_s^l}^T {\mathbf {U}_t^l}^T \mathbf {U}_t^l \varvec{\varSigma }_t^l) \end{aligned}$$
(16)
$$\begin{aligned}&= \left\Vert \varvec{\varSigma }_s^l\right\Vert _F^2 + \left\Vert \varvec{\varSigma }_t^l\right\Vert _F^2 - 2Tr({\varvec{\varSigma }_s^l}^T \varvec{\varSigma }_t^l) \end{aligned}$$
(17)
$$\begin{aligned}&= \sum _{i=1}^l \sigma _i^2(\mathbf {X}_s^l) +\sum _{i=1}^l \sigma _i^2(\mathbf {X}_t^l) - 2\sum _{i=1}^l (\sigma _i(\mathbf {X}_s^l) \cdot \sigma _i(\mathbf {X}_t^l))\end{aligned}$$
(18)
$$\begin{aligned}&= \sum _{i=1}^l (\sigma _i(\mathbf {X}_s^l) - \sigma _i(\mathbf {X}_t^l))^2. \end{aligned}$$
(19)

The important fact in the right part of Eqs. (16) and (17) is that we do not rely on the bound of the Frobenius inner product as in the proof of Eq. (14) [9, p. 459], because \({\mathbf {U}_t^l}^T\mathbf {U}_t^l = \mathbf {I}\). Therefore, we can directly compute the Frobenius inner product of the diagonal matrices \(\varvec{\varSigma }_s^l\) and \(\varvec{\varSigma }_t^l\), which is simply the sum of the products of the singular values. Consequently, for \(l+1\) and \((\sigma _{l+1}(\mathbf {X}_s) - \sigma _{l+1}(\mathbf {X}_t))^2\ne 0\), it follows that

$$\begin{aligned} \left\Vert \mathbf {X}_s^l-\mathbf {X}_t^l\right\Vert _F^2< \sum _{i=1}^{l+1} (\sigma _i(\mathbf {X}_s) - \sigma _i(\mathbf {X}_t))^2 < \sum _{i=1}^q (\sigma _i(\mathbf {X}_s) - \sigma _i(\mathbf {X}_t))^2 \le \left\Vert \mathbf {X}_s-\mathbf {X}_t\right\Vert _F^2, \end{aligned}$$
(20)

where again \(q = \min (n,d)\) and \(1<l<q\).
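
The inequality chain in Eq. (20) can be checked numerically. The following snippet is our own sanity check rather than material from the paper: it builds \(\mathbf {X}_s^l\) and \(\mathbf {X}_t^l\) as in Eq. (15) and compares \(E_{SO}\) with the two upper bounds; the strict inequalities hold whenever the corresponding singular values differ, which is the generic case for random data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, l = 100, 40, 10
Xs = rng.normal(size=(n, d))
Xt = rng.normal(size=(n, d)) + 0.5

# Singular values are returned in descending order, as assumed in Eq. (14).
Us, ss, _ = np.linalg.svd(Xs, full_matrices=False)
Ut, st, _ = np.linalg.svd(Xt, full_matrices=False)

# Subspace override: X_s^l = M U_s^l Sigma_s^l = U_t^l Sigma_s^l (Eq. (15)).
Xs_l = Ut[:, :l] @ np.diag(ss[:l])
Xt_l = Ut[:, :l] @ np.diag(st[:l])

E_so     = np.sum((Xs_l - Xt_l) ** 2)              # ||X_s^l - X_t^l||_F^2
bound_l1 = np.sum((ss[:l + 1] - st[:l + 1]) ** 2)  # sum over i = 1..l+1
bound_q  = np.sum((ss - st) ** 2)                  # sum over i = 1..q, q = min(n, d)
full     = np.sum((Xs - Xt) ** 2)                  # ||X_s - X_t||_F^2

# Eq. (20): E_SO < bound_l1 < bound_q <= full.
assert E_so < bound_l1 < bound_q <= full
print(E_so, bound_l1, bound_q, full)
```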

Appendix B: Component Analysis

We inspect the performance contribution of the different parts of the NSO approach. First, the exact solution of the optimization problem is referred to as Subspace Override (SO). The approximation with uniform sampling is evaluated to study the impact of class-wise sampling on the performance. To show the efficiency of the subspace projection in the original space, we also include a kernelized version in which we approximate the RBF kernels of \(\mathbf {X}_s\) and \(\mathbf {X}_t\), respectively. The results are given in Table 7 and show that the Nyström approximation yields the best performance regardless of the sampling strategy. This is due to the approximation of the subspace projection, in which small values are likely to become zero, further reducing noise. The kernelized version is not recommended because of its poor performance. Overall, the class-wise NSO as proposed is recommended, because it performs slightly better.

Table 7. Component evaluation of NSO in mean accuracy.
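
For context, the Nyström approximation referenced above reconstructs a large Gram matrix from a small set of landmark columns [28, 33]. The sketch below is a generic illustration of the uniform versus class-wise landmark sampling discussed in this appendix; the function names, the linear kernel, and the way the landmarks are drawn are our assumptions, not the paper's actual NSO implementation.

```python
import numpy as np

def linear_gram(X, Y):
    """Linear-kernel Gram matrix between the rows of X and Y."""
    return X @ Y.T

def nystroem(X, landmark_idx, kernel=linear_gram):
    """Rank-m Nystrom approximation K ~ C W^+ C^T with m landmarks [33]."""
    C = kernel(X, X[landmark_idx])                 # (n, m) landmark columns
    W = kernel(X[landmark_idx], X[landmark_idx])   # (m, m) landmark block
    return C @ np.linalg.pinv(W) @ C.T

def uniform_landmarks(n, m, rng):
    """Draw m landmark indices uniformly at random."""
    return rng.choice(n, size=m, replace=False)

def classwise_landmarks(y, m, rng):
    """Draw roughly m / n_classes landmarks per class (class-wise sampling)."""
    classes = np.unique(y)
    per_class = max(1, m // len(classes))
    idx = [rng.choice(np.where(y == c)[0], size=per_class, replace=False)
           for c in classes]
    return np.concatenate(idx)

# Toy comparison of the two sampling strategies on random labeled data.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 3, size=300)
K = linear_gram(X, X)

K_uni = nystroem(X, uniform_landmarks(len(X), 30, rng))
K_cls = nystroem(X, classwise_landmarks(y, 30, rng))
print(np.linalg.norm(K - K_uni), np.linalg.norm(K - K_cls))
```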

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Raab, C., Schleif, FM. (2020). Low-Rank Subspace Override for Unsupervised Domain Adaptation. In: Schmid, U., Klügl, F., Wolter, D. (eds) KI 2020: Advances in Artificial Intelligence. KI 2020. Lecture Notes in Computer Science(), vol 12325. Springer, Cham. https://doi.org/10.1007/978-3-030-58285-2_10

  • DOI: https://doi.org/10.1007/978-3-030-58285-2_10

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58284-5

  • Online ISBN: 978-3-030-58285-2

  • eBook Packages: Computer Science, Computer Science (R0)
