Correlation–Comparison Analysis as a New Way of Data-Mining: Application to Neural Data

  • Original Research
SN Computer Science

Abstract

This paper presents a method of multidimensional data mining termed correlation–comparison analysis (CCA). It was applied to neural data to demonstrate its utility in a neuron-classification problem. CCA is a semi-quantitative method of inter-sample comparison. The methodology comprises the generation of inter-parametric correlation matrices and alpha-error (p value) matrices for each sample. The main step is the p-comparison, performed between the two samples for the same parametric pair. This comparison has a semi-quantitative, binary character and therefore does not involve issues such as the false discovery rate (FDR) in multiple comparisons. The possible outcomes are: (1) a correlation match; (2) a correlation mismatch of the first kind, the main type of correlation mismatch; and (3) a correlation mismatch of the second kind, the strongest one, but very rarely observed in biological systems and obtained for a very small number of parameters. The correlation mismatch of the first kind is the target mismatch, i.e., the mismatch of interest, and represents the very reason why the study is performed. The application of CCA led to an effective neuromorphofunctional classification of caudate interneurons into appropriate clusters and to their feature-based description. CCA is a multidimensional, two-sample classification tool that can be very useful for explaining the differences between similar samples.
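The matrix-generation step described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state which correlation coefficient or significance test is used, so Pearson's r and a two-sided p value from the Fisher z-transform normal approximation are assumptions made here, as are the function names and the toy values (the parameter labels AS, L and NMAX are borrowed from the abbreviation list purely as labels).

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p value via the Fisher z-transform (assumed test, needs n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite for |r| = 1
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def correlation_matrices(sample):
    """Return (C, P): correlation and p value for every parametric pair.

    `sample` maps a parameter name to its list of measurements; the
    matrices are stored as dicts keyed by the (name_a, name_b) pair.
    """
    names = sorted(sample)
    n = len(sample[names[0]])
    C, P = {}, {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(sample[a], sample[b])
            C[(a, b)] = r
            P[(a, b)] = approx_p_value(r, n)
    return C, P


# Toy values for illustration only; they are not data from the study.
toy_sample = {
    "AS": [1, 2, 3, 4, 5, 6, 7, 8],
    "L": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 14.1, 15.9],
    "NMAX": [5, 3, 6, 2, 7, 3, 6, 4],
}
C, P = correlation_matrices(toy_sample)
for pair in C:
    print(pair, round(C[pair], 3), round(P[pair], 3))
```

One such pair of matrices would be built per sample; the p-comparison then looks at the same key in the P matrices of the two samples.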

Graphical Abstract

Correlation–comparison analysis is based on the following transcorrelative relations between correlations belonging to two similar samples: (1) a particular inter-parameter correlation is matched between the two samples if it is statistically insignificant in both samples, or if it is statistically significant in both samples and has the same direction in both; (2) the correlations are mismatched if statistical significance is reached in one of the samples but not in the other (correlation mismatch of the first kind, the main type of correlation mismatch); (3) the correlations are mismatched if statistical significance is reached in both samples but the correlations have opposite directions (correlation mismatch of the second kind, the strongest one, but very rarely observed in biological systems and obtained for a very small number of parameters).
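The three relations above translate into a small decision rule. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the significance threshold alpha = 0.05 is assumed (the text does not give a value), and the numbers in the usage lines are made up for illustration.

```python
def compare_pair(r1, p1, r2, p2, alpha=0.05):
    """Classify one parametric pair compared across two samples.

    r1, p1: correlation coefficient and p value in sample 1
    r2, p2: the same quantities in sample 2
    alpha:  assumed significance threshold (not given in the source)
    """
    sig1, sig2 = p1 < alpha, p2 < alpha
    if sig1 != sig2:
        # Significant in exactly one sample: mismatch of the first kind.
        return "CM1"
    if sig1 and sig2 and (r1 > 0) != (r2 > 0):
        # Significant in both samples, opposite direction: second kind.
        return "CM2"
    # Insignificant in both, or significant in both with the same direction.
    return "match"


def compare_samples(C1, P1, C2, P2, alpha=0.05):
    """Apply compare_pair to every parametric pair present in both samples."""
    return {
        pair: compare_pair(C1[pair], P1[pair], C2[pair], P2[pair], alpha)
        for pair in C1
        if pair in C2
    }


# Illustrative values only; they are not results from the study.
print(compare_pair(0.80, 0.001, 0.75, 0.003))   # both significant, same sign
print(compare_pair(0.80, 0.001, 0.20, 0.400))   # significant in one sample only
print(compare_pair(0.80, 0.001, -0.70, 0.004))  # both significant, opposite sign
```

Because each pair is reduced to a categorical outcome rather than a pooled test statistic, the rule is a per-pair binary comparison, which is the sense in which the abstract calls the method semi-quantitative.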

Abbreviations

SICC:

Self-implicative correlation cluster

SCCA:

Single-correlation comparison analysis

PP:

Parametric pair for which the existence of a correlation is observed

NGC:

Network grid-connection plot, NGC2D or NGC3D

MEP:

Matrix elementwise product, i.e., the elementwise product of correlation-coefficient matrices (C-matrices)

ISR:

Inter-sample relation

FDR:

False discovery rate

ISD:

Inter-sample difference, difference between observed samples

ISC:

Analytical inter-sample comparison, searching for correlative differences between the samples

IPR:

Inter-parametric relation

IPC:

Inter-parametric correlation

ICR:

Mutual (inter-)correlative relation

ICN:

(Inter-)correlative network

ICD:

Mutual inter-correlative determination

ICC:

(Inter-)correlative circle

GM plot:

Grid matrix plot/GM2D

DCA:

Differential analysis of a correlative sample structure

CM1:

Correlation mismatch of the first kind

CM2:

Correlation mismatch of the second kind

CMI:

Correlative mismatch index, a general quantitative SCCA parameter

CP:

Correlation pair, PP with realized correlation

CFA:

Correlative factor analysis

CCA:

Correlation–comparison analysis

CCC:

Chain of correlative circles

CC:

Correlative chain or correlation in general between two end-point parameters in the chain

RC:

Radius of the circle with the maximal number of intersections with dendrites (NMAX circle); measures the distance of the maximal dendritic complexity from the neurosoma, thereby representing the position of the maximal dendritic arborization density

RCst:

Standardized RC parameter with respect to the AS parameter

RT:

Radius of a circle with the maximal number of terminal dendrites

NTD:

Number of the third-order dendrites, i.e., dendrites originating directly from the second order dendrites

NTHOD:

Total number of dendrites above the second order

DAX:

Neuron axialization index, a PCA parameter; quantifies how much a neuron is elongated in a 2D space, mostly on account of the dendritic tree elongation

NSD:

Number of the second order dendrites, i.e., dendrites originating directly from the first order dendrites

NR:

Total number of dendrites above the first order

NPD:

Number of the first order (primary) dendrites, i.e., dendrites originating directly from a neurosoma

NMIN:

Minimal number of intersections with dendrites made by a theoretical circle centered at the geometrical center of the neurosoma; one of the neuron complexity parameters, it can be thought of as the minimal complexity of a particular neuron

NMAXst:

Standardized NMAX parameter with respect to the NMIN parameter

NMAX:

Maximal number of intersections with dendrites made by a theoretical circle centered at the geometrical center of the neurosoma; one of the neuron complexity parameters, it can be thought of as the maximal complexity of a particular neuron

NHOD:

Number of the higher order dendrites, i.e., dendrites originating directly from the third- and higher order dendrites

NDALL:

Total number of all dendrites of a neuron

MS:

Index of neurosoma asymmetry, the reciprocal value of neurosoma circularity; represents the degree of shape non-roundness and geometrical irregularity, quantifying how much the neurosoma differs from a circle, in a geometrical sense, and from its symmetry

MDCBO:

Dendritic branching polarization index, measures how much a dendritic tree is asymmetric on account of its average branching degree

Lst:

Standardized L parameter with respect to the AS parameter

L:

Total dendritic length, represents the sum of all individual dendritic lengths

DWDTHst:

Standardized DWDTH parameter with respect to the AS parameter

DWDTH:

Average dendritic width of a neuron

DSPst:

Standardized DSP parameter with respect to the ADF parameter

DSP:

Partial dendritic surface, represents a measure of dendritic non-complex density

DSI:

Dendritic surface of influence or the dendritic dispersion index of a neuron, i.e., a multidirectional dendritic radiation degree or the dendritic radiation index in all directions, represents a measure of dendritic dispersion in a 2D space obtained as the covariance of dendritic space coordinates

DS:

Fractal dimension of a skeletonized neuron image; represents one of the neuron complexity parameters

DPOL:

Dendritic orientation degree; expressed in angle degrees, measures a dendritic orientation in a 2D space, i.e., the orientation of (along) its dominant elongation axis

DO:

Fractal dimension of a neuron outline; one of neuron shape parameters, quantifies the shape of the neuron from one aspect of its observation (outline)

DN:

Fractal dimension of a neuron; one of neuron shape parameters, quantifies the shape of the neuron from one aspect of its observation (the entire image of it)

DCLD:

Dendritic clustering degree (index); measures the tendency of neuron dendritic clustering

DCBOst:

Standardized DCBO parameter with respect to the NMIN parameter

DCBO:

Dendritic centrifugal branching order; represents the average dendritic branching order

CDF:

Complexity of a dendritic field; a complexity parameter that measures the total curvature and detailed image complexity of the dendritic arborization, the most integrative of all complexity parameters

CDF(NTHOD)st:

Standardized CDF parameter with respect to the NTHOD parameter

CDF(NTD)st:

Standardized CDF parameter with respect to the NTD parameter

CDF(NSD)st:

Standardized CDF parameter with respect to the NSD parameter

CDF(NR)st:

Standardized CDF parameter with respect to the NR parameter

CDF(NPD)st:

Standardized CDF parameter with respect to the NPD parameter

CDF(NHOD)st:

Standardized CDF parameter with respect to the NHOD parameter

CDF(NDALL)st:

Standardized CDF parameter with respect to the NDALL parameter

CDF(DCBO)st:

Standardized CDF parameter with respect to the DCBO parameter

CDF(ADF)st:

Standardized CDF parameter with respect to the ADF parameter

ADF:

Surface area of a neuron dendritic arborization field; measures the size of the minimal region occupied by the neuron dendritic tree in a 2D space after removing the neurosoma

BV:

Bursting voltage of an action potential

BF:

Bursting frequency of an action potential

AS:

Surface area of a neurosoma; measures the soma size of the neuron

APNS:

Surface area of a neuron perineuronal space; measures the size of the total minimal region between dendrites

ANF:

Surface area of a neuron field; measures the size of the minimal region occupied by the neuron in a 2D space

AN:

Surface area of a neuron, i.e., the image area occupied by the entire neuron 2D binary image; measures the size of the neuron

ADT:

Surface area of a dendritic tree; measures the dendritic arborization size of the neuron

NSIN:

Neostriate interneurons

PIN:

Putaminal interneurons

CCL:

Caudate clusters of interneurons, namely, the CCL1 and CCL2, obtained by the applied clustering–classifying method

CIN:

Caudate interneurons

PCL:

Putaminal cluster of interneurons, obtained by the applied clustering–classifying method

Acknowledgements

This study was supported by the Ministry of Education, Science and Technological Development, Republic of Serbia, project number III41031.

Author information

Corresponding author

Correspondence to Ivan Grbatinić.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

The authors declare that this manuscript has not been submitted anywhere other than SN Computer Science.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Grbatinić, I., Krstonošić, B., Srebro, D. et al. Correlation–Comparison Analysis as a New Way of Data-Mining: Application to Neural Data. SN COMPUT. SCI. 4, 636 (2023). https://doi.org/10.1007/s42979-023-02086-4

  • DOI: https://doi.org/10.1007/s42979-023-02086-4
