Kimesurface Representation and Tensor Linear Modeling of Longitudinal Data - PMC Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2023 Apr 1.
Published in final edited form as: Neural Comput Appl. 2022 Jan 16;34(8):6377–6396. doi: 10.1007/s00521-021-06789-8

Kimesurface Representation and Tensor Linear Modeling of Longitudinal Data

Rongqian Zhang 1,4, Yupeng Zhang 2,4, Yuyao Liu 1,4, Yunjie Guo 3,4, Yueyang Shen 3,4, Daxuan Deng 3,4, Yongkai Joshua Qiu 2, Ivo D Dinov 4,5,*
PMCID: PMC9355340  NIHMSID: NIHMS1772272  PMID: 35936508

Abstract

Many modern techniques for analyzing time-varying longitudinal data rely on parametric models to interrogate the time-courses of univariate or multivariate processes. Typical analytic objectives include utilizing retrospective observations to model current trends, predict prospective trajectories, derive categorical traits, or characterize various relations. Among the many mathematical, statistical, and computational strategies for analyzing longitudinal data, tensor-based linear modeling offers a unique algebraic approach that encodes different characterizations of the observed measurements in terms of state indices.

This paper introduces a new method of representing, modeling, and analyzing repeated-measurement longitudinal data using a generalization of event order from the positive reals to the complex plane. Using complex time (kime), we transform classical time-varying signals as 2D manifolds called kimesurfaces. This kime characterization extends the classical protocols for analyzing time-series data and offers unique opportunities to design novel inference, prediction, classification, and regression techniques based on the corresponding kimesurface manifolds. We define complex time and illustrate alternative time-series to kimesurface transformations. Using the Laplace transform and its inverse, we demonstrate the bijective mapping between time-series and kimesurfaces. A proposed general tensor regression based linear model is validated using functional Magnetic Resonance Imaging (fMRI) data. This kimesurface representation method can be used with a wide range of machine learning algorithms, artificial intelligence tools, analytical approaches, and inferential techniques to interrogate multivariate, complex-domain, and complex-range longitudinal processes.

1. Introduction

Contemporary scientific studies rely on large volumes of multi-source, heterogeneous, and incomplete data anchored at specific spatiotemporal locations. Many state-of-the-art methods for handling time-varying longitudinal data involve parametric models that enable investigation of retrospective, current, and prospective process characterizations. Participant outcomes, experimental conditions (e.g., treatments or exposures), cross-sectional characteristics, and other information is often tracked at multiple points in time. Common strategies for analyzing repeated longitudinal measurements include mixed linear modeling, time-series analysis, growth and decay modeling, forecasting or prospective outcomes, prognostication, causality, survival analysis, and time-to-event inference [15].

To account for low signal-to-noise ratio, many study-designs employ repeated data acquisition at controlled experimental conditions. For each unit in the experiment, this leads to assembling multiple samples representing scalar, vector, matrix, or tensor data. In the simplest case, univariate data is collected for a longitudinal process and the goal is to predict the prospective behavior of the process using the observed retrospective (training) data. In more pragmatic settings, the data is represented as a (time) dynamic multi-dimensional array generalizing the classical samples × variables matrix format. In this paper, we build on the notion of complex time (kime) [6] to generalize the classical temporal time-series representation of repeated longitudinal data. Specifically, we model the observed repeated-measurement longitudinal data as 2D kime surfaces. The structure (e.g., geometry and topology) of these higher dimensional manifolds carries more information compared to the original time-series corresponding to cross-sectional kimesurface foliation curves. Model-based methods, such as tensor-based linear modeling [7], and model-free artificial intelligence (AI) approaches, such as unsupervised clustering and classification [3, 8, 9], can then be designed and applied to the kimesurface representations of the time-varying processes.

The main drivers of examining longitudinal data include (1) direct identification of intra-individual change (and stability) and/or of inter-individual differences (similarity); and (2) analysis of interrelationships in behavioral change, causes (determinants) of intra-individual change, and/or of determinants of inter-individual differences [10, 11]. Time-series methods for modeling continuous responses include analysis of response profiles, linear mixed-effects models, hierarchical linear model, and support vector regression [12, 13]. Response profile analysis is typically employed with a relatively small number of repeated measurements, obtained on a common set of occasions [12]. Linear mixed-effects models are generally applicable for data with expected interdependencies, or presence of explicit or implicit hierarchical structure [14]. Mathematical modeling and statistical inference on longitudinal data involves preprocessing to ensure stationary, proper representation of seasonal and non-seasonal effects, accounting for known interventions and outliers, regression estimation and classification analysis. The classical goals of time-series analysis include model fitting, parameter estimation, diagnostic checks, forecasting, and inference [15]. Many model-based approaches rely on autoregressive integrated moving averaging (ARIMA); however there are different classes of techniques for joint modeling of multivariate longitudinal data, e.g., models of evolution of the measured outcomes, marginal models, and models for latent evolution of latent variables [16]. Recent mathematical advances propose using geometric and topological techniques to represent, track, model, and analyze time-series and longitudinal processes [1719]. Other machine learning and artificial intelligence techniques for time-series analysis include recurrent neural networks, long short-term memory networks, and deep Boltzmann machines [3, 20].

Functional Magnetic Resonance Imaging (fMRI) represents an example of time-varying data that requires careful assessment of underlying model assumptions, method limitations, and data processing challenges associated with longitudinal designs, e.g., task design, sampling strategies, and within subject and cohort-level analyses [13, 21, 22]. Changes in the fMRI time-course corresponds to neuronal activity tracked as variations of the ratio of oxygenated to deoxygenated blood levels, subject to a mediating process known as the Hemodynamic Response Function (HRF). Under various stimuli, e.g., audio, visual, and tactile, a change in the observed fMRI signal is detected in the relevant brain region, subject to HRF mediation with certain response delay, e.g., 2–3 seconds. Magnetic resonance scanners are extremely sensitive to electromagnetic fields and can be used to track brain anatomical structure, tissue spectral composition, and functional network activity. Brain functional activation is a proxy of the level of oxygenated blood hemoglobin molecule, which contains four iron atoms whose magnetization response depends on their oxygen binding; hence oxygenated vs. deoxygenated blood have distinct diamagnetic and paramagnetic properties. As the concentration of oxygenated blood in a brain region reflects changes (increases or decreases) of neuronal activation in response to a specific stimulus, the regional fMRI signal is a proxy indicator highly correlated with the associated brain area processing the specific stimulation task [23].

Tensor-based linear modeling represents longitudinal data using higher-dimensional arrays (tensors) that encode different characterizations of the observed measurements in terms of state indices, e.g.,

Conditions× Repetitionexperimental design×SpaceX ×SpaceY×SpaceZ×Timespatiotemporal characterization×other .

For each given tensor element, the observed data may be a (coupled) scalar, vector, matrix, or another tensor. In general, the typical goals of tensor-based linear modeling (TLM) involve obtaining computed class-labels, deriving prospective trajectory forecasts, estimating Bayesian posterior predictive probabilities, and predicting of an outcome tensor Y of dimension Q1 × ⋯ × QM from another observed tensor X of dimension N × P1 × ⋯ × PL, where N is the number of observations [7]. One example of TLM inference is the prediction of several facial attributes from a set of recorded images [24]. In this application, the outcome tensor Y:Faces×Attributes is predicted using data in the observed feature tensor X:Faces×X×Y×ColorChannels [25]. Another application is the prediction of EEG from fMRI data [26]. Typically, EEG has low spatial but high-temporal resolution along the brain cortical surface, whereas fMRI provides better spatial but poorer temporal resolution over the entire 3D spatial domain. TLM allows predicting the association between these two non-invasive functional neuroimaging modalities. Such TLM can be used to enhance derived neuroimaging biomarkers, impute missing or incomplete spatiotemporal information, and derive better prediction for normal and pathological conditions [27, 28].

Appendix A provides some additional background on the tensor definition, representation, and operations. Tensors naturally provide a mechanism for extending unconstrained or regularized linear models for prediction and inference based on high-dimensional and longitudinal data. Consider an observed (training data) outcome Y:N×Q1××QM and a corresponding observed covariate tensor X:N×P1××PL. Then we can perform (tensor) linear-model fitting:

Y=X,BL+E.

where the two new tensors are B:P1××PL×Q1××QM, representing the TLM coefficient (effect-size) tensor, and E:N×Q1××QM, denoting the residual error tensor. Note that the first L modes of B contract the dimensions of the covariate tensor, X, which are not in the outcome tensor Y, and the last M modes of B expand along the outcome Y tensor modes in Y not present in the covariate tensor X. Assuming we have sufficient data to fit this model and estimate the effect tensor B and the residual error tensor E, then we can perform prediction and forecasting of the outcome tensor using classical linear modeling strategies. For a given outcome tensor index-set (q1, ⋯, qM) the TLM prediction of the outcome tensor element for each case 1 ≤ nN is:

Y[n,q1,,qM]p1=1P1pL=1PLX[n,p1,,pL]B[p1,,pL;q1,,qM].

To simplify the notation, the TLM model excludes the static tensor-intercept term and assumes that the observed data tensors Y and X are centralized.

This paper is focused on representing, modeling, and analyzing repeated-measurement longitudinal data as complex time (kime) surfaces. This characterization of time-varying information facilitates the analysis of classical time-series as higher-dimensional kimesurface manifolds. The manuscript is organized as follows. In Section 2, we define complex time and present alternative time-series to kimesurface transformations. The following Section 3 describes the general tensor regression based linear modeling approach. Section 4 includes applications of spacekime TLM analytics to functional Magnetic Resonance Imaging (fMRI). Finally, in Section 5, we summarize the spacekime analytics approach and interpret the reported findings. The supplementary materials section (Appendix) contains specific implementation details and additional examples.

2. Complex-time (kime) and Spacekime Representation

Complex time (kime) extends the familiar notion of event order (time) to the complex plane. The kime representation of longitudinal data involves indexing measurements by κ = te, where t and θ represent the longitudinal event order (time) and a directional kime-phase (repetition index), respectively. The kime-phase θ is indirectly related to the classical repeated random (independent and identically distributed) sampling during the process of recurrent measurement. The rationale for this extension of time from the positive reals () to the complex plane () is three-fold. First, in a classical mathematical sense, the longitudinal axis of spacetime represents only a positive cone over the field of the real numbers [29]. The time dimension forms a subgroup of the multiplicative group of the reals, however it’s not closed under addition. On the other hand, complex-time describes an algebraic prime field [30] that naturally extends the reals and hence, the classical temporal domain. Although time is ordered, whereas kime is not, the intrinsic time ordering is preserved in the kime magnitude. Second, the kime dimensionality lift addresses some of the physical problems of time [31]. Third, repeated-measurement longitudinal data are represented in terms of kime as 2D manifolds. This enables a higher-dimensional view of longitudinal processes and offers unique opportunities to design and validate innovative data-driven artificial intelligence (AI) and statistical approaches for modeling, analysis, inference, and prediction on temporal signals [6].

Although there are multiple strategies to transform longitudinal processes to kimesurfaces, we will demonstrate this transformation using two specific kimesurface representation methods [6]. Both methods can translate observed time-varying signals, such as spacetime fMRI data, into spacekime kimesurfaces. The first empirical strategy is based on discretely estimating the “missing/unobserved” kime-phases and directly mapping the observed longitudinal-indexed time-series into kime-indexed surfaces. The second analytical strategy for mapping between time-domain data and kime-domain data utilizes the Laplace Transformation [32].

For more specificity, we will explicate both time-series to kimesurface transformation strategies in terms of a finger-tapping fMRI experiment. In the empirical strategy, the longitudinal (time) indexing, x(t), includes one epoch of the fMRI design. The second component, kime-phase indexing, x(φ), runs over the number of epochs of one specific stimulus (e.g., finger-tapping or rest). These phase directions are generally unknown and may be randomly drawn (or estimated) from the appropriate kime-phase distribution for the specific stimulus condition. Figure 1 shows a direct mapping between fMRI time-series and their corresponding kimesurface counterparts. The entire fMRI time course is spliced and arranged according to the epoch curves into an approximate piecewise parametric surface indexed over kime with complex-time intensities derived from the corresponding complex-valued fMRI signal. The complex-value of the function f(t, φ) = Ae is in the range-space and has magnitude A = A(t, φ) and (range) phase θ = θ(t, φ) that correspond to the polar coordinate representation of complex time (kime), i.e., (t, φ).

Figure 1.

Figure 1.

Schematic diagram of an empirical mapping between an fMRI time-series and its corresponding kimesurface.

Figure 2 demonstrates two views of the same 3D scenes depicting an empirical kimesurface reconstruction of the on (stimulus) vs. off (rest) fMRI signal. The graphs show the kime domain areas where the on vs. off differences are small (surface valleys) or large (extreme high or low surface peaks). The former indicate kime ranges where the voxel location corresponding to the kimesurface does not exhibit significant brain activation. Whereas the latter, surface peak regions, correspond to kime areas of potentially statistically significant on vs. off brain activation associated with the cognitive stimuli. Note that the practical and statistical significance of these peaks and valleys can be evaluated using Gaussian random field theory of the excursion sets above certain threshold values [33, 34].

Figure 2.

Figure 2.

A pair of 3D surface renditions of the ON-OFF fMRI kimesurface differences at a fixed spatial location. Blue and yellow colors indicate positive and negative kimesurface differences, respectively.

To map longitudinal time-series into kimesurfaces using an alternative analytical strategy, we can employ the Laplace Transform (LT). LT converts complex-valued functions of positive real variables (e.g., time) to complex-valued functions defined on complex variables (e.g., kime). In general, for a function f(t):+, the continuous LT is defined by:

F(z)L(f)(z)=0f(t)eztdt:.

In our situation, complex-valued fMRI time-series signal f(t):+ can be represented as an ordered (discrete) sequence of complex-valued fMRI BOLD intensities:

{f(t)=at+ibt:1tT}.

The discrete LT of f is represented as an infinite series corresponding to the discretized LT integral above. For a specified step-size η > 0, the discrete Laplace transform L(f) is computed by:

Lη(f)(z)=ηk=0ezkηf(kη).

To demonstrate the Laplace Transform-based analytical strategy, we apply the Inverse Laplace Transform (ILT) on an fMRI kimesurface and then convert it back to the original fMRI time-series, see Figure 3. Given a complex function F(z), which represents a kimesurface, and a positive real argument t, the ILT algorithm, developed by Valsa and Brancik [35], involves iterative continuous approximations to generate an approximation of the output f(t)=L1(F)(t). This iterative approximation approach estimates the exponential term ezt using a series where each term is a Bromwich integral obtained through residue calculus [36]:

f(t)=L1(F)(t)=12πilimTγiTγ+iTeztF(z)dz=fc(t,a)+O(e2a)=ea2iγiγ+i(F(z)n=0[(1)n(n+12)(n+12)2π2+(azt)2])ds+O(e2a)=eatn=0(1)n Im{F[at+j(n+12)πt]}+O(e2a),

where a > 0 is a parameter that can be chosen arbitrarily large to control the error rate, and the subscript c stands for cosh(·), which is involved in the intermediate steps. In practice, the Euler transformation [37] may be used to accelerate the convergence of the alternating series above.

Figure 3.

Figure 3.

The duality between a real fMRI time-series (red curve on the right panel), its corresponding kimesurface (left panel), and the reconstructed fMRI time-source using the ILT on the kimesurface (blue scatterplot on the right panel).

3. Methods

3.1. Datasets

We tested the process of mapping observed time-series data to kimesurfaces, reconstruction of the time-series process, the subsequent analytical modeling, and statistical inference using real Functional Magnetic Resonance imaging (fMRI) data.

Specifically, we describe the results of a neuroimaging study using an “on-off” fMRI finger-tapping experiment, which models the brain response via an oscillatory step-wise characteristic function with a delayed correlation to the on-off stimulus paradigm. The observed data represents complex-valued fMRI time-series of the blood oxygenation dependent (BOLD) response to a dichotomous finger-tapping “on vs. off” (activation vs. rest) event-related experiment [38, 39]. The raw signal is acquired in the Fourier k-space over a period of 8-minutes tracking the functional BOLD signal of a normal volunteer during the finger-tapping “on-off” task. Using the inverse Fourier transform, the data was mapped into spacetime and stored as a 4D double-precision array of complex-valued intensities. The size of the data is about ½ GB (gigabytes) and the dimensions of the data are 64x×64y×40z3D spatial voxel, v×160ttime . Each epoch of the longitudinal (time) finger-tapping task represents a basic pattern of 10 ON (activation) time-points followed by 10 OFF (rest) time-points. The ON and OFF epochs are intertwined and repeated 8 times for a total of 160 time-points, each of about 3 seconds. We applied an established fMRI data preprocessing protocol, e.g., using the R package fmri, to motion-correct, skull-strip, register, and normalize the raw fMRI signal [40, 41]. The target of the registration was the LONI Probabilistic brain atlas [42], which allows anatomically parcellating the functional brain data into 28 left and 28 right hemisphere cortical regions. This last step was important for the subsequent three-phase statistical inference aiming to identify the spatial locations in the brain, {ν = (x, y, z)}, associated with the finger-tapping motor task.

3.2. Tensor-based linear modeling

In order to detect brain activated areas associated with the finger-tapping somatosensory motor task, we apply tensor-based linear modeling to the spacekime represented fMRI data. Our analytical protocol includes three phases. In phase 1, classical statistical methods are used to raw fMRI data in order to identify any Regions of Interest (ROIs) that may be activated by the external stimulus. In phase 2, we apply tensor regression to the corresponding kimesurface representations of the fMRI data. And finally, to reduce the false-positive rate, in phase 3, we apply post-hoc correction to the p-values computed in phase 2.

In the preprocessing step, the real- or complex-valued fMRI data can be co-registered or aligned with a specific brain atlas using linear or nonlinear transformation techniques [4244]. Then, we continued partitioning the brain anatomy into Regions of Interest (ROIs) by using the probabilistic atlas, which assigns likelihoods to be inside any 56 ROIs [42]. In this way, each voxel is effectively tagged with a label representing its most likely ROI anatomical localization. In this case, we used the LONI probabilistic brain atlas to tessellate the entire brain anatomical space into 56 complementary ROIs.

The aim of phase 1 analysis is to select important ROI candidates that can subsequently be interrogated voxel-by-voxel to further localize the statistical significance maps reflecting the activated brain areas associated with the finger-tapping task. We used a measure called temporal Contrast-to-noise Ratio (tCNR) [45] to represent the average signal-change or task-related variability relative to the non-task-related variability over time. In fMRI studies, two strategies are commonly used the estimate the tCNR [46]. The first one is based ion modeling the known stimulus design and utilizing the hemodynamic response function. The second strategy involves low-pass filtering and assumes only that most high-frequency fMRI components contain temporal noise. For a fixed ROI, the most general form of the tCNR is represented by:

tCNR=ΔSσnoiset,

where the numerator ΔS is the average signal-change reflecting task-related variability and the denominator σnoiset represents the non-task-related variability over time. Strong tCNR evidence of a task-specific effect within an ROI is indicated by tCNR ≫ 1, whereas tCNR ≈ 1 suggests the regional may not be involved in stimulus processing. We compute an estimate of the true fMRI signal at a given voxel location ν by using polynomial regression, low-pass filtering, or a convolution with a smoothing kernel. This smoother fMRI signal is used to compute the numerator, ΔSν, and the variability of the time-series noise, σv,noiset, is estimated as the difference between the observed and smoothed fMRI signals. The overall ROI tCNR is computed by pooling the local estimates, tCNRv=ΔSvσv,noiset, over all voxels ν ∈ ROI.

Parametric or non-parametric statistical tests, such as t-test and Wilcoxon test [47], can be used to quantify the ROI-wise stimulus related to brain activation. In our case, we used a t-test to assess the statistical significance in each of the 56 ROIs independently. The test-statistics can be computed using the observed tCNR vector (tCNR1, tCNR2, ⋯, tCNRN)T, where N is the number of voxels in the ROI. To control the type-I (false-positive) error rate, we used Bonferroni correction for multiple testing to adjust the resulting ROI p-values (α=0.000892).

In phase 2, the tensor linear regression models were estimated separately on each of the 56 ROIs to obtain efficient model parameter estimates. For each (irregularly shaped) ROI, we first encapsulated it with the smallest bounding box. This facilitates the tensor arithmetic by padding the fMRI intensities with zeros outside of the ROI boundaries. We can denote the general dimensions of the smallest bounding box for each ROI by a × b × c. Suppose Y:t×a×b×c is the response tensor, i.e., fMRI intensity and X:t×p×1 is the predictor tensor. Thus, the proposed tensor-based linear model is given by:

Y=X,BLtensorproduct+E,

where B:p×1×a×b×c is a tensor linear model coefficient and E:t×a×b×c is the residual error tensor.

The design tensor X contains the information corresponding to the on-off stimulus, i.e., finger-tapping task indicator {+1(On), −1(Off)}, adjusted by the Hemodynamic Response Function (HRF) which models the expected blood oxygen level dependent (BOLD) response. In addition, X includes a pair of polynomial drift terms; one of order one (linear) and the other of order two (quadratic) to adjust for low-frequency noise.

We fitted this model by using the R package MultiwayRegression [25], which estimates coefficients using penalized least-squares. Then, parametric or non-parametric statistical tests can be applied to the estimated BOLD coefficients β^ to enable statistical inference. In this case, we used t-tests on the estimated coefficients β^ in each ROI. The smaller the corresponding p-values are, the more significant the neuro-activation is in the respective brain area.

Phase 3 of the protocol involves post-hoc analysis, which is necessary for filtering potential false positive discoveries and only exposing true-positive brain activation regions. In this final phase of the analysis protocol, we used a two-fold strategy. First, we applied a false discovery rate (FDR) correction to account for multiple testing [48], and second, we used a spatial clustering filter to control for the size of the activation region [49]. The pair of post-hoc correction strategies used in this phase are designed to temper erratic noise (FDR), as well as, to annihilate smaller-in-size regions of activated voxels (clustering), which are more likely to represent sporadic random activations.

4. Applications

To compare classical space-time analytics vs. spacekime inference, we applied this three-phase tensor-based analytical protocol to the original spacetime fMRI finger-tapping task, as well as to its corresponding spacekime representation. Then we contrasted the statistical significance maps, i.e., computed and compared the p-values corresponding to the on-off stimulus. The results were visualized in terms of the ROI cluster size and the p-values corresponding to brain activation blocks within the specific ROIs, see Figure 4.

Figure 4.

Figure 4.

Results of the three-phase statistical analysis protocol on native fMRI time-series data (left) and the corresponding fMRI kimesurface representations (right). The 3D renderings illustrate different point-of-view orientations. Note: The Phase 1 results for the classical (left) and proposed space-kime analytics are identical, as the ROI-analytics are done on the raw space-time data.

The results on Figure 4 show some important advantages of our three-phase tensor-based analytical protocol. First, unlike traditional voxel-by-voxel statistical inference on fMRI data, our method takes spatial correlation into consideration by using tensor regression, which is more in line with the fact that the signal at a particular voxel is likely to be similar to the signals of nearby neighboring voxels. Second, this method substantially reduces the dimensionality of fMRI data, which leads to efficient estimation and prediction. In addition, the comparison of the results obtained using the time-series and kimesurface analyses clearly indicates that spacekime represented fMRI data can retain more information than the raw fMRI time-courses. This suggests that kimesurface representation and spacekime analytics may provide more reliable activation brain maps. The activated brain areas revealed by the tensor-based statistical analysis of the kimesurfaces associated with the finger-tapping task paradigm are consistent with the evidence of observing such activation in the somatosensory cortex (motor area), which validates the performance of the spacekime analytical strategy for modeling, inference and prediction based on fMRI signals. Appendix B provides some additional renditions of 3D scenes showing the results of each phase of the 3-tier spacekime analytical protocol.

5. Conclusions and Discussion

Introducing complex-time provides a foundation to address many interesting scientific challenges, including proposing novel techniques for interrogating time-varying longitudinal data, addressing some of the problems of time, and extending and generalizing mathematical equations describing natural laws of physics. The kime representation of longitudinal observations lifts the familiar notion of event order (time) to the complex plane. The induced spacekime representation expands the longitudinal dimension of time and generalizes the classical 4D universal spacetime to a 5D spacekime manifold whose topological structure facilitates novel data analytics. Spacekime representation of data can be used to design, test, and validate novel statistical models for risk estimation, probabilistic modeling, trajectory forecasting, parametric and non-parametric inference, as well as supervised and unsupervised artificial intelligence.

In this paper, we proposed discrete and continuous kimesurface representation strategies for transforming longitudinal data into kimesurfaces. Applying general tensor regression to the resulting kimesurfaces allows the comparison and contrasting of classical time-series analysis with tensor regression on kimesurfaces. Using fMRI data, we presented a three-phase tensor-based analytical protocol for the spacetime and spacekime represented data. We showed that this protocol substantially reduces the dimensionality of fMRI data, which leads to efficient estimation and prediction. We also showed that spacekime fMRI data representation can retain more information than spacetime represented fMRI data, thus revealing more activated brain areas. Therefore, one advantage of kimesurface is its ability to potentially expose supplemental information that may enhance traditional spacetime observation-based scientific inference, improve data-driven predictions, and refine evidence-based decision-making processes.

The main contribution of this article revolves around the introduction of a tensor-based linear model applicable for representation, analysis, and inference of repeated-measure longitudinal data following a transformation from the linear (positive real) time domain to the (complex) kime plane.

Two alternative strategies for mapping time-series data to kimesurface manifolds are presented and tested on neuroimaging fMRI data. This spacekime analytical technique offers a new mechanism to represent time-varying and high-dimensional datasets and provides a foundation for developing innovative ML/AI methods that capitalize on the intrinsic topological structure of the spacekime data manifolds. In addition to applications in biomedical sciences [50], econometrics [51], and environmental sciences [52], the spacekime representation and tensor analytics may be useful in a number of biophysics studies [53], thermoelastic diffusion solutions [54], and boundary value problems [55].

As firm supporters of open-science, we have documented, packaged and released the source-code in a time complexity and inferential uncertainty (TCIU) R-package. The source code, the TCIU package, vignettes and documentation are available on GitHub (https://github.com/SOCR/TCIU) and CRAN (https://cran.r-project.org/web/packages/TCIU/). More information, interactive demonstrations, and additional materials are available on the spacekime website (https://spacekime.org). Community input is always welcome.

Supplementary Material

1772272_fig_a
1772272_fig_b
1772272_fig_c
1772272_fig_d
1772272_fig_e
1772272_fig_f
1772272_fig_g
1772272_fig_h

6. Acknowledgments

This research is supported in part by funding provided by the National Science Foundation, the National Institutes of Health, and the Michigan Institute for Data Science. National Institute of Health (https://www.nih.gov) grants: UL1 TR002240, R01 CA233487, R01 MH121079, T32 GM141746; National Science Foundation (https://www.nsf.gov) grants: 1916425, 1734853, 1636840, 1416953, 0716055, 1023115. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Appendix

Appendix A. Introduction to Tensors

A tensor is an important mathematical object defined as a multilinear map from a vector space (module) over a fixed field (or ring) into the field (or ring). In the special case where the field is or and the vector space is finite dimensional, these multilinear maps coincide with the common computational and data science tensor definition as multidimensional arrays. The space of multidimensional arrays is then naturally isomorphic to the space of multilinear maps. In physical sciences, a tensor maps a set of K vector-space elements and M dual-space elements into the base field, F.

Tensor as multilinear mapabstract tensor product:V1××VKK elementsF,Tensor as multiway arraydata science:FI1×I2××IKTensorphysics:V1*××VK*M elementsin dual space×V1××VKK elements invector spaceF,

where F can be either or , Vi is a vector space over F, and Vi* is its dual, ∀ 1 ≤ iK. Throughout this paper, we refer to the data science definition of tensors as multiway arrays representing computable data objects tracking observed collections of multivariate features organized into K-way arrays. Common TLM notation uses lowercase bold symbols (a) for vectors, bold uppercase (A) for matrices, and uppercase blackboard symbols (B) as tensors of any dimension. Note that the base fields are also denoted by blackboard symbols (F). A Kth order tensor is a multidimensional array B:I1××IK with Ik representing the dimension of the kth mode of the tensor, ∀1 ≤ kK. The tensor elements are specified as B[i1,,iK], where 1 ≤ ikIK, ∀1 ≤ kK. Computable data tensor objects can be represented as outer products, °, of a set of vectors {bk}k=1K:

B=b1b2bK,

with its elements indexed as:

B[i1,,iK]=k=1Kbk[ik].

The outer product operation ° takes a pair of a K1th-order and a K2th-order tensors as inputs, and returns a (K1 + K2)th-order tensor. In general, transformation tensors can be considered as multilinear mappings between vector spaces. If F denotes the real () or complex () field including the tensor elements, then a data tensor is an object that can be represented as

FI1×I2××IKFI1FI2FIKvector space tensor productLmultilinear map(FI1×FI2××FIK,F).

The order of a given tensor, K, represents the number of tensor modes. For each tensor, there exists an integer R (min R is the tensor rank) that allows an expansion of the tensor as a linear combination of R rank-1 tensors:

vec(B)=r=1Rλr(bK,rbK1,rbk,rKronecker tensor productb1,r),
B=r=1Rλrb1,rb2,rbk,rbK,r,

where λrF, bk,rFIk, and 1 ≤ kK. The minimal min R that permits such representation is called the rank of the tensor and the associated tensor decomposition is called minimal CANDECOMP/PARAFAC (CP) or Polyadic tensor decomposition [7, 56].

The tensor vectorization operation vec(B) restructures (or vectorizes) the tensor as a 1D vector containing all tensor array entries. The length of the corresponding vec(B) is k=1KIk and the indexing transformation is defined by:

vec(B)[i1+k=2K{(l=1k1Il)(ik1)}]=B[ii,i2,,iK].

Tensors can also be unwounded as matrices by unfolding them along a specified tensor mode k. The mode-k matrix representation of the tensor B is B(k):Ik×lkIl and contains the vectorized representation of each sub-tensor in the kth mode.

Finally, for a pair of tensors A and B, we can define the contracted tensor product, A,BL, which naturally leads to tensor-based linear modeling, tensor inference, and linear tensor model prediction.

AI1××IK×P1××PL   and   B:P1××PL×Q1××QM,A,BL:I1××IK×Q1××QM,A,BL[i1,,iK;q1,,qM]=p1=1P1pL=1PLA[i1,,iK;p1,,pL]B[p1,,pL;q1,,qM].

Notice that in this example the contracted tensor product takes an order-(K + L) tensor and an order-(L + M) and returns a (K + M)-tensor, as opposed to (K + 2L + M)-tensor with an outer product. Specifically, the tensor product contraction clears the common dimensions shared by the two tensors.

Appendix B. Additional 3D Views of the Results

The composite figure below extends the results shown in Figure 4 of the paper by providing additional 3D views of different scenes of the results of each of the proposed 3-phase spacekime analytical protocol.

graphic file with name nihms-1772272-f0001.jpg

graphic file with name nihms-1772272-f0002.jpg

graphic file with name nihms-1772272-f0003.jpg

Appendix C. Implementation of Tensor-based linear modeling

I. Time Complexity and Inferential Uncertainty (TCIU) R-Package Loading

Appendix C.

II. Data Manipulation

Appendix C.

III. Three-phase ROI Analysis

a. Phase1: Detect Potential Activated ROI

III.

b. Phase2: ROI-Based Tensor-on-Tensor Regression

III.

c. Phase3: FDR Correction and Spatial Clustering.

III.

Appendix D. R Implementation of the Laplace Transformation

I. Discrete Laplace Transformation

Appendix D.

Appendix D.

Appendix D.

II. Continuous Inverse Laplace Transformation

Appendix D.

Appendix D.

III. fMRI Data Example

This example illustrates testing LT/ITL on real fMRI time-series. First download the data.

Appendix D.

Define the parametric sampling strategy in the complex plain ().

Appendix D.

Appendix D.

Apply the NuLT to fMRI time-series.

Appendix D.

Recover the function of time by applying the ctILT to the NuLT as analytic complex-valued LTF (function).

Appendix D.

Footnotes

Publisher's Disclaimer: This AM is a PDF file of the manuscript accepted for publication after peer review, when applicable, but does not reflect post-acceptance improvements, or any corrections. Use of this AM is subject to the publisher’s embargo period and AM terms of use. Under no circumstances may this AM be shared or distributed under a Creative Commons or other form of open access license, nor may it be reformatted or enhanced, whether by the Author or third parties. See here for Springer Nature’s terms of use for AM versions of subscription articles: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms

Conflict of Interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

References:

  • 1.Diggle P, et al. , Analysis of longitudinal data. 2002: Oxford University Press. [Google Scholar]
  • 2.Asar Ö, et al. , Joint modelling of repeated measurement and time-to-event data: an introductory tutorial. International journal of epidemiology, 2015. 44(1): p. 334–344. [DOI] [PubMed] [Google Scholar]
  • 3.Dinov I, Data Science and Predictive Analytics: Biomedical and Health Applications using R. Computer Science. 2018: Springer International Publishing. 800. [Google Scholar]
  • 4.Raftery AE, Madigan D, and Volinsky CT, Accounting for model uncertainty in survival analysis improves predictive performance. Bayesian statistics, 1996. 5: p. 323–349. [Google Scholar]
  • 5.Van der Laan MJ, Laan M, and Robins JM, Unified methods for censored longitudinal data and causality. 2003: Springer Science & Business Media. [Google Scholar]
  • 6.Dinov I and Velev M, Data Science: Time Complexity, Inferential Uncertainty, and Spacekime Analytics. 1 ed. STEM Series. 2021, Berlin/Boston: De Gruyter. 450 p. [Google Scholar]
  • 7.Zhou Y, Wong RKW, and He K, Tensor Linear Regression: Degeneracy and Solution. IEEE Access, 2021. 9: p. 7775–7788. [Google Scholar]
  • 8.Tang M, et al. , Model-Based and Model-Free Techniques for Amyotrophic Lateral Sclerosis Diagnostic Prediction and Patient Clustering. Neuroinformatics, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Marino S, et al. , Compressive Big Data Analytics: An Ensemble Meta-Algorithm for High-dimensional Multisource Datasets. PLoS, 2020. 15(8): p. e0228520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Nesselroade JR and Baltes PB, Longitudinal research in the study of behavior and development. 1979: Academic Press. [Google Scholar]
  • 11.Grimm KJ, Davoudzadeh P, and Ram N IV. Developments in the analysis of longitudinal data. Monographs of the Society for Research in Child Development, 2017. 82(2): p. 46–66. [DOI] [PubMed] [Google Scholar]
  • 12.Fitzmaurice GM and Ravichandran C, A primer in longitudinal data analysis. Circulation, 2008. 118(19): p. 2005–2010. [DOI] [PubMed] [Google Scholar]
  • 13.Telzer EH, et al. , Methodological considerations for developmental longitudinal fMRI research. Developmental cognitive neuroscience, 2018. 33: p. 149–160. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Taris T, Longitudinal data analysis. 2000: Sage. [Google Scholar]
  • 15.Wei WW, Time series analysis, in The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2. 2006. [Google Scholar]
  • 16.Verbeke G, et al. , The analysis of multivariate longitudinal data: a review. Statistical methods in medical research, 2014. 23(1): p. 42–59. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Perea J, Topological Time Series Analysis. Notices of the American Mathematical Society, 2019(5): p. 686–694. [Google Scholar]
  • 18.Boissonnat J-D, Chazal F, and Yvinec M, Geometric and topological inference. Vol. 57. 2018: Cambridge University Press. [Google Scholar]
  • 19.Kennedy SM, Roth JD, and Scrofani JW. A novel method for topological embedding of time-series data. in 2018 26th European Signal Processing Conference (EUSIPCO). 2018. IEEE. [Google Scholar]
  • 20.Bao W, Yue J, and Rao Y, A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PloS one, 2017. 12(7): p. e0180944. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Friston KJ, et al. , Analysis of fMRI time-series revisited. Neuroimage, 1995. 2(1): p. 45–53. [DOI] [PubMed] [Google Scholar]
  • 22.Aron AR, Gluck MA, and Poldrack RA, Long-term test–retest reliability of functional MRI in a classification learning task. Neuroimage, 2006. 29(3): p. 1000–1006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Skup M, Longitudinal fMRI analysis: A review of methods. Statistics and its interface, 2010. 3(2): p. 235–252. [PMC free article] [PubMed] [Google Scholar]
  • 24.Kumar N, et al. , IEEE 12th International Conference on Computer Vision. 2009, IEEE. [Google Scholar]
  • 25.Lock EF, Tensor-on-tensor regression. Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America, 2018. 27(3): p. 638–647. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.De Martino F, et al. , Predicting EEG single trial responses with simultaneous fMRI and relevance vector machine regression. Neuroimage, 2011. 56(2): p. 826–36. [DOI] [PubMed] [Google Scholar]
  • 27.Acar E, et al. Tensor-based fusion of EEG and FMRI to understand neurological changes in schizophrenia. in 2017 IEEE International Symposium on Circuits and Systems (ISCAS). 2017. IEEE. [Google Scholar]
  • 28.Ferdowsi S, Abolghasemi V, and Sanei S, A new informed tensor factorization approach to EEG– fMRI fusion. Journal of neuroscience methods, 2015. 254: p. 27–35. [DOI] [PubMed] [Google Scholar]
  • 29.Belding WR, Bases for the Positive Cone of a Partially Ordered Module. Transactions of the American Mathematical Society, 1978: p. 305–313. [Google Scholar]
  • 30.Robinson DJ, An introduction to abstract algebra. 2008: de Gruyter. [Google Scholar]
  • 31.Anderson E, The Problem of Time. 2017: Springer. [Google Scholar]
  • 32.Spiegel MR, Laplace transforms. 1965: McGraw-Hill; New York. [Google Scholar]
  • 33.Adler RJ, The geometry of random fields. 2010: SIAM. [Google Scholar]
  • 34.Worsley K, Local maxima and the expected Euler characteristic of excursion sets of X2, F and T fields. Advances in Applied Probability, 1994. 26: p. 13–42. [Google Scholar]
  • 35.Valsa J and Brančik L, Approximate formulae for numerical inversion of Laplace transforms. International Journal of Numerical Modelling: Electronic Networks, Devices and Fields, 1998. 11(3): p. 153–166. [Google Scholar]
  • 36.Berenstein CA and Yger A, Residue calculus and effective Nullstellensatz. American Journal of Mathematics, 1999. 121(4): p. 723–796. [Google Scholar]
  • 37.Abramowitz M, Stegun IA, and Romer RH, Handbook of mathematical functions with formulas, graphs, and mathematical tables. 1988, American Association of Physics Teachers. [Google Scholar]
  • 38.Peltier SJ, et al. Support vector machine classification of complex fMRI data. in 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2009. [DOI] [PubMed] [Google Scholar]
  • 39.Peltier S, et al. Multivariate Classification of Complex and Multi-echo fMRI Data. in 2013 International Workshop on Pattern Recognition in Neuroimaging. 2013. [Google Scholar]
  • 40.Tabelow K and Polzehl J, Statistical parametric maps for functional MRI experiments in R: The package fmri. 2010: WIAS. [Google Scholar]
  • 41.Eloyan A, et al. , Analytic Programming with fMRI Data: A Quick-Start Guide for Statisticians Using R. PLOS ONE, 2014. 9(2): p. e89470. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Shattuck DW, et al. , Construction of a 3D probabilistic atlas of human cortical structures. NeuroImage, 2008. 39(3): p. 1064–1080. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Mandal PK, Mahajan R, and Dinov ID, Structural Brain Atlases: Design, Rationale, and Applications in Normal and Pathological Cohorts. Journal of Alzheimer’s Disease, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Tang Y, et al. , The construction of a Chinese MRI brain atlas: A morphometric comparison study between Chinese and Caucasian cohorts. Neuroimage, 2010. 51(1): p. 33–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Welvaert M and Rosseel Y, On the definition of signal-to-noise ratio and contrast-to-noise ratio for fMRI data. PloS one, 2013. 8(11): p. e77089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Geissler A, et al. , Contrast-to-noise ratio (CNR) as a quality parameter in fMRI. Journal of Magnetic Resonance Imaging: An Official Journal of the International Society for Magnetic Resonance in Medicine, 2007. 25(6): p. 1263–1270. [DOI] [PubMed] [Google Scholar]
  • 47.Fay MP and Proschan MA, Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Statistics surveys, 2010. 4: p. 1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Benjamini Y and Hochberg Y, Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 1995. 57(1): p. 289–300. [Google Scholar]
  • 49.Lohmann G, et al. , LISA improves statistical analysis for fMRI. Nature Communications, 2018. 9(1): p. 4014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Wu CO and Tian X, Nonparametric models for longitudinal data: With implementation in R. 2018: CRC Press. [DOI] [PubMed] [Google Scholar]
  • 51.Van Montfort K, Oud JH, and Voelkle MC, Continuous time modeling in the behavioral and related sciences. 2018: Springer. [Google Scholar]
  • 52.Molenberghs G and Verbeke G, Models for Discrete Longitudinal Data. 2006: Springer; New York. [Google Scholar]
  • 53.Banus J, et al. , Biophysics-based statistical learning: Application to heart and brain interactions. Medical Image Analysis, 2021. 72: p. 102089. [DOI] [PubMed] [Google Scholar]
  • 54.Abbas IA and Marin M, Analytical Solutions of a Two-Dimensional Generalized Thermoelastic Diffusions Problem Due to Laser Pulse. Iranian Journal of Science and Technology, Transactions of Mechanical Engineering, 2018. 42(1): p. 57–71. [Google Scholar]
  • 55.Marin M, et al. , A domain of influence in the Moore–Gibson–Thompson theory of dipolar bodies. Journal of Taibah University for Science, 2020. 14(1): p. 653–660. [Google Scholar]
  • 56.Kolda TG and Bader BW, Tensor decompositions and applications. SIAM review, 2009. 51(3): p. 455–500. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

1772272_fig_a
1772272_fig_b
1772272_fig_c
1772272_fig_d
1772272_fig_e
1772272_fig_f
1772272_fig_g
1772272_fig_h

RESOURCES