Introducing fastpos: A Fast R Implementation to Find the Critical Point of Stability for a Correlation

Introducing fastpos: A Fast R Implementation to Find the Critical Point of Stability for a Correlation

The R package fastpos provides a fast algorithm to estimate the required sample size for a Pearson correlation to stabilize (Schönbrodt and Perugini 2013). The stability approach is an innovative alternative to other means of sample size planning, such as power analysis. Although the approach is young, it has already attracted much interest in the research community. Still, to date, there exists no easy way to use the stability approach because there is no analytical solution and a simulation approach is computationally expensive with a quadratic time complexity. The presented package overcomes this limitation by speeding up the calculation of correlations and achieving linear time complexity. For typical parameters, the theoretical speedup is around a factor of 250, which was empirically confirmed in a comparison with the original implementation corEvol. This speedup allows practitioners to use the stability approach to plan for sample size and theoreticians to further explore the method.

Johannes Titz https://johannestitz.com (Chemnitz University of Technology)
2022-12-12

1 Sample size planning with the stability approach

Sample size planning is one of the most crucial steps before conducting an empirical study. The approach-avoidance conflict lies in the desire for reliable conclusions, but the unwillingness to spend resources for large samples. To balance benefit and cost there exist three more or less established paths: power analysis (e.g. Cohen 1988), accuracy in parameter estimation [AIPE; e.g. Maxwell et al. (2008)] and interval based accuracy methods (Algina and Olejnik 2003). Recently, a fourth way was introduced: stability (Schönbrodt and Perugini 2013). The general idea of this approach is to determine the sample size at which a certain percentage of studies will fall into an priori specified interval and stay in this interval if the sample size is increased further. For instance, if the population correlation is 0.5, one can define the limits to be 0.4 and 0.6. Given these constraints, what sample size is required to guarantee, with a certain probability (e.g. 90%), that the correlation coefficient will not drop below 0.4 or rise above 0.6 if more participants are added. This sample size is also referred to as the critical point of stability for the specific parameters. The stability approach is promising because it (1) focuses on the effect size instead of significance and (2) is fairly intuitive. Indeed, the interest in the method is growing, evident in more than 1500 citations of the original publication. But a proper software package for the stability approach is still missing.

When the concept was introduced, the authors presented a collection of R scripts (corEvol, available at a github repository: https://github.com/nicebread/corEvol) to derive a sample size table for certain parameters. This implementation is too slow to plan the sample size for an individual study as it can take hours to get reliable results. In this article a faster implementation of the stability approach is introduced available in the R package fastpos with the function find_critical_pos.

2 Model and implementations

The general model can be shortly described as follows: Define a population correlation \(\rho\), the corridor of stability with lower limit \(l\) and upper limit \(u\) and a confidence \(1-\alpha\). Now, pairs of values from a bivariate normal distribution with correlation \(\rho\) are drawn. In a first step \(n_\mathrm{min}\) pairs are drawn, to which, repeatedly, one more pair is added so that the sample size \(n\) is sequentially increased by 1. For every \(n\) the correlation \(r_n\) is calculated. The point of stability \(n_\mathrm{pos}\) can be described as:

\[\begin{equation} n_\mathrm{pos}=\min{\left\{ n\in \mathbb{N}|l\leq r_{m}\leq u,\forall m\geq n\right\}} \tag{1} \end{equation}\]

Meaning that the corridor of stability is not left again after the point of stability has been crossed and that the corridor of stability was just entered at the point of stability. Note that \(n_\mathrm{pos}\) is a random variable that has to be evaluated with respect to the normal bivariate distribution. The critical point of stability is the quantile \(1-\alpha\) of the probability density function of \(n_\mathrm{pos}\). It is possible to calculate the transition probabilities of entering, leaving or staying in the corridor of stability for two neighboring sample sizes \(n\) and \(n+1\). But, so far, no analytical solution to calculate the critical point of stability has been proposed.

Instead, Schönbrodt and Perugini (2013) set up a Monte Carlo simulation to produce a sample size table for some parameter combinations. In a simulation a maximum sample size \(n_\mathrm{max}\) has to be chosen. Then, for every \(n\) from \(n_\mathrm{min}\) to \(n_\mathrm{max}\) the correlation can be calculated. The point of stability for one simulation study can again be described by the above condition. From many of such studies, the critical point of stability can be estimated for the desired confidence.

In the original implementation, the correlations were calculated from scratch for each \(n\), using the function cor from stats. This is slow as several millions of correlations have to be calculated for a reliable estimate. The correlations at \(n\) and \(n+1\) only differ by one pair of values, which can be exploited for speed. Take the sum formula for the correlation coefficient at a specific sample size \(n\):

\[\begin{equation} r_{n} = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i} {\sqrt{n\sum_{i=1}^{n} x_i^2-\left(\sum_{i=1}^{n} x_i\right)^2} \sqrt{n\sum_{i=1}^{n} y_i^2-\left(\sum_{i=1}^{n} y_i\right)^2}} \tag{2} \end{equation}\]

Several sums are calculated, each consisting of adding up \(n\) terms. In corEvol this is done for every sample size from the minimum to the maximum one. Thus, the total number of added terms for one sum is:

\[\begin{equation} \sum_{n=n_\mathrm{min}}^{n_\mathrm{max}}n = \sum_{n=1}^{n_\mathrm{max}}n - \sum_{n=1}^{n_\mathrm{min}-1}n = \frac{n_\mathrm{max}\left( n_\mathrm{max}+1\right)}{2} - \frac{\left( n_\mathrm{min}-1\right) \left( n_\mathrm{min}-1+1\right)}{2} \tag{3} \end{equation}\]

The variable \(n_\mathrm{min}\) can be ignored as it is usually a small value and could even be set to 2. Furthermore, the number of sums in the correlation formula will be the same for every algorithm and is a constant. Dropping constant factors and lower order terms, the time complexity of the described algorithm is \(\mathcal{O}(n_\mathrm{max}^2)\).

In contrast, fastpos calculates the correlation for the maximum sample size first. This requires to add \(n_\mathrm{max}\) numbers for one sum. Then it subtracts one value from this sum to find the correlation for the sample size \(n_\mathrm{max}-1\), which happens repeatedly until the minimum sample size is reached (or the corridor is left). In the worst case, the total number of terms for one sum amounts to:

\[\begin{equation} n_\mathrm{max}+n_\mathrm{max}-n_\mathrm{min} \tag{4} \end{equation}\]

Again, dropping constant factors and lower order terms, the time complexity of this algorithm is \(\mathcal{O}(n_\mathrm{max})\). The ratio between the two approaches is:

\[\begin{equation} \frac{n_\mathrm{max}\left(n_\mathrm{max}+1\right) -\left(n_\mathrm{min}-1\right)n_\mathrm{min}}{4n_\mathrm{max}-2n_\mathrm{min}} \tag{5} \end{equation}\]

For the typically used \(n_\mathrm{max}\) of 1,000 and \(n_\mathrm{min}\) of 20, a speedup of about 250 can be expected. From a theoretical perspective it is also interesting to study the stability approach with larger values of \(n_\mathrm{max}\), for which the difference becomes even more pronounced.

The theoretical speedup is only an approximation for several reasons. First, one can stop the algorithm when the corridor is left the first time, which is done in fastpos but not in corEvol. Second, the main function of fastpos was written in C++ (via Rcpp, Eddelbuettel et al. 2022), which is much faster than normal R. At the same time, the algorithms contain many more steps than just calculating correlations. For instance, setting up the population with a specific \(\rho\) takes some time since it usually consists of a million value pairs. The interface functions to setup the simulations also play a role, especially when the algorithm itself is very fast. Thus, it is necessary to study the speed benefit empirically. But before running a benchmark it will be useful to show (1) how to use fastpos in general and (2) that it produces the same estimates as corEvol.

3 How to use fastpos

For a simple illustration, imagine you plan an empirical study and believe the population correlation is 0.6. You would be happy to find a stable correlation between 0.5 and 0.7 with a probability of 80%. What this means is that there is an 80% chance of finding a correlation between 0.5 and 0.7 and by adding more participants this corridor is not left again. In fastpos you can run:

library(fastpos)
set.seed(20200219)
find_critical_pos(rho = 0.6, precision_absolute = 0.1, confidence_levels = .8,
                  sample_size_min = 20, sample_size_max = 1e3, n_studies = 1e4)
  rho_pop pos.80% sample_size_min sample_size_max lower_limit
1     0.6     104              20            1000         0.5
  upper_limit n_studies n_not_breached precision_absolute
1         0.7     10000              0                0.1
  precision_relative
1                 NA

This loads the package, sets a seed for reproducibility, and runs the simulation with default parameters (except for the ones specifically set). A progress bar is displayed if run in interactive mode. The result is a critical point of stability of 104.

The main function of the package find_critical_pos will usually suffice for most use cases. Its parameters are documented in detail in the package. The population correlation (rho) and the number of simulation studies (n_studies) is self-explanatory. The chosen precision (precision_absolute) of 0.1 (i.e. the half-width) will result in the desired corridor between 0.5 and 0.7. There is also a convenience argument to set the precision as a relative value, precision_relative, which will override precision_absolute. For instance, precision_relative = 0.1 produces an interval of \(\rho \pm \rho \cdot 0.1\). Alternatively, one can also provide the lower and upper limit of the corridor directly via lower_limit and upper_limit. This is especially useful if the corridor is not symmetric. Notable, most parameters can also take vectors so it is possible to run multiple simulations for different rho values (and corresponding other parameters) at once.

The parameter confidence_levels defines the quantile corresponding to the critical point of stability. This parameter can be a single value or a vector, but is fixed for all rho values. If different confidence levels are of interest, providing them as a vector saves a lot of resources because one simulation can be used to calculate the critical points of stability for all confidence levels.

The parameters sample_size_min and sample_size_max set the minimum and maximum sample size of one simulation study. As in corEvol they default to 20 and 1,000. This means a sample of 20 observation pairs is drawn from the population and step by step one more observation is added until the sample size of 1,000 is reached.

The output summarizes the individually set (and default) parameters as well as the critical point of stability of about 104. The value will change slightly from run to run because only 10,000 simulations are done here. In practice one can make a quick estimate with the default parameters and then increase the number of simulation studies for a more robust estimate. Under GNU/Linux one can also take advantage of the multicore support (parameter n_cores). This functionality is currently implemented via the pbmcapply package (Kuang et al. 2022), which is based on parallel.1

For another illustration let us reproduce Schönbrodt and Perugini (2013)’s oft-cited table of the critical points of stability for an absolute precision of 0.1 (meaning that the corridor will be \(\rho\pm.1\)). We take advantage of the vectorized input option by providing several \(\rho\) values at once. Furthermore, we increase the number of studies to 100,000 to get accurate estimates. To cache the simulation results we use simpleCache (Nagraj and Sheffield 2021):

library(simpleCache)
setCacheDir("titz_cache")
simpleCache("sim2", {find_critical_pos(rho = seq(.1, .7, .1), n_studies = 1e5)})
sim2
  rho_pop pos.80% pos.90% pos.95% sample_size_min sample_size_max
1     0.1     253     363     478              20            1000
2     0.2     237     339     448              20            1000
3     0.3     212     305     404              20            1000
4     0.4     181     262     343              20            1000
5     0.5     143     208     277              20            1000
6     0.6     103     150     200              20            1000
7     0.7      65      96     129              20            1000
  lower_limit upper_limit n_studies n_not_breached precision_absolute
1         0.0         0.2     1e+05            139                0.1
2         0.1         0.3     1e+05            102                0.1
3         0.2         0.4     1e+05             43                0.1
4         0.3         0.5     1e+05             15                0.1
5         0.4         0.6     1e+05              5                0.1
6         0.5         0.7     1e+05              0                0.1
7         0.6         0.8     1e+05              0                0.1
  precision_relative
1                 NA
2                 NA
3                 NA
4                 NA
5                 NA
6                 NA
7                 NA

The results are very close to the original publication (Schönbrodt and Perugini 2013). Note that a warning is shown because in some simulations the point of stability was not found. This is not too surprising as one can easily imagine an extreme outlier study that, for instance, starts at a negative correlation with \(n=20\) and does not reach the specified corridor of stability at the maximum sample size of \(n=1,000\). There are different ways to handle these outliers, which will affect the estimate.

4 Handling outliers

When comparing the table from above with the one in Schönbrodt and Perugini (2013), one should notice that fastpos usually produces larger estimates. To illustrate this more reliably we need to increase the number of studies, so that random fluctuations are minimized. Here we will run 100 simulations with 1,000,000 studies each.2

simpleCache("sim3", {find_critical_pos(rho = rep(0.1, 100),
                                       sample_size_max = 1e3, n_studies = 1e6)})

A good summary of the data is the mean and the standard error of the distribution. Before calculating these statistics, we select only the points of stability from the result:

sim3 <- sim3[, c("pos.80%", "pos.90%", "pos.95%")]
colMeans(sim3)
 pos.80%  pos.90%  pos.95% 
253.2020 363.2100 477.5905 
round(apply(sim3, 2, sd), 3)
pos.80% pos.90% pos.95% 
  0.603   0.729   1.035 

The average estimates are 253, 363 and 478 (with reasonably small standard errors), while in Schönbrodt and Perugini (2013) they are 252, 362, 470 and in Schönbrodt and Perugini (2018) 252, 360 and 474. Note that in every case fastpos gives a slightly larger estimate, which is not just a random fluctuation but related to the warning. In corEvol, if the corridor of stability is not reached, the respective study is ignored when calculating the critical point of stability. This leads to a systematic underestimation of the critical point of stability.

To illustrate this, we can use the lower level functions create_pop and simulate_pos to create a distribution of points of stability. In the following, the first line creates a population with a specific correlation and the second line produces several points of stability by drawing from this population. In contrast to the main function of the package (find_critical_pos), the function simulate_pos does not calculate quantiles, but only generates points of stability.

pop <- create_pop(rho = 0.1, size = 1e6)
simpleCache("sim4", {simulate_pos(x_pop = pop[, 1], y_pop = pop[, 2],
                                  n_studies = 1e6, sample_size_min = 20,
                                  sample_size_max = 1e3, replace = TRUE,
                                  lower_limit = 0, upper_limit = 0.2,
                                  progress = FALSE)})

There are two ways to calculate the quantiles of interest:

quantile(sim4, c(.8, .9, .95), na.rm = TRUE)
80% 90% 95% 
252 361 473 
sim4b <- ifelse(is.na(sim4), 1e3, sim4)
quantile(sim4b, c(.8, .9, .95))
80% 90% 95% 
253 363 478 

In the first calculation, the studies that did not reach the corridor of stability are ignored (like in corEvol), while in the second calculation it is assumed that the point of stability was reached at the maximum sample size. When repeating this simulation, the values will vary slightly but the second method will never produce smaller estimates. That the second method is more accurate can be tested by increasing the maximum sample size (to avoid studies that do not reach the corridor of stability). Here, we will set the maximum sample size to 5,000:

simpleCache("sim5", {find_critical_pos(rho = rep(0.1, 100),
                                       sample_size_max = 5e3,
                                       n_studies = 1e6)})
sim5 <- sim5[, c("pos.80%", "pos.90%", "pos.95%")]
colMeans(sim5)
 pos.80%  pos.90%  pos.95% 
253.3100 363.5100 478.4305 
round(apply(sim5, 2, sd), 3)
pos.80% pos.90% pos.95% 
  0.631   0.870   1.266 

If every study reaches the point of stability, the estimates are 253, 364 and 478. When the maximum sample size is too small (as in the second to last simulation), fastpos is indeed closer to these estimates than corEvol. While the difference to corEvol might seem practically negligible, corEvol’s estimates are clearly biased. Furthermore, depending on the parameters, the problem can become more severe. A very narrow corridor will lead to many studies not reaching the corridor, which corEvol will not even notice. On the other hand, fastpos will throw a warning, which should be taken seriously.

But even fastpos might underestimate the critical point of stability if the maximum sample size is too small: All estimates with a maximum sample size of 5,000 are slightly larger than the ones with a maximum sample size of 1,000. With a larger maximum sample size, there are more opportunities to leave the corridor again. At some point the probability of this event is very low because the corridor limits are too far away, but the probability is not 0. Thus, increasing the maximum sample size even further (here to 10,000) should lead to slightly larger estimates:

simpleCache("sim6", {find_critical_pos(rho = rep(0.1, 100),
                                       sample_size_max = 1e4, 
                                       n_studies = 1e6)})
sim6 <- sim6[, c("pos.80%", "pos.90%", "pos.95%")]
colMeans(sim6)
 pos.80%  pos.90%  pos.95% 
253.4000 363.7000 478.7005 
round(apply(sim6, 2, sd), 3)
pos.80% pos.90% pos.95% 
  0.667   0.937   1.234 

Indeed, all estimates are slightly larger but after rounding to a whole number only for the confidence of 95% the critical point of stability changes from 478 to 479. Furthermore, the randomness of the simulations permits such fluctuations since the standard errors are about 1. But note that all estimates increase when the maximum sample size changes from 1,000 to 5,000 and then to 10,000, which is a clear hint for a bias. Nonetheless, it appears unlikely that the estimates would increase much, when the maximum sample size grows further. The remaining problem is that the theoretical idea of stability assumes an infinite maximum sample size or, at least, that the maximum sample size is equal to the population size. It is therefore of some technical and practical interest to investigate the relationship between the maximum sample size and the critical point of stability in a dedicated simulation study with fastpos. Such a study would not be easy to approach with corEvol because of the quadratic time complexity. In the next section the speed difference between both packages is demonstrated empirically in a benchmark.

5 Benchmark

corEvol was written as a script for a simulation study and cannot be simply called via a function in a package. Thus, a helper function will be used that sources the script files. To make the benchmark reproducible, the original repository corEvol was forked and a benchmark branch created. With git and a shell installed, the following tries to update the repository in the corEvol folder. If this is unsuccessful (the folder does not exist), the repository is cloned.

git -C corEvol pull || git clone --single-branch --branch benchmark \
  https://github.com/johannes-titz/corEvol

Alternatively, you can download the required files from the supplementary material of this article.

For corEvol, two files are sourced for the benchmark. The first file generates the simulations and the second calculates the critical point of stability. In corEvol a simulation run takes a lot of time and thus it is not practical to run it too many times. But since the expected speed difference between both implementations is substantial, this should not be a concern. Here, ten repetitions were done with the microbenchmark (Mersmann 2021) package. The code was run on a Dell Server r6515 with an AMD EPYC 7302P CPU. Only one core was used to not confound the result with the specific parallel implementation.

library(microbenchmark)
corevol <- function() {
  setwd("corEvol")
  source("01-simdata.R")
  source("02-analyse.R")
  setwd("../")
}
fastpos <- function() {
  find_critical_pos(rho = .1, sample_size_max = 1e3, n_studies = 1e4,
                    progress = FALSE)
}
simpleCache("bm", {microbenchmark(corevol = corevol(), fastpos = fastpos(),
                                  times = 10, unit = "s")})
summary(bm)
     expr         min         lq        mean     median          uq
1 corevol 350.4551133 352.642384 355.7306834 355.221224 358.2854179
2 fastpos   0.5692708   0.579922   0.6215005   0.596842   0.6066209
          max neval cld
1 365.8751542    10   b
2   0.8496276    10  a 

For the chosen parameters, fastpos is about 572 times faster than corEvol, for which there are two main reasons: (1) fastpos is built around a C++ function via Rcpp and (2) this function does not calculate every calculation from scratch, but only calculates the difference between the correlation at sample size \(n\) and \(n-1\) via the sum formula of the Pearson correlation (see Equation (2)). There are some other factors that might play a role, but they cannot account for the large difference found. For instance, setting up a population takes quite long in corEvol (about 17s), but compared to the 6 minutes required overall, this is only a small fraction. There are other parts of the corEvol code that are fated to be slow, but again, a speedup by a factor of 572 cannot be achieved by improving these parts. The presented benchmark is not comprehensive, but still demonstrates that fastpos can be used with no significant waiting time for a typical scenario, while for corEvol this is not the case.

Another benchmark on a local i5-3320 2.6 GHz CPU from 2012 resulted in means of 1.5s for fastpos and 603s for corEvol giving a speedup of around 400. Thus, even on older CPUs and single-cored fastpos delivers almost instantly for default parameters.

6 Other effect sizes

The focus of fastpos is on the Pearson correlation as the effect size. In principle the stability approach can be extended to all sorts of effect sizes or even other statistical parameters. Since the original authors studied the Pearson correlation, it made sense to improve the algorithm for this specific use case. But mathematical shortcuts as in Equation (2) should also exist for other effect sizes and might be implemented in the future.

A simple alternative for applying the method to other effect sizes is to convert these effects to the Pearson correlation. Such conversions are very common in meta-analyses, where a consistent effect size must be used across all studies to calculate a meaningful average effect. Standard approximate conversion formulas can be found in text books on research methods (Sedlmeier and Renkewitz 2018; Borenstein et al. 2021). Several packages in R also provide these conversions. For instance, effectsize (Ben-Shachar et al. 2020) includes the functions d_to_r and r_to_d. d_to_r is based on the approximation \(r = \frac{d}{d^2+4}\), which should only be used for equal group sizes. As an example, consider \(d=0.5\) between two equally sized groups and a corridor with limits of 0.4 and 0.6.

r <- effectsize::d_to_r(0.5)
lower_limit <- effectsize::d_to_r(0.4)
upper_limit <- effectsize::d_to_r(0.6)
simpleCache("sim7", {find_critical_pos(rho = r, sample_size_max = 11e3,
                                       n_studies = 1e5,
                                       lower_limit = lower_limit,
                                       upper_limit = upper_limit)})
sim7
    rho_pop pos.80% pos.90% pos.95% sample_size_min sample_size_max
1 0.2425356    1119    1606    2108              20           11000
  lower_limit upper_limit n_studies n_not_breached precision_absolute
1   0.1961161   0.2873479     1e+05              0                 NA
  precision_relative
1                 NA

The corresponding Pearson correlation for \(d=0.5\pm0.1\) is about 0.24, with very narrow and slightly asymmetric limits (0.20 to 0.29). The critical point of stability is 2108 for a confidence level of 95%.

7 Summary

In this article, fastpos, a package for estimating the critical point of stability was introduced. The package is much faster than the original implementation and can be conveniently used for sample size planning as well as Monte Carlo simulation studies. While the original implementation ignores studies that do not reach the corridor of stability, fastpos takes them into account and gives a more conservative and more accurate estimate (i.e. a larger critical point of stability). From a practitioner’s perspective, this detail might be negligible for typical parameters and relatively wide corridors. But from a statistical perspective, this detail is of relevance and further simulation studies are required to better understand the stability approach in general. Finally, a comparison to other methods of sample size planning would be of much interest and could influence how empirical scientists plan for sample size in the future. fastpos can be a useful tool to achieve these goals.

8 Acknowledgment

I want to thank Matthias Hörr and Thomas Schäfer for insightful discussions about the stability approach. Furthermore, I want to thank an anonymous reviewer for many helpful suggestions on how to improve the article and the package.

Supplementary materials

Supplementary materials are available in addition to this article. It can be downloaded at RJ-2022-041.zip

CRAN packages used

fastpos, Rcpp, pbmcapply, simpleCache, microbenchmark, effectsize

CRAN Task Views implied by cited packages

HighPerformanceComputing, MetaAnalysis, NumericalMathematics

J. Algina and S. Olejnik. Sample size tables for correlation analysis with applications in partial correlation and multiple regression analysis. Multivariate Behavioral Research, 38: 309–323, 2003. URL https://doi.org/10.1207/S15327906MBR3803_02.
M. S. Ben-Shachar, D. Lüdecke and D. Makowski. effectsize: Estimation of effect size indices and standardized parameters. Journal of Open Source Software, 5(56): 2815, 2020. URL https://doi.org/10.21105/joss.02815.
M. Borenstein, L. V. Hedges, J. P. Higgins and H. R. Rothstein. Introduction to meta-analysis. John Wiley & Sons, 2021.
J. Cohen. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988.
D. Eddelbuettel, R. Francois, J. Allaire, K. Ushey, Q. Kou, N. Russell, I. Ucar, D. Bates and J. Chambers. Rcpp: Seamless r and c++ integration. 2022. URL https://CRAN.R-project.org/package=Rcpp. R package version 1.0.9.
K. Kuang, Q. Kong and F. Napolitano. pbmcapply: Tracking the progress of mc*pply with progress bar. 2022. URL https://CRAN.R-project.org/package=pbmcapply. R package version 1.5.1.
S. E. Maxwell, K. Kelley and J. R. Rausch. Sample size planning for statistical power and accuracy in parameter estimation. Annual Review of Psychology, 59: 537–563, 2008. URL https://doi.org/10.1146/annurev.psych.59.103006.093735.
O. Mersmann. microbenchmark: Accurate timing functions. 2021. URL https://CRAN.R-project.org/package=microbenchmark. R package version 1.4.9.
V. Nagraj and N. Sheffield. simpleCache: Simply caching r objects. 2021. URL https://CRAN.R-project.org/package=simpleCache. R package version 0.4.2.
F. D. Schönbrodt and M. Perugini. At what sample size do correlations stabilize? Journal of Research in Personality, 47: 609–612, 2013. URL https://doi.org/10.1016/j.jrp.2013.05.009.
F. D. Schönbrodt and M. Perugini. Corrigendum to At What Sample Size Do Correlations Stabilize?” [J. Res. Pers. 47 (2013) 609–612]. Journal of Research in Personality, 74: 194, 2018. URL https://doi.org/10.1016/j.jrp.2018.02.010.
P. Sedlmeier and F. Renkewitz. Forschungsmethoden und Statistik für Psychologen und Sozialwissenschaftler. 3rd ed Hallbergmoos, Germany: Pearson Studium, 2018.

  1. The multicore support will not be demonstrated here because it is difficult to create reproducible examples across different operating systems and number of cores.↩︎

  2. It is worth noting that (with a single core) this simulation would take several weeks to complete with corEvol but only takes about 66 minutes with fastpos.↩︎

References

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Titz, "Introducing fastpos: A Fast R Implementation to Find the Critical Point of Stability for a Correlation", The R Journal, 2022

BibTeX citation

@article{RJ-2022-041,
  author = {Titz, Johannes},
  title = {Introducing fastpos: A Fast R Implementation to Find the Critical Point of Stability for a Correlation},
  journal = {The R Journal},
  year = {2022},
  note = {https://doi.org/10.32614/RJ-2022-041},
  doi = {10.32614/RJ-2022-041},
  volume = {14},
  issue = {3},
  issn = {2073-4859},
  pages = {5-12}
}