The Analysis of Bounded Count Data in Criminology

  • Original Paper
  • Journal of Quantitative Criminology

Abstract

Background

Criminological research uses several types of delinquency scales, including frequency counts and, increasingly, variety scores. The latter count the number of distinct types of crimes an individual has committed. Variety scores are often modeled via count regression techniques (e.g., Poisson, negative binomial), which are best suited to the analysis of unbounded count data. Variety scores, however, are inherently bounded by the number of crime types in the scale.

Methods

We review common regression approaches for count data and then advocate a more suitable approach for variety scores: binomial regression and zero-inflated binomial regression, which treat a variety score as a series of binomial trials and thus account for the bounding. We demonstrate the approach with two simulations and data from the Fayetteville Youth Study.

Conclusions

Binomial regression generally performs better than traditional regression models when modeling variety scores. Importantly, the interpretation of binomial regression models is straightforward and related to the more familiar logistic regression. We recommend researchers use binomial regression models when faced with variety delinquency scores.
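The link to logistic regression can be written out as a short sketch. The notation here is assumed for illustration and is not taken from the article: y_i is the variety score of respondent i, n is the number of items in the scale, x_i is a covariate vector, and β is the coefficient vector.

```latex
% Assumed notation: y_i = variety score, n = number of scale items,
% x_i = covariates, \beta = coefficients.
\begin{align*}
  y_i \mid x_i &\sim \mathrm{Binomial}(n,\, p_i),
  &
  \log\frac{p_i}{1 - p_i} &= x_i^{\top}\beta, \\
  \mathbb{E}[y_i \mid x_i] &= n\,p_i
    = \frac{n}{1 + \exp(-x_i^{\top}\beta)} \in (0,\, n),
  &
  \text{(Poisson, log link: }
  \mathbb{E}[y_i \mid x_i] &= \exp(x_i^{\top}\beta) \in (0,\, \infty)\text{)}.
\end{align*}
```

Under this formulation, exp(β_k) reads as an odds ratio for endorsing any single item, exactly as in logistic regression, and the expected score can never exceed n. The Poisson mean, by contrast, has no upper limit.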

Notes

  1. When the units of analysis have varying times of exposure, an “offset” (constant) can be added to the equation to adjust for time at risk. Alternatively, if spatial units have different populations at risk, the offset would be the number of people residing in each area, entered on the log scale under the usual log link (see, e.g., Skrondal and Rabe-Hesketh 2004; Osgood 2000).

  2. Variety scales can also be easily constructed from frequency items by dichotomizing each item to a yes/no format. Frequency scales are the prototypical count measure appropriate for Poisson and negative binomial regression models and ostensibly contain more information than variety scales, which do not take into account the number of times each act was committed. However, frequency scales are more prone to skewness and to inflation (or overweighting) of less-serious acts. In a study of scaling in criminological research, Sweeten (2012: 552) found that “Of the non-dichotomous scales assessed in this paper, frequency scales are the most problematic. This scale is very sensitive to high frequency items… Frequency scales are very strongly skewed.” He concluded that variety scales were the best alternative to more complex formulations such as Item Response Theory theta measures (see Osgood et al. 2002). Research has also indicated that the psychometric properties of variety scales are superior to those of simple frequency measures (see Bendixen et al. 2003; Hindelang et al. 1981). Bendixen et al. (2003), comparing frequency scales to variety scores, found that the latter had stronger reliability (both internal and test–retest) and convergent validity (relationships with correlates of crime). Thus, the use of variety scores is recommended when combining items into an overall crime/delinquency scale.

  3. Since the data were generated without overdispersion, there was no need for the illustrations here to estimate negative binomial or zero-inflated models.

  4. Note that the Fayetteville Youth Study codebook indicates “Importance of grades” is coded A-very important (lower values) to D-completely unimportant (higher values). The original data for the analyses are no longer available, but it is reasonable to assume this item was reverse coded.

  5. We also estimated negative binomial models. Consistent with Berk and MacDonald’s (2008) findings, when we estimated a negative binomial model with no covariates or a limited slate of covariates (clearly misspecified models), there was evidence of overdispersion. For example, omitting the item on number of friends picked up by the police from the model resulted in evidence of overdispersion, but once it was included the estimate of dispersion shrank significantly. Once we included all nine covariates, the estimate of the dispersion coefficient was not significantly different from 0. While we make no claims that the simple model included here is fully specified (it clearly is not), these results do suggest that the use of the negative binomial model is sensitive to the covariates included in the analysis. More importantly, without evidence of overdispersion, the results were identical to those of the Poisson or zero-inflated Poisson models and are therefore not presented here.

  6. In Stata, binomial regression can be implemented with the glm command by specifying the binomial family with the number of trials and a logit link. For example: “glm DV IV, family(binomial #trials) link(logit)”, where #trials is the number of items in the variety scale.
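
The dichotomization described in note 2 and the binomial model in note 6 can be illustrated with a minimal Python sketch. This is not the authors' code. The data, the covariate, and the function names (variety_score, fit_binomial_logit) are hypothetical. The fit is a plain Newton-Raphson routine for a single covariate, written only to show the mechanics that a command such as Stata's glm handles internally.

```python
import math

def inv_logit(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def variety_score(frequencies):
    """Dichotomize frequency items (any count > 0 becomes 1) and sum them."""
    return sum(1 for f in frequencies if f > 0)

def fit_binomial_logit(x, y, n_trials, iterations=50):
    """Newton-Raphson for y_i ~ Binomial(n_trials, p_i), logit(p_i) = b0 + b1*x_i."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(b0 + b1 * xi)
            resid = yi - n_trials * p          # observed minus expected score
            w = n_trials * p * (1.0 - p)       # binomial variance weight
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det    # solve the 2x2 Newton system
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    return b0, b1

# Hypothetical self-reported frequencies for eight respondents on five items.
N_ITEMS = 5
frequency_items = [
    [0, 0, 0, 0, 0],
    [3, 0, 0, 0, 0],
    [0, 12, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [5, 2, 0, 7, 0],
    [2, 1, 1, 0, 4],
    [0, 0, 2, 0, 0],
    [1, 1, 0, 0, 0],
]
# Hypothetical covariate, e.g. number of friends picked up by the police.
x = [0, 1, 2, 0, 3, 4, 2, 1]

y = [variety_score(row) for row in frequency_items]
b0, b1 = fit_binomial_logit(x, y, N_ITEMS)

# Expected variety scores are N_ITEMS * p_i, so they stay inside the bound.
fitted = [N_ITEMS * inv_logit(b0 + b1 * xi) for xi in x]
print("variety scores:", y)
print("intercept, slope:", b0, b1)
print("odds ratio per unit of x:", math.exp(b1))
print("fitted means:", fitted)
```

Note that the respondent reporting 12 instances of one act contributes only 1 to the variety score for that item, which is the sense in which variety scales avoid overweighting high-frequency items.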

References

  • Bendixen M, Endresen IM, Olweus D (2003) Variety and frequency scales of antisocial involvement: which one is better? Legal Criminol Psychol 8:135–150

  • Berk R, MacDonald JM (2008) Overdispersion and Poisson regression. J Quant Criminol 24:269–284

  • Burt CH, Simons RL (2013) Self-control, thrill seeking, and crime: motivation matters. Crim Justice Behav 40:1326–1348

  • Cameron C, Trivedi P (1998) Models for count data. Cambridge University Press, New York

  • Cheng SL, Micheals R, Lu ZQJ (2010) Comparison of confidence intervals for large operational biometric data by parametric and non-parametric methods. NISTIR 7740. U.S. Department of Commerce, National Institute of Standards and Technology. http://ws680.nist.gov/publication/get_pdf.cfm?pub_id=906844

  • Decker S, Katz C, Webb VJ (2008) Understanding the black box of gang organization. Crime Delinq 54:153–172

  • Elliott DS, Ageton SS (1980) Reconciling race and class differences in self-reported and official estimates of delinquency. Am Sociol Rev 45:95–110

  • Esbensen FA, Osgood DW, Peterson D, Taylor TJ, Carson DC (2013) Short- and long-term outcome results from a multisite evaluation of the GREAT program. Criminol Public Policy 12:375–411

  • Hilbe JM (2011) Negative binomial regression. Cambridge University Press, Cambridge

  • Hindelang MJ, Hirschi T, Weis JG (1979) Correlates of delinquency: the illusion of discrepancy between self-report and official measures. Am Sociol Rev 44:995–1014

  • Hindelang MJ, Hirschi T, Weis JG (1981) Measuring delinquency. Sage Publications, Beverly Hills

  • Hirschi T (1969) Causes of delinquency. University of California Press, Berkeley

  • Long JS (1997) Regression models for categorical and limited dependent variables. Advanced quantitative techniques in the social sciences. Sage, Thousand Oaks

  • Lussier P, LeBlanc M, Proulx J (2005) The generality of criminal behavior: a confirmatory factor analysis of the criminal activity of sex offenders in adulthood. J Crim Justice 33:177–189

  • MacDonald JM, Lattimore PK (2010) Count models in criminology. In: Piquero AR, Weisburd D (eds) Handbook of quantitative criminology. Springer, New York, pp 683–698

  • McCullagh P, Nelder JA (1989) Generalized linear models. Chapman & Hall, New York

  • Osgood DW (2000) Poisson-based regression analysis of aggregate crime rates. J Quant Criminol 16:21–43

  • Osgood DW, McMorris BJ, Potenza MT (2002) Analyzing multiple-item measures of crime and deviance I: item response theory scaling. J Quant Criminol 18:267–296

  • Skrondal A, Rabe-Hesketh S (2004) Generalized latent variable modeling: multilevel, longitudinal, and structural equation models. CRC Press, Boca Raton

  • Sweeten G (2012) Scaling criminal offending. J Quant Criminol 28:533–557

  • Sweeten G, Piquero AR, Steinberg L (2013) Age and the explanation of crime, revisited. J Youth Adolesc 42:921–938

  • Tittle CR, Villemez WJ, Smith DA (1978) The myth of social class and criminality: an empirical assessment of the empirical evidence. Am Sociol Rev 43:643–656

  • Welch MR, Tittle CR, Yonkoski J, Meidinger N, Grasmick HG (2008) Social integration, self-control, and conformity. J Quant Criminol 24:73–92

  • Winkelmann R (2008) Econometric analysis of count data. Springer, New York

Author information

Correspondence to Michael Rocque.

About this article

Cite this article

Britt, C.L., Rocque, M. & Zimmerman, G.M. The Analysis of Bounded Count Data in Criminology. J Quant Criminol 34, 591–607 (2018). https://doi.org/10.1007/s10940-017-9346-9

  • DOI: https://doi.org/10.1007/s10940-017-9346-9
