Optimal sampling strategies in the coupon collector’s problem with unknown population size | Annals of Operations Research
Skip to main content

Optimal sampling strategies in the coupon collector’s problem with unknown population size

  • Published:
Annals of Operations Research Aims and scope Submit manuscript

Abstract

In doing a six-sigma analysis of a process one must first determine the set of possible factors that potentially drive the response of interest. This stage of the the work, known as process mapping, is time consuming. Spending too much time on it wastes investigators and employees time. Yet, spending too little time may result in failing to uncover important factors that drive the process. We model this situation as a general coupon collector’s problem with N distinct coupons, where the exact value of N is not known. Our objective is to devise effective strategies to minimize the total cost incurred due to sampling in addition to the cost of unidentified coupons when the collector stops sampling. We propose a policy based on an asymptotic analysis when N is large and prove that the proposed policy is asymptotically optimal. We also illustrate the effectiveness of this policy with numerical experiments.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Fig. 1

Similar content being viewed by others

References

  • Billingsley, P. (1999). Convergence of probability measures. New York, NY: Wiley.

    Book  Google Scholar 

  • Boneh, A., & Hofri, M., (1997). The coupon-collector problem revisited—a survey of engineering problems and computational methods. Communications in Statistics—Stochastic Models, 13, 39–66.

    Article  Google Scholar 

  • Boneh, S., & Papanicolaou, V. G., (1996). General asymptotic estimates for the coupon collector problem. Journal of Computational and Applied Mathematics, 67, 277–289.

    Article  Google Scholar 

  • Bunge, J., & Fitzpatrick, M., (1993). Estimating the number of species: A review. Journal of the American Statistical Association, 88(421), 364–373.

    Google Scholar 

  • Finkelstein, M., Tucker, H. G., & Veeh, J. A., (1998). Confidence intervals for the number of unseen types. Statistics and Probability Letters, 37, 423–430.

    Article  Google Scholar 

  • Gadrich, T., & Ravid, R., (2008). The coupon collector problem in statistical quality control. In S. I. Ao, L. Gelman, D. Hukins, A. Hunter, & A. M.Korsunsky (Eds.), Proceedings of the world congress on engineering (vol. 2, pp. 905–909). London: International Association of Engineers, Newswood Limited.

  • Holst, L. (1972). Asymptotic normality in a generalized occupancy problem. Probability Theory and Related Fields, 21, 109–120.

    Google Scholar 

  • Holst, L. (1973). Some limit theorems with applications in sampling theory. The Annals of Statistics, 1, 644–658.

    Article  Google Scholar 

  • Kan, N. D., & Nevzorov, V. B. (2005). On some limit laws in the coupon collector’s problem. Journal of Mathematical Sciences, 128, 2564–2568.

    Article  Google Scholar 

  • Luko, S. N. (2009). The “coupon collector’s problem” and quality control. Quality Engineering, 21, 168–181.

    Article  Google Scholar 

  • Pande, P., & Holpp, L. (2001). What is six sigma?. New York, NY: McGraw-Hill.

    Google Scholar 

  • Rosen, B. (1969). Asymptotic normality in a coupon collector’s problem. Probability Theory and Related Fields, 13, 256–279.

    Google Scholar 

  • Rosen, B. (1970). On the coupon collector’s waiting time. The Annals of Mathematical Statistics, 41, 1952–1969.

    Article  Google Scholar 

  • Sen, P. K. (1979). Invariance principles for the coupon collector’s problem: A martingale approach. The Annals of Statistics, 7, 372–380.

    Article  Google Scholar 

  • Stadje, W. (1990). The collector’s problem with group drawings. Advances in Applied Probability, 22, 866–882.

    Article  Google Scholar 

  • Whiting, P., Dupuis, P., & Nuzman, C. (2004). Large deviation asymptotics for occupancy problems. Annals of Probability, 32, 2765–2818.

    Article  Google Scholar 

  • Yahav, J., & Bickel P.J. (1985). On estimating the number of unseen species—how many executions were there? Technical Report No. 43. UC Berkeley, CA: Department of Statistics.

  • Zhang, J. X., & Dupuis, P. (2008). Large-deviation approximations for general occupancy models. Combnatorics Probability and Computing, 17(3), 437–470.

    Google Scholar 

Download references

Acknowledgements

Tolga Tezcan’s research is supported by National Science Foundation Grants CMMI-0954126 and CMMI-1130266.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tolga Tezcan.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dobson, G., Tezcan, T. Optimal sampling strategies in the coupon collector’s problem with unknown population size. Ann Oper Res 233, 77–99 (2015). https://doi.org/10.1007/s10479-014-1563-0

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10479-014-1563-0

Keywords