Abstract
The importance of data science and big data analytics is growing rapidly as organizations gear up to leverage their information assets for competitive advantage. The flexibility offered by big data analytics empowers functional as well as firm-level performance. In the first phase of the study, we analyze the research on big data published in high-quality business management journals. The analysis was visualized using big data and text mining tools to understand the dominant themes and how they are connected. Subsequently, the studies were categorized by industry to understand the key use cases. Most of the existing research was found to focus on the consumer discretionary sector, followed by public administration. Methodologically, such exploration focuses mainly on social media analytics, text mining and machine learning applications for meeting objectives in marketing and supply chain management. However, these studies paid little attention to demonstrating the tools used for the analysis. To address this gap, this study also discusses the evolution, types and usage of big data tools. A brief overview of big data technologies, grouped by the services they enable, and some of their applications is presented. The study categorizes these tools into big data analysis platforms, databases and data warehouses, programming languages, search tools, and data aggregation and transfer tools. Finally, based on the review, future directions for exploration in big data are provided for academia and practice.










References
Agarwal, N., Chauhan, S., Kar, A. K., & Goyal, S. (2017). Role of human behaviour attributes in mobile crowd sensing: A systematic literature review. Digital Policy, Regulation and Governance, 19(2), 56–73.
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. Advances in Knowledge Discovery and Data Mining, 12(1), 307–328.
Aiyer, A. S., Bautin, M., Chen, G. J., Damania, P., Khemani, P., Muthukkaruppan, K., et al. (2012). Storage infrastructure behind facebook messages: Using HBase at scale. IEEE Data Engineering Bulletin, 35(2), 4–13.
Akter, S., Wamba, S. F., Gunasekaran, A., Dubey, R., & Childe, S. J. (2016). How to improve firm performance using big data analytics capability and business strategy alignment? International Journal of Production Economics, 182, 113–131.
Allen, M. (2016). Parametric polymorphism in the Go programming language. Retrieved from https://apps.cs.utexas.edu/tech_reports/reports/tr/TR-2231.pdf. 25 Jan 2017.
Allen, S. T., Jankowski, M., & Pathirana, P. (2015). Storm applied: Strategies for real-time event processing. New York, NY: Manning Publications Company.
Aloysius, J. A., Hoehle, H., & Venkatesh, V. (2016). Exploiting big data for customer and retailer benefits: A study of emerging mobile checkout scenarios. International Journal of Operations & Production Management, 36(4), 467–486.
Ammu, N., & Irfanuddin, M. (2013). Big data challenges. International Journal of Advanced Trends in Computer Science and Engineering, 2(1), 613–615.
Anders, S., Pyl, P. T., & Huber, W. (2015). HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics, 31(2), 166–169.
Ang, L. M., & Seng, K. P. (2016). Big sensor data applications in urban environments. Big Data Research, 4, 1–12.
Bafna, A., Wiens, J. (2015). Automated feature learning: Mining unstructured data for useful abstractions. In 2015 IEEE international conference on data mining (ICDM), (pp. 703–708). IEEE.
Bakshi, K. (2012). Considerations for big data: Architecture and approach. In 2012 IEEE on aerospace conference, (pp. 1–7). IEEE.
Beis, S., Papadopoulos, S., Kompatsiaris, Y. (2015). Benchmarking graph databases on the problem of community detection. In: N. Bassiliades et al. (Eds.), New trends in database and information systems II. Advances in intelligent systems and computing (Vol. 312). Cham: Springer.
Bello-Orgaz, G., Jung, J. J., & Camacho, D. (2016). Social big data: Recent achievements and new challenges. Information Fusion, 28, 45–59.
Bertsimas, D., Kallus, N., & Hussain, A. (2016). Inventory management in the era of big data. Production and Operations Management, 25(12), 2006–2009.
Bharathi, S. V. (2017). Prioritizing and ranking the big data information security risk spectrum. Global Journal of Flexible Systems Management, 18, 1–19. doi:10.1007/s40171-017-0157-5.
Bhardwaj, N. D. (2016). Comparative study of couchdb and mongodb–nosql document oriented databases. International Journal of Computer Applications, 136(3), 24–26.
Bhimani, A., & Willcocks, L. (2014). Digitisation, ‘big data’and the transformation of accounting information. Accounting and Business Research, 44(4), 469–490.
Birasnav, M., Mittal, R., & Loughlin, S. (2015). Linking leadership behaviors and information exchange to improve supply chain performance: A conceptual model. Global Journal of Flexible Systems Management, 16(2), 205–217.
Bock, S., & Isik, F. (2015). A new two-dimensional performance measure in purchase order sizing. International Journal of Production Research, 53(16), 4951–4962.
Bone, S. A., Fombelle, P. W., Ray, K. R., & Lemon, K. N. (2015). How customer participation in B2B peer-to-peer problem-solving communities influences the need for traditional customer service. Journal of Service Research, 18(1), 23–38.
Bradlow, E. T., Gangwar, M., Kopalle, P., & Voleti, S. (2017). The role of big data and predictive analytics in retailing. Journal of Retailing, 93(1), 79–95.
Brown, C. L., Cavusgil, S. T., & Lord, A. W. (2015). Country-risk measurement and analysis: A new conceptualization and managerial tool. International Business Review, 24(2), 246–265.
Calvard, T. S. (2016). Big data, organizational learning, and sensemaking: Theorizing interpretive challenges under conditions of dynamic complexity. Management Learning, 47(1), 65–82.
Cao, M., Chychyla, R., & Stewart, T. (2015). Big Data analytics in financial statement audits. Accounting Horizons, 29(2), 423–429.
Carlson, J. L. (2013). Redis in action. New York, NY: Manning Publications Company.
Cattuto, C., Quaggiotto, M., Panisson, A., &Averbuch, A. (2013). Time-varying social networks in a graph database: A Neo4j use case. In First international workshop on graph data management experiences and systems (p. 11). ACM.
Chae, B. K. (2015). Insights from hashtag# supplychain and Twitter analytics: Considering Twitter and Twitter data for supply chain practice and research. International Journal of Production Economics, 165, 247–259.
Chaffin, D., Heidl, R., Hollenbeck, J. R., Howe, M., Yu, A., Voorhees, C., et al. (2017). The promise and perils of wearable sensors in organizational research. Organizational Research Methods, 20(1), 3–31.
Chan, S. W., & Chong, M. W. (2017). Sentiment analysis in financial texts. Decision Support Systems, 94, 53–64.
Chang, R. M., Kauffman, R. J., & Kwon, Y. (2014). Understanding the paradigm shift to computational social science in the presence of big data. Decision Support Systems, 63, 67–80.
Chauhan, S., Agarwal, N., & Kar, A. K. (2016). Addressing big data challenges in smart cities: A systematic literature review. INFO, 18(4), 73–90.
Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36(4), 1165–1188.
Chen, D. Q., Preston, D. S., & Swink, M. (2015). How the use of big data analytics affects value creation in supply chain management. Journal of Management Information Systems, 32(4), 4–39.
Chong, A. Y. L., Li, B., Ngai, E. W., Ch’ng, E., & Lee, F. (2016). Predicting online product sales via online reviews, sentiments, and promotion strategies: A big data architecture and neural network approach. International Journal of Operations & Production Management, 36(4), 358–383.
Chowdary, B. V., & Muthineni, S. (2012). Selection of a flexible machining centre through a knowledge based expert system. Global Journal of Flexible Systems Management, 13(1), 3–10.
Cook, T. D. (2014). “Big data” in research on social policy. Journal of Policy Analysis and Management, 33(2), 544–547.
Culotta, A., & Cutler, J. (2016). Mining brand perceptions from twitter social networks. Marketing Science, 35(3), 343–362.
De Gennaro, M., Paffumi, E., & Martini, G. (2016). Big data for supporting low-carbon road transport policies in europe: Applications, challenges and opportunities. Big Data Research, 6, 11–25.
Decker, P. T. (2014). Presidential address: False choices, policy framing, and the promise of “Big Data”. Journal of Policy Analysis and Management, 33(2), 252–262.
Demirkan, H., & Delen, D. (2013). Leveraging the capabilities of service-oriented decision support systems: Putting analytics and big data in cloud. Decision Support Systems, 55(1), 412–421.
Dijcks, J. P. (2012). Oracle: Big data for the enterprise. Oracle white paper. Retrieved from http://www.oracle.com/us/products/database/big-data-for-enterprise-519135.pdf. 2 Dec 2016.
Dolnicar, S., & Ring, A. (2014). Tourism marketing research: Past, present and future. Annals of Tourism Research, 47, 31–47.
Donnelly, C., Simmons, G., Armstrong, G., & Fearne, A. (2015). Digital loyalty card ‘big data’and small business marketing: Formal versus informal or complementary? International Small Business Journal, 33(4), 422–442.
Du, R. Y., Hu, Y., & Damangir, S. (2015). Leveraging trends in online searches for product features in market response modeling. Journal of Marketing, 79(1), 29–43.
Durahim, A. O., & Coşkun, M. (2015). # iamhappybecause: gross national happiness through Twitter analysis and big data. Technological Forecasting and Social Change, 99, 92–105.
Dutta, D., & Bose, I. (2015). Managing a big data project: The case of ramco cements limited. International Journal of Production Economics, 165, 293–306.
Edelman, A. (2015, May). Julia: A fresh approach to parallel programming. In 2015 IEEE international conference on parallel and distributed processing symposium (IPDPS), (pp. 517-517). IEEE.
Edwards, D., Cheng, M., Wong, I. A., Zhang, J., & Wu, Q. (2016). Ambassadors of knowledge sharing: Co-produced travel information through tourist-local social media exchange. International Journal of Contemporary Hospitality Management. doi:10.1108/IJCHM-10-2015-0607.
Edwards, D. J., Pärn, E., Love, P. E., & El-Gohary, H. (2017). Research note: Machinery, manumission, and economic machinations. Journal of Business Research, 70, 391–394.
Ellen, I. G., Horn, K. M., & Schwartz, A. E. (2016). Why don’t housing choice voucher recipients live near better schools? insights from big data. Journal of Policy Analysis and Management, 35(4), 884–905.
Erevelles, S., Fukawa, N., & Swayne, L. (2016). Big data consumer analytics and the transformation of marketing. Journal of Business Research, 69(2), 897–904.
Fan, J., Han, F., & Liu, H. (2014). Challenges of big data analysis. National Science Review, 1(2), 293–314.
Flyverbom, M., Madsen, A. K., & Rasche, A. (2017). Big data as governmentality in international development: Digital traces, algorithms, and altered visibilities. The Information Society, 33(1), 35–42.
France, S. L., & Ghose, S. (2016). An analysis and visualization methodology for identifying and testing market structure. Marketing Science, 35(1), 182–197.
Franke, C., Morin, S., Chebotko, A., Abraham, J., & Brazier, P. (2011). Distributed semantic web data management in HBase and MySQL cluster. In 2011 IEEE international conference on cloud computing (CLOUD), (pp. 105–112). IEEE.
Fulgoni, G. (2013). Big data: Friend or foe of digital advertising? Five ways marketers should use digital big data to their advantage. Journal of Advertising Research, 53(4), 372–376.
Graham, G., & Mehmood, R. (2014). The strategic prototype “crime-sourcing” and the science/science fiction behind it. Technological Forecasting and Social Change, 84, 86–92.
Grainger, T., Potter, T., & Seeley, Y. (2014). Solr in action. Cherry Hill: Manning Publications.
Green, K. C., & Armstrong, J. S. (2015). Simple versus complex forecasting: The evidence. Journal of Business Research, 68(8), 1678–1685.
Greenberg, G. (2013). Small firms, big patents? Estimating patent value using data on Israeli start-ups’ financing rounds. European Management Review, 10(4), 183–196.
Gunter, U., & Önder, I. (2016). Forecasting city arrivals with Google analytics. Annals of Tourism Research, 61, 199–212.
Hahn, G. J., & Packowski, J. (2015). A perspective on applications of in-memory analytics in supply chain management. Decision Support Systems, 76, 45–52.
Hanke, M., Halchenko, Y. O., Sederberg, P. B., Olivetti, E., Fründ, I., Rieger, J. W., et al. (2009). PyMVPA: a unifying approach to the analysis of neuroscientific data. Frontiers in Neuroinformatics. doi:10.3389/neuro.11.003.2009.
Hansen, H. K., & Flyverbom, M. (2015). The politics of transparency and the calibration of knowledge in the digital age. Organization, 22(6), 872–889.
Hartmann, P. M., Hartmann, P. M., Zaki, M., Zaki, M., Feldmann, N., Feldmann, N., et al. (2016). Capturing value from big data–a taxonomy of data-driven business models used by start-up firms. International Journal of Operations & Production Management, 36(10), 1382–1406.
Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Khan, S. U. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47, 98–115.
Hazen, B. T., Boone, C. A., Ezell, J. D., & Jones-Farmer, L. A. (2014). Data quality for data science, predictive analytics, and big data in supply chain management: An introduction to the problem and suggestions for research and applications. International Journal of Production Economics, 154, 72–80.
He, J., Liu, H., & Xiong, H. (2016). SocoTraveler: Travel-package recommendations leveraging social influence of different relationship types. Information & Management, 53(8), 934–950.
Höfer, C. N., & Karagiannis, G. (2011). Cloud computing services: Taxonomy and comparison. Journal of Internet Services and Applications, 2(2), 81–94.
Huang, T., Lan, L., Fang, X., An, P., Min, J., & Wang, F. (2015). Promises and challenges of big data computing in health sciences. Big Data Research, 2(1), 2–11.
Huang, T., & Van Mieghem, J. A. (2014). Clickstream data and inventory management: Model and empirical analysis. Production and Operations Management, 23(3), 333–347.
Hussain, M., Al-Mourad, M., Mathew, S., & Hussein, A. (2017). Mining educational data for academic accreditation: Aligning assessment with outcomes. Global Journal of Flexible Systems Management, 18(1), 51–60.
Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5(3), 299–314.
Incardona, M. F., Bourenkov, G. P., Levik, K., Pieritz, R. A., Popov, A. N., & Svensson, O. (2009). EDNA: A framework for plugin-based applications applied to X-ray experiment online data analysis. Journal of Synchrotron Radiation, 16(6), 872–879.
Jachs, B., Blanco, M. J., Grantham-Hill, S., & Soto, D. (2015). On the independence of visual awareness and metacognition: A signal detection theoretic analysis. Journal of Experimental Psychology: Human Perception and Performance, 41(2), 269.
Jarmin, R. S., & O’Hara, A. B. (2016). Big data and the transformation of public policy analysis. Journal of Policy Analysis and Management, 35(3), 715–721.
Jin, J., Liu, Y., Ji, P., & Liu, H. (2016). Understanding big consumer opinion data for market-driven product design. International Journal of Production Research, 54(10), 3019–3041.
Jin, X., Wah, B. W., Cheng, X., & Wang, Y. (2015). Significance and challenges of big data research. Big Data Research, 2(2), 59–64.
Jordan, G. (2014). Querying. In Practical Neo4j (pp. 39–48). Apress. doi:10.1007/978-1-4842-0022-3.
Joseph, N., Kar, A. K., Ilavarasan, V., & Ganesh, S. (2017). Review of discussions on internet of things (IoT): insights from twitter analytics. Journal of Global Information Management, 25(2), 38–51.
Jun, C. N., & Chung, C. J. (2016). Big data analysis of local government 3.0: Focusing on Gyeongsangbuk-do in Korea. Technological Forecasting and Social Change, 110, 3–12.
Jun, S. P., Park, D. H., & Yeom, J. (2014). The possibility of using search traffic information to explore consumer product attitudes and forecast consumer preference. Technological Forecasting and Social Change, 86, 237–253.
Kaisler, S., Armour, F., Espinosa, J. A., & Money, W. (2013). Big data: Issues and challenges moving forward. In 2013 46th Hawaii international conference on system sciences (HICSS), (pp. 995–1004). IEEE.
Kallinikos, J., & Constantiou, I. D. (2015). Big data revisited: A rejoinder. Journal of Information Technology, 30(1), 70–74.
Kar, A. K., & Rakshit, A. (2015). Flexible pricing models for cloud computing based on group decision making under consensus. Global Journal of Flexible Systems Management, 16(2), 1–14.
Khaitan, S. K., & McCalley, J. D. (2015). PARAGON: An approach for parallelization of power system contingency analysis using Go programming language. International Transactions on Electrical Energy Systems, 25(11), 2909–2920.
Khetrapal, A., Ganesh, V. (2006). HBase and Hypertable for large scale distributed storage systems. Dept. of Computer Science, Purdue University. Retrieved from http://cloud.pubs.dbs.uni-leipzig.de/sites/cloud.pubs.dbs.uni-leipzig.de/files/Khetrapal2008HBaseandHypertableforlargescaledistributedstorage.pdf. 25 Jan 2017.
Kim, J., Lee, Y. O., & Park, H. W. (2016). Delineating the complex use of a political podcast in South Korea by hybrid web indicators: The case of the Nakkomsu Twitter network. Technological Forecasting and Social Change, 110, 42–50.
Kitchin, R. (2014). Big data, new epistemologies and paradigm shifts. Big Data & society, 1(1), 1–12.
Kotsiantis, S., & Kanellopoulos, D. (2006). Association rules mining: A recent overview. GESTS International Transactions on Computer Science and Engineering, 32(1), 71–82.
Krahel, J. P., & Titera, W. R. (2015). Consequences of big data and formalization on accounting and auditing standards. Accounting Horizons, 29(2), 409–422.
Kude, T., Kude, T., Hoehle, H., Hoehle, H., Sykes, T. A., & Sykes, T. A. (2017). Big data breaches and customer compensation strategies: Personality traits and social influence as antecedents of perceived compensation. International Journal of Operations & Production Management, 37(1), 56–74.
Kumar, M., Graham, G., Hennelly, P., & Srai, J. (2016). How will smart city production systems transform supply chain design: A product-level investigation. International Journal of Production Research, 54(23), 7181–7192.
Kumar, B. S., & Rukmani, K. V. (2010). Implementation of web usage mining using APRIORI and FP growth algorithms. International Journal of Advanced networking and Applications, 1(06), 400–404.
Kv, R. Satish, & Kavya, N. P. (2016). Trend analysis of e-commerce data using Hadoop ecosystem. International Journal of Computer Applications, 147(6), 1–5.
Kwon, T. H., Kwak, J. H., & Kim, K. (2015). A study on the establishment of policies for the activation of a big data industry and prioritization of policies: Lessons from Korea. Technological Forecasting and Social Change, 96, 144–152.
Lakhe, B. (2016). Implementing SQOOP and Flume-based Data Transfers. In Practical Hadoop Migration. Retrieved from https://link.springer.com/chapter/10.1007/978-1-4842-1287-5_8. 25 Jan 2017.
Lakhiwal, A. Kar, A.K. (2016). Insights from Twitter analytics: Modeling social media personality dimensions and impact of breakthrough events. Lecture Notes in Computer Science, vol. 9844, pp 533–544
Lakshman, A., Malik, P. (2009). Cassandra: structured storage system on a p2p network. In Proceedings of the 28th ACM symposium on principles of distributed computing (pp. 5–5). ACM.
Lakshman, A., & Malik, P. (2010). Cassandra: A decentralized structured storage system. ACM SIGOPS Operating Systems Review, 44(2), 35–40.
Lam, S. K., Sleep, S., Hennig-Thurau, T., Sridhar, S., & Saboo, A. R. (2017). Leveraging frontline employees’ small data and firm-level big data in frontline management an absorptive capacity perspective. Journal of Service Research, 20(1), 12–28.
Lane, J. (2016). Big data for public policy: The quadruple helix. Journal of Policy Analysis and Management, 35(3), 708–715.
Lane, J., & Decker, P. T. (2016). Editors’ overview of special section on big data and public policy. Journal of Policy Analysis and Management, 35(4), 881–883.
LaValle, S., Lesser, E., Shockley, R., Hopkins, M. S., & Kruschwitz, N. (2011). Big data, analytics and the path from insights to value. MIT Sloan Management Review, 52(2), 21.
Lavertu, S. (2015). We all need help:“Big data” and the mismeasure of public administration. Public administration review. Retrieved from http://onlinelibrary.wiley.com/doi/10.1111/puar.12436/pdf. 25 Jan 2017.
Lee, C. K. H. (2017). A GA-based optimisation model for big data analytics supporting anticipatory shipping in Retail 4.0. International Journal of Production Research, 55(2), 593–605.
Lee, W. S., Han, E. J., & Sohn, S. Y. (2015). Predicting the pattern of technology convergence using big-data technology on large-scale triadic patents. Technological Forecasting and Social Change, 100, 317–329.
Lennon, J. (2009). Introduction to CouchDB Views. In Beginning CouchDB (pp. 107–123). doi: 10.1007/978-1-4302-7236-6_7.
LexisNexis Risk Solutions, LexiNexis. (2012). HPCC systems for cyber security analytics. New York, NY: LexisNexis Risk Solutions, LexiNexis. doi:10.1007/978-3-319-44550-2_12.
Li, C. (2010). Transforming relational database into HBase: A case study. In 2010 IEEE international conference on software engineering and service sciences (pp. 683–687). IEEE.
Li, B., Ch’ng, E., & Chong, A. Y. L. (2016a). Predicting online e-marketplace sales performances: A big data approach. Computers & Industrial Engineering, 101, 565–571.
Li, X., Jiang, T., & Ruiz, R. (2016b). Heuristics for periodical batch job scheduling in a mapreduce computing framework. Information Sciences, 326, 119–133.
Li, J., Li, X., & Zhu, B. (2016c). User opinion classification in social media: A global consistency maximization approach. Information & Management, 53(8), 987–996.
Li, X., Pan, B., Law, R., & Huang, X. (2017). Forecasting tourism demand with composite search index. Tourism Management, 59, 57–66.
Li, J. Q., Rusmevichientong, P., Simester, D., Tsitsiklis, J. N., & Zoumpoulis, S. I. (2015a). The value of field experiments. Management Science, 61(7), 1722–1740.
Li, J., Tao, F., Cheng, Y., & Zhao, L. (2015b). Big data in product lifecycle management. The International Journal of Advanced Manufacturing Technology, 81(1–4), 667–684.
Liu, Y., Teichert, T., Rossi, M., Li, H., & Hu, F. (2017). Big data for big insights: Investigating language-specific drivers of hotel satisfaction with 412,784 user-generated reviews. Tourism Management, 59, 554–563.
Liu, X., & Ye, Q. (2016). The different impacts of news-driven and self-initiated search volume on stock prices. Information & Management, 53(8), 997–1005.
Loebbecke, C., & Picot, A. (2015). Reflections on societal and business model transformation arising from digitization and big data analytics: A research agenda. The Journal of Strategic Information Systems, 24(3), 149–157.
Lubin, M., & Dunning, I. (2015). Computing in operations research using Julia. INFORMS Journal on Computing, 27(2), 238–248.
Lukoianova, T., & Rubin, V. L. (2014). Veracity roadmap: Is big data objective, truthful and credible? Advances in Classification Research Online, 24(1), 4–15.
Lux, M., Chatzichristofis, S. A. (2008). Lire: lucene image retrieval: An extensible java cbir library. In Proceedings of the 16th ACM international conference on multimedia (pp. 1085–1088). ACM.
Madan, P., & Saxena, A. (2014). Review: Graph databases. International Journal, 4(5), 195–200.
Maklan, S., Peppard, J., & Klaus, P. (2015). Show me the money: Improving our understanding of how organizations generate return from technology-led marketing change. European Journal of Marketing, 49(3/4), 561–595.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A. H. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute. Retrieved from http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation. 25 Jan 2017.
Mariani, M. M., Di Felice, M., & Mura, M. (2016). Facebook as a destination marketing tool: Evidence from Italian regional destination management organizations. Tourism Management, 54, 321–343.
Martin, K. (2016). Data aggregators, consumer data, and responsibility online: Who is tracking consumers online and should they stop? The Information Society, 32(1), 51–63.
Martin, K. D., Borah, A., & Palmatier, R. W. (2017). Data privacy: Effects on customer and firm performance. Journal of Marketing, 81(1), 36–58.
Martin, K. D., & Murphy, P. E. (2017). The role of data privacy in marketing. Journal of the Academy of Marketing Science, 45, 135–155.
Matthias, O., Fouweather, I., Gregory, I., & Vernon, A. (2017). Making sense of big data–can it transform operations management? International Journal of Operations & Production Management, 37(1), 37–55.
McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D. J., & Barton, D. (2012). Big data: The management revolution. Harvard Business Review, 90(10), 61–67.
McCandless, M., Hatcher, E., & Gospodnetic, O. (2010). Lucene in action: covers Apache Lucene 3.0. New York, NY: Manning Publications Company.
Mehmood, R., Meriton, R., Graham, G., & Kumar, M. (2017). Exploring the influence of big data on city transport operations: a Markovian approach. International Journal of Operations & Production Management, 37(1), 75–104.
Mergel, I., Rethemeyer, R. K., & Isett, K. (2016). Big data in public affairs. Public Administration Review, 76(6), 928–937.
Middleton, A. M. (2011). HPCC Systems: Introduction to HPCC (High Performance Computer Cluster). White paper, LexisNexis Risk Solutions. Retrieved from http://cdn.hpccsystems.com/whitepapers/wp_introduction_HPCC.pdfJanuary 25, 2017.
Milas, G., & Mlačić, B. (2007). Brand personality and human personality: Findings from ratings of familiar Croatian brands. Journal of Business Research, 60(6), 620–626.
Moeyersoms, J., & Martens, D. (2015). Including high-cardinality attributes in predictive models: A case study in churn prediction in the energy sector. Decision Support Systems, 72, 72–81.
Montag, D. (2013). Understanding neo4j scalability. White Paper, Neotechnology. Retrieved from http://info.neotechnology.com/rs/neotechnology/images/Understanding%20Neo4j%20Scalability(2).pdf. Accessed on 25 Jan 2017.
Newman, M. E. (2012). Communities, modules and large-scale structure in networks. Nature Physics, 8(1), 25–31.
Njuguna, C., & McSharry, P. (2017). Constructing spatiotemporal poverty indices from big data. Journal of Business Research, 70, 318–327.
Nodarakis, N., Sioutas, S., Tsakalidis, A., Tzimas, G. (2016). Using Hadoop for Large Scale Analysis on Twitter: A Technical Report. Retrieved from https://arxiv.org/abs/1602.01248. Accessed on 25 Jan 2017.
Nour, S., Sumita, U., & Yoshii, J. (2015). Development of enhanced marketing flexibility by optimally allocating sales campaign days for maximizing total expected sales. Global Journal of Flexible Systems Management, 16(1), 87–95.
Nudurupati, S. S., Tebboune, S., & Hardman, J. (2016). Contemporary performance measurement and management (PMM) in digital economies. Production Planning & Control, 27(3), 226–235.
Öberg, C., & Graham, G. (2016). How smart cities will change supply chain management: A technical viewpoint. Production Planning & Control, 27(6), 529–538.
Olston, C., Reed, B., Srivastava, U., Kumar, R., & Tomkins, A. (2008). Pig latin: A not-so-foreign language for data processing. In Proceedings of the 2008 ACM SIGMOD international conference on management of data (pp. 1099–1110). ACM.
O’Malley, M. (2014). Doing what works: Governing in the age of big data. Public Administration Review, 74(5), 555–556.
Palanisamy, R., & Foshay, N. (2013). Impact of user’s internal flexibility and participation on usage and information systems flexibility. Global Journal of Flexible Systems Management, 14(4), 195–209.
Pigni, F., Piccoli, G., & Watson, R. (2016). Digital data streams. California Management Review, 58(3), 5–25.
Pournarakis, D. E., Sotiropoulos, D. N., & Giaglis, G. M. (2017). A computational model for mining consumer perceptions in social media. Decision Support Systems, 93, 98–110.
Pousttchi, K., & Hufenbach, Y. (2014). Engineering the value network of the customer interface and marketing in the data-rich retail environment. International Journal of Electronic Commerce, 18(4), 17–42.
Prasad, P. D., Vivekanandan, T., & Srinivasan, A. (2015). A Methodology for WebLog Data analysis using HadoopMapReduce and PIG. i-manager’s Journal on Cloud Computing, 3(1), 13.
Priya, M., & Ranjith Kumar, P. (2015). A novel intelligent approach for predicting atherosclerotic individuals from big data for healthcare. International Journal of Production Research, 53(24), 7517–7532.
Qi, J., Zhang, Z., Jeon, S., & Zhou, Y. (2016). Mining customer requirements from online reviews: A product improvement perspective. Information & Management, 53(8), 951–963.
Rabkin, A. and Katz, R. (2010) Chukwa: A system for reliable large-scale log collection. In USENIX conference on large installation system administration, pp. 1–15.
Raghupathi, W., & Raghupathi, V. (2014). Big data analytics in healthcare: Promise and potential. Health Information Science and Systems, 2(1), 1.
Rahman, M. N., Esmailpour, A., & Zhao, J. (2016a). Machine learning with big data an efficient electricity generation forecasting system. Big Data Research, 5, 9–15.
Rahman, M. N. A., Seyal, A. H., Tajuddin, S. T., & Azmi, H. M. (2016b). Feasibility study of MongoDB and radio frequency identification technology in asset tracking system. World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering, 10(5), 745–751.
Raun, J., Ahas, R., & Tiru, M. (2016). Measuring tourism destinations using mobile tracking data. Tourism Management, 57, 202–212.
Ringel, D. M., & Skiera, B. (2016). Visualizing asymmetric competition among more than 1,000 products using big search data. Marketing Science, 35(3), 511–534.
Ross, J. W., Beath, C. M., & Quaadgras, A. (2013). You may not need big data after all. Harvard Business Review, 91(12), 90.
Rust, R. T., & Huang, M. H. (2014). The service revolution and the transformation of marketing science. Marketing Science, 33(2), 206–221.
Ruthotto, L., Treister, E., & Haber, E. (2016). jInv–a flexible Julia package for PDE parameter estimation. Retrieved from https://arxiv.org/abs/1606.07399. 25 Jan 2017.
Rychly, M. (2014, July). Scheduling decisions in stream processing on heterogeneous clusters. In 2014 eighth international conference on complex, intelligent and software intensive systems (CISIS), (pp. 614–619). IEEE.
Salehan, M., & Kim, D. J. (2016). Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics. Decision Support Systems, 81, 30–40.
Sambamurthy, V., Bharadwaj, A., & Grover, V. (2003). Shaping agility through digital options: Reconceptualizing the role of information technology in contemporary firms. MIS Quarterly, 27(2), 237–263.
Sanders, N. R. (2016). How to use big data to drive your supply chain. California Management Review, 58(3), 26–48.
Sarnovsky, M., & Ulbrik, Z. (2013). Cloud-based clustering of text documents using the GHSOM algorithm on the GridGain platform. In 2013 IEEE 8th international symposium on applied computational intelligence and informatics (SACI), (pp. 309–313). IEEE.
Scanfeld, D., Scanfeld, V., & Larson, E. L. (2010). Dissemination of health information through social networks: Twitter and antibiotics. American Journal of Infection Control, 38(3), 182–188.
Schmidt, D., Chen, W. C., Matheson, M. A., & Ostrouchov, G. (2016). Programming with BIG data in R: Scaling analytics from one to thousands of nodes. Big Data Research, 8, 1–11.
Schneider, M. J., & Gupta, S. (2016). Forecasting sales of new and existing products using consumer reviews: A random projections approach. International Journal of Forecasting, 32(2), 243–256.
Seddon, J. J., & Currie, W. L. (2017). A model for unpacking big data analytics in high-frequency trading. Journal of Business Research, 70, 300–307.
Shah, N., Irani, Z., & Sharif, A. M. (2017). Big data in an HR context: Exploring organizational change readiness, employee attitudes and behaviors. Journal of Business Research, 70, 366–378.
Shang, W., Adams, B., & Hassan, A. E. (2012). Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report. Journal of Systems and Software, 85(10), 2195–2204.
Singh, A. (2013). Social media and corporate agility. Global Journal of Flexible Systems Management, 14(4), 255–260.
Singh, A. N., Picot, A., Kranz, J., Gupta, M. P., & Ojha, A. (2013). Information security management (ISM) practices: Lessons from select cases from India and Germany. Global Journal of Flexible Systems Management, 14(4), 225–239.
Ślezak, D., & Eastwood, V. (2009). Data warehouse technology by Infobright. In Proceedings of the 2009 ACM SIGMOD international conference on management of data (pp. 841–846). ACM.
Smith, D. S., Li, X., Arlinghaus, L. R., Yankeelov, T. E., & Welch, E. B. (2015). DCEMRI.jl: A fast, validated, open source toolkit for dynamic contrast enhanced MRI analysis. Retrieved from https://peerj.com/articles/909/. 25 Jan 2017.
Srai, J. S., Kumar, M., Graham, G., Phillips, W., Tooze, J., Ford, S., et al. (2016). Distributed manufacturing: Scope, challenges and opportunities. International Journal of Production Research, 54(23), 6917–6935.
Sun, E. W., Chen, Y. T., & Yu, M. T. (2015). Generalized optimal wavelet decomposing algorithm for big financial data. International Journal of Production Economics, 165, 194–214.
Sung, T. K. (2015). The creative economy in global competition. Technological Forecasting and Social Change, 96, 89–91.
Sushil. (2017). Multi-criteria valuation of flexibility initiatives using integrated TISM–IRP with a big data framework. Production Planning & Control. doi:10.1080/09537287.2017.1336794.
Suthaharan, S. (2014). Big data classification: Problems and challenges in network intrusion prediction with machine learning. ACM SIGMETRICS Performance Evaluation Review, 41(4), 70–73.
Tallon, P. P., Ramirez, R. V., & Short, J. E. (2013). The information artifact in IT governance: Toward a theory of information governance. Journal of Management Information Systems, 30(3), 141–178.
Tambe, P. (2014). Big data investment, skills, and firm value. Management Science, 60(6), 1452–1469.
Tan, K. H., Zhan, Y., Ji, G., Ye, F., & Chang, C. (2015). Harvesting big data to enhance supply chain innovation capabilities: An analytic infrastructure based on deduction graph. International Journal of Production Economics, 165, 223–233.
Tayal, A., & Singh, S. P. (2016). Integrating big data analytic and hybrid firefly-chaotic simulated annealing approach for facility layout problem. Annals of Operations Research. doi:10.1007/s10479-016-2237-x.
Thackeray, R., Neiger, B. L., Hanson, C. L., & McKenzie, J. F. (2008). Enhancing promotional strategies within social marketing programs: Use of Web 2.0 social media. Health Promotion Practice, 9(4), 338–343.
Thusoo, A., Sarma, J. S., Jain, N., Shao, Z., Chakka, P., Anthony, S., et al. (2009). Hive: A warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment, 2(2), 1626–1629.
Tillmanns, S., Ter Hofstede, F., Krafft, M., & Goetz, O. (2017). How to separate the wheat from the chaff: Improved variable selection for new customer acquisition. Journal of Marketing, 81(2), 99–113.
Trusov, M., Ma, L., & Jamal, Z. (2016). Crumbs of the cookie: User profiling in customer-base analysis and behavioral targeting. Marketing Science, 35(3), 405–426.
Wang, Y., & Hajli, N. (2017). Exploring the path to big data analytics success in healthcare. Journal of Business Research, 70, 287–299.
Wang, P., Liu, B., & Hong, T. (2016). Electric load forecasting with recency effect: A big data approach. International Journal of Forecasting, 32(3), 585–597.
Wang, G., & Tang, J. (2012, August). The NoSQL principles and basic application of Cassandra model. In 2012 international conference on computer science & service system (CSSS), (pp. 1332–1335). IEEE.
Wang, J., & Zhang, J. (2016). Big data analytics for forecasting cycle time in semiconductor wafer fabrication system. International Journal of Production Research, 54(23), 7231–7244.
Warren, J. D., Jr., Moffitt, K. C., & Byrnes, P. (2015). How big data will change accounting. Accounting Horizons, 29(2), 397–407.
Wedel, M., & Kannan, P. K. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97–121.
Weinhardt, C., Anandasivam, A., Blau, B., Borissov, N., Meinl, T., Michalk, W., et al. (2009). Cloud computing–a classification, business models, and research directions. Business & Information Systems Engineering, 1(5), 391–399.
Wei-ping, Z., Ming-Xin, L. I., & Huan, C. (2011). Using MongoDB to implement textbook management system instead of MySQL. In 2011 IEEE 3rd international conference on communication software and networks (ICCSN), (pp. 303–305). IEEE.
Winkler, M., Abrahams, A. S., Gruss, R., & Ehsani, J. P. (2016). Toy safety surveillance from online reviews. Decision Support Systems, 90, 23–32.
Wu, J., Li, H., Cheng, S., & Lin, Z. (2016). The promising future of healthcare services: When big data analytics meets wearable technology. Information & Management, 53(8), 1020–1033.
Xiang, Z., Schwartz, Z., Gerdes, J. H., & Uysal, M. (2015). What can big data and text analytics tell us about hotel guest experience and satisfaction? International Journal of Hospitality Management, 44, 120–130.
Xie, K., Wu, Y., Xiao, J., & Hu, Q. (2016). Value co-creation between firms and customers: The role of big data-based cooperative assets. Information & Management, 53(8), 1034–1048.
Xu, Z., Frankwick, G. L., & Ramirez, E. (2016). Effects of big data analytics and traditional marketing analytics on new product success: A knowledge fusion perspective. Journal of Business Research, 69(5), 1562–1566.
Xudong, X., & Rui, G. (2016). Research on storage and processing of MongoDB for laser point cloud under distribution. In 3rd international conference on materials engineering, manufacturing technology and control (pp. 1559–1564). Atlantis Press.
Yang, Y., Pan, B., & Song, H. (2014). Predicting hotel demand using destination marketing organization’s web traffic data. Journal of Travel Research, 53(4), 433–447.
Yin, S., & Kaynak, O. (2015). Big data for modern industry: Challenges and trends [Point of View]. Proceedings of the IEEE, 103(2), 143–146.
Yoon, K., Hoogduin, L., & Zhang, L. (2015). Big data as complementary audit evidence. Accounting Horizons, 29(2), 431–438.
Zaki, M. J. (2000). Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12(3), 372–390.
Zaslavsky, A., Perera, C., & Georgakopoulos, D. (2013). Sensing as a service and big data. Retrieved from https://arxiv.org/abs/1301.0159. 23 Dec 2016.
Zhang, L., Lan, C., Qi, F., & Wu, P. (2017). Development pattern, classification and evaluation of the tourism academic community in China in the last ten years: From the perspective of big data of articles of tourism academic journals. Tourism Management, 58, 235–244.
Zhang, J., Yang, X., & Appelbaum, D. (2015). Toward effective big data analysis in continuous auditing. Accounting Horizons, 29(2), 469–476.
Zhang, Y., Zhang, G., Chen, H., Porter, A. L., Zhu, D., & Lu, J. (2016). Topic analysis and forecasting for science, technology and innovation: Methodology with a case study focusing on big data research. Technological Forecasting and Social Change, 105, 179–191.
Zhong, R. Y., Huang, G. Q., Lan, S., Dai, Q. Y., Chen, X., & Zhang, T. (2015). A big data approach for logistics trajectory discovery from RFID-enabled production data. International Journal of Production Economics, 165, 260–272.
Zhou, Z., Dou, W., Jia, G., Hu, C., Xu, X., Wu, X., et al. (2016). A method for real-time trajectory monitoring to improve taxi service using GPS big data. Information & Management, 53(8), 964–977.
Zikopoulos, P., & Eaton, C. (2011). Understanding big data: Analytics for enterprise class Hadoop and streaming data. New York, NY: McGraw-Hill Osborne Media.
Zuboff, S. (2015). Big other: Surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 30(1), 75–89.
Additional information
The paper was checked with the plagiarism detection software Turnitin, which reported only 13% similarity with other publications.
About this article
Cite this article
Grover, P., Kar, A.K. Big Data Analytics: A Review on Theoretical Contributions and Tools Used in Literature. Glob J Flex Syst Manag 18, 203–229 (2017). https://doi.org/10.1007/s40171-017-0159-3
DOI: https://doi.org/10.1007/s40171-017-0159-3