Backers Beware: Characteristics and Detection of Fraudulent Crowdfunding Campaigns
Abstract
1. Introduction
2. Related Work
2.1. Crowdfunding
2.2. Deception, Fraud and Linguistic Cues
2.3. Detecting Fraudulent Crowdfunding Projects
3. Methodology
3.1. Dataset
3.2. Features
3.2.1. Generic Project-Based Features
3.2.2. Project Creator’s Features
3.2.3. Linguistic Features
3.2.4. Named Entity Recognition (NER)
3.3. Performance Metrics
- (Overall) accuracy: the ratio of projects correctly classified as scam or non-scam to the total number of projects in our dataset. We use this metric to measure how well a classifier performs on the whole dataset.
- AUC: the area under the Receiver Operating Characteristic (ROC) curve. The ROC curve plots the True Positive Ratio (TPR) against the False Positive Ratio (FPR) at various classification thresholds and thereby summarizes the performance of a classification model. AUC ranges from 0 to 1; a model whose predictions are 100% correct has an AUC of 1.
- Precision: the ratio of True Positives to the sum of True Positives and False Positives, i.e., the percentage of campaigns attributed to a given class (scam) that actually belong to it. True Positives are the correctly classified scams, False Positives are the non-scam projects falsely classified as scams, and False Negatives are the scam projects falsely labeled as non-scam.
- Recall: the ratio of True Positives to the sum of True Positives and False Negatives, i.e., the percentage of scam projects (in our dataset) that are correctly identified.
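The four metrics above can be illustrated with a minimal Python sketch. It is not the evaluation code used in this study; the function names are hypothetical, label 1 is assumed to denote a scam (the positive class), and AUC is computed through its standard rank interpretation (the probability that a randomly chosen scam receives a higher score than a randomly chosen non-scam, with ties counted as one half) rather than by integrating the ROC curve explicitly.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), treating label 1 (scam) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Correctly classified projects over all projects."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); defined as 0.0 when nothing is predicted as scam."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); defined as 0.0 when there are no scams at all."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of scam scores against non-scam scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one scam and one non-scam project")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and classifier scores, invented purely for illustration.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
    predicted = [1 if s >= 0.5 else 0 for s in scores]

    tp, fp, fn, tn = confusion_counts(labels, predicted)
    print("accuracy :", accuracy(tp, fp, fn, tn))
    print("precision:", precision(tp, fp))
    print("recall   :", recall(tp, fn))
    print("AUC      :", auc(labels, scores))
```

Unlike precision, recall, and accuracy, AUC is computed from the raw scores rather than from thresholded predictions, which is why it is reported alongside the threshold-dependent metrics.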
4. Results
4.1. Distinguishing Characteristics of Scams
4.1.1. Creator-Related Features
4.1.2. Features from Campaign Section
4.1.3. Features from Updates Section
4.1.4. Features from Comments Section
4.2. Detecting Scams: Performance Evaluation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
The following abbreviations are used in this manuscript:
Abbreviation | Definition |
---|---|
FinCEN | Financial Crimes Enforcement Network |
FTC | Federal Trade Commission |
VC | Venture Capital |
NER | Named Entity Recognition |
ROC | Receiver Operating Characteristic |
AUC | Area Under the Receiver Operating Characteristic Curve |
TPR | True Positive Ratio |
FPR | False Positive Ratio |
CDF | Cumulative Distribution Function |
SVM | Support Vector Machine |
KNN | k-Nearest Neighbor |
VIF | Variance Inflation Factor |
Quantity |
---|
1. (Total # of) words, adverbs, clauses, verbs, phrases, characters, punctuation, nouns, sentences, adjectives, noun phrases (a noun phrase is a phrase consisting of a noun, its modifiers, and determiners) |
Complexity |
2. Average # of clauses: total # of clauses/total # of sentences |
3. Average sentence length: total # of words/total # of sentences |
4. Average word length: total # of characters/total # of words |
5. Pausality: total # of punctuation marks/total # of sentences |
Non-immediacy |
6. Self reference: total # of first person singular pronouns |
7. Group reference: total # of first person plural pronouns |
Uncertainty |
8. Modal verbs: total # of modal verbs (verbs usually used with another verb to express ideas such as possibility, necessity, and permission) |
9. Other reference: total # of second and third person pronouns |
Expressiveness |
10. Emotiveness: (total # of adjectives + total # of adverbs)/(total # of nouns + total # of verbs) |
Diversity |
11. Lexical diversity: percentage of unique words (total # of different words/total # of words) |
Redundancy |
12. Redundancy: total # of function words/total # of sentences |
Informality |
13. Typo ratio: total # of misspelled words/total # of words |
Relativity |
14. Time: total # of time-related words, e.g., hour, o’clock, evening, yesterday, etc. |
15. Past, present and future tense verbs: total # of past, present and future tense verbs |
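Several of the ratio-based cues in the table above can be approximated with a short Python sketch. This is an illustration under stated assumptions, not the feature extractor used in this study: the function name `linguistic_cues`, the regex-based sentence and word splitting, and the pronoun lists are all hypothetical simplifications. Cues that need part-of-speech tags (e.g., emotiveness, modal verbs, tense counts) are omitted because they require a tagger.

```python
import re
import string

# Hypothetical pronoun lists; a real extractor may use a fuller lexicon.
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PERSON_PLURAL = {"we", "us", "our", "ours", "ourselves"}


def linguistic_cues(text: str) -> dict[str, float]:
    """Compute a few quantity, complexity, non-immediacy, and diversity cues."""
    # Naive sentence split on runs of terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and apostrophes, lowercased.
    words = re.findall(r"[a-z']+", text.lower())
    punctuation_marks = [ch for ch in text if ch in string.punctuation]

    n_sentences = max(len(sentences), 1)  # guard against division by zero
    n_words = max(len(words), 1)

    return {
        # Average sentence length: total # of words / total # of sentences
        "avg_sentence_length": len(words) / n_sentences,
        # Average word length: total # of characters / total # of words
        "avg_word_length": sum(len(w) for w in words) / n_words,
        # Pausality: total # of punctuation marks / total # of sentences
        "pausality": len(punctuation_marks) / n_sentences,
        # Self reference: total # of first person singular pronouns
        "self_reference": sum(w in FIRST_PERSON_SINGULAR for w in words),
        # Group reference: total # of first person plural pronouns
        "group_reference": sum(w in FIRST_PERSON_PLURAL for w in words),
        # Lexical diversity: total # of different words / total # of words
        "lexical_diversity": len(set(words)) / n_words,
    }


if __name__ == "__main__":
    sample = "I know we can do it. We might be a couple days behind schedule!"
    for name, value in linguistic_cues(sample).items():
        print(f"{name}: {value:.3f}")
```

Character counts here include only the characters inside word tokens, so the average word length may differ slightly from a count over all characters in the text.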
Section | Feature | Coefficient | SE | p-Value | Scam Mean | Scam SD | Non-Scam Mean | Non-Scam SD |
---|---|---|---|---|---|---|---|---|
Creator | Existence of a link to a Facebook ID | −1.326 | 0.446 | ** | 0.350 | 0.480 | 0.550 | 0.499 |
Creator | Num. external links & websites | −0.665 | 0.169 | *** | 1.570 | 1.570 | 2.510 | 1.570 |
Creator | Num. backed projects | −0.042 | 0.015 | ** | 8.740 | 15.327 | 22.550 | 34.270 |
Creator | Num. created projects | −0.320 | 0.150 | * | 1.730 | 1.536 | 2.380 | 2.656 |
Campaign | Redundancy | 0.206 | 0.128 | 0.108 | 5.367 | 2.819 | 4.887 | 1.699 |
Campaign | Num. images | 0.060 | 0.021 | ** | 17.090 | 16.690 | 13.470 | 11.226 |
Updates | Num. third person pronouns/Num. updates | 0.285 | 0.101 | ** | 3.653 | 2.943 | 4.353 | 5.091 |
Updates | Num. images/Num. updates | −0.488 | 0.222 | ** | 0.777 | 1.087 | 1.017 | 1.248 |
Updates | Num. emails/Num. updates | −4.551 | 1.978 | * | 0.046 | 0.095 | 0.159 | 0.260 |
Updates | Num. location/Num. updates | −1.585 | 0.402 | *** | 0.544 | 0.670 | 1.086 | 1.286 |
Updates | Num. past tense verbs/Total words | −0.686 | 0.272 | * | 0.025 | 0.010 | 0.028 | 0.007 |
Comments | Num. verbs/Num. creator comments | 0.835 | 0.140 | *** | 13.906 | 10.562 | 9.916 | 5.407 |
Comments | Num. sentences/Num. creator comments | −0.539 | 0.214 | * | 3.819 | 2.621 | 3.374 | 1.944 |
Comments | Num. first person plural pronouns/Num. creator comments | −1.070 | 0.276 | *** | 1.799 | 1.791 | 1.726 | 1.179 |
Comments | Num. second person pronouns/Num. creator comments | −1.068 | 0.339 | ** | 1.756 | 1.561 | 1.660 | 1.014 |
Comments | Num. third person pronouns/Num. creator comments | −1.971 | 0.542 | *** | 1.310 | 1.056 | 1.071 | 0.831 |
Comments | Num. present tense verbs/Total words | 0.151 | 0.076 | * | 0.119 | 0.028 | 0.115 | 0.024 |
Feature | Precision | Recall | Accuracy | AUC |
---|---|---|---|---|
Creator-related | 65.3% | 62.7% | 71.4% | 0.758 |
Campaign | 55.5% | 14.7% | 60.7% | 0.593 |
Updates | 62.5% | 63.7% | 69.8% | 0.752 |
Comments | 70.3% | 55.8% | 72.6% | 0.805 |
Full model | 84.3% | 84.3% | 87.3% | 0.939 |
“We might be a couple days behind schedule” |
“We could make it happen faster, but as we are having the game printed in china, it will take some time to get them literally shipped overseas after they are produced.” |
“We know it is a bummer you will not be able to play it on your computer right away, but we will still have it out for you by september 2013” |
“I know we can do it” |
“I can only tell you that i will use my best endeavors to make it happen.” |
Algorithm | Precision | Recall | Accuracy | AUC |
---|---|---|---|---|
Logistic regression | 84.3% | 84.3% | 87.3% | 0.939 |
Random Forest | 77.5% | 67.6% | 79.0% | 0.851 |
SVM | 68.4% | 63.7% | 73.4% | 0.719 |
Naive Bayes | 61.8% | 71.5% | 70.6% | 0.734 |
KNN (k = 9) | 66.6% | 50.9% | 69.8% | 0.757 |
J48 Decision Tree | 58.4% | 57.8% | 66.3% | 0.660 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, S.; Shafqat, W.; Kim, H.-c. Backers Beware: Characteristics and Detection of Fraudulent Crowdfunding Campaigns. Sensors 2022, 22, 7677. https://doi.org/10.3390/s22197677