Detecting indicators for startup business success: sentiment analysis using text data mining

Jose Ramon Saura, Pedro Palos-Sanchez, Antonio Grilo

Research output: Contribution to journal › Article

Abstract

The main aim of this study is to identify the key factors in User Generated Content (UGC) on the Twitter social network for the creation of successful startups, as well as factors for sustainable startups and business models. The proposed research methodology has three steps. First, a Latent Dirichlet Allocation (LDA) model, a state-of-the-art topic-modeling tool implemented in Python, was used to determine the topics in a database of tweets with the #Startups hashtag on Twitter (n = 35,401 tweets). Second, a sentiment analysis was performed in Python with a Support Vector Machine (SVM) algorithm, a supervised machine learning method. This was applied to the LDA results to divide the identified startup topics into negative, positive, and neutral sentiments. Third, a textual analysis was carried out on the topics in each sentiment group with text data mining techniques using NVivo software. The topics with positive sentiment, identified as key factors for startup business success, are startup tools, technology-based startups, the attitude of the founders, and the development of startup methodology. The negative topics are frameworks and programming languages, the type of job offers, and business angels' requirements. The neutral topics are the development of the business plan, the type of startup project, and the geolocation of incubators and startups. The limitations of the investigation are the number of tweets in the analyzed sample and the limited time horizon. Future research could improve the methodology used to determine key factors for the creation of successful startups and could also study sustainability issues.

Original language: English
Article number: 917
Journal: Sustainability (Switzerland)
Volume: 11
Issue number: 3
DOI: 10.3390/su11030917
Publication status: Published - 11 Feb 2019

Fingerprint

business success
text analysis
data mining
Plant startup
Data mining
twitter
methodology
job offer
Industry
programming language
new technology
social network
Computer programming languages
Learning systems
software
learning
analysis
indicator
modeling
allocation

Keywords

  • Sentiment analysis
  • Startups business
  • Sustainable startups
  • Technology management
  • Text data mining

Cite this

@article{570cf89495054379b846f7621fc3b4ad,
title = "Detecting indicators for startup business success: sentiment analysis using text data mining",
keywords = "Sentiment analysis, Startups business, Sustainable startups, Technology management, Text data mining",
author = "Saura, {Jose Ramon} and Palos-Sanchez, Pedro and Grilo, Antonio",
year = "2019",
month = "2",
day = "11",
doi = "10.3390/su11030917",
language = "English",
volume = "11",
journal = "Sustainability (Switzerland)",
issn = "2071-1050",
publisher = "MDPI AG",
number = "3",

}

Detecting indicators for startup business success: sentiment analysis using text data mining. / Saura, Jose Ramon; Palos-Sanchez, Pedro; Grilo, Antonio.

In: Sustainability (Switzerland), Vol. 11, No. 3, 917, 11.02.2019.

TY - JOUR

T1 - Detecting indicators for startup business success: sentiment analysis using text data mining

AU - Saura, Jose Ramon

AU - Palos-Sanchez, Pedro

AU - Grilo, Antonio

PY - 2019/2/11

Y1 - 2019/2/11

KW - Sentiment analysis

KW - Startups business

KW - Sustainable startups

KW - Technology management

KW - Text data mining

UR - http://www.scopus.com/inward/record.url?scp=85061513479&partnerID=8YFLogxK

U2 - 10.3390/su11030917

DO - 10.3390/su11030917

M3 - Article

VL - 11

JO - Sustainability (Switzerland)

JF - Sustainability (Switzerland)

SN - 2071-1050

IS - 3

M1 - 917

ER -