Social Media Sentiment Analysis Software for an Analytical Agency

Information
Region: Warsaw, Poland
Industry: Media and Entertainment
Type: Web
Engagement model: Fixed Price
Duration: 1 month
Staff: 3 developers
ID: 375

Project Background

An analytical agency from Poland contacted Elinext and asked us to create sentiment analysis software that would analyze emotions in Polish tweets about the elections. The client wanted to download tweets by keyword (for example, the name of a party) and evaluate the emotional reaction to a party and its key players over a certain period (day, week, month, etc.). The client also wanted to identify the words used by Twitter users that characterize a party’s activity. This would give the agency a better understanding of what shapes a party’s ranking: what should be done to improve it and what should be avoided (events, actions, words, connections, etc.).

Challenges

The Elinext team faced a challenging task: to develop a solution for sentiment analysis on Twitter that would give our client insight into how Twitter users react to certain politicians, their actions, speeches, etc., so that the client could act accordingly.

Project Description

The project outsourced to Elinext was divided into the following stages of the tweet analysis process:

  • Getting Data
  • Preparing Data
  • Analyzing Data

Each of these steps involved different technologies and approaches described further below.

Development Process

As we already mentioned, the development process was divided into three main stages:

Getting Data

Our development team first connected the software to Twitter. We then extracted the tweet objects of interest to our client (by certain keywords and required time intervals). The solution was designed to be used on a regular basis, as an everyday tool for Polish political analysts, and to give insight into the dynamics of political preferences in Poland during and after elections.
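The keyword and time-interval selection can be illustrated with a minimal sketch. This is not the code delivered in the project: it assumes the tweets have already been downloaded and are held as dictionaries with simplified, hypothetical `text` and `created_at` (ISO 8601) fields, and the function name `select_tweets` is our own.

```python
from datetime import datetime, timezone


def select_tweets(tweets, keywords, start, end):
    """Return tweets that mention any keyword and were created in [start, end)."""
    lowered = [keyword.lower() for keyword in keywords]
    selected = []
    for tweet in tweets:
        created = datetime.fromisoformat(tweet["created_at"])
        text = tweet["text"].lower()
        if start <= created < end and any(k in text for k in lowered):
            selected.append(tweet)
    return selected


# Hypothetical sample data; real tweet objects carry many more fields.
sample = [
    {"text": "Partia X wygrywa debatę", "created_at": "2019-10-01T09:00:00+00:00"},
    {"text": "Ładna pogoda w Warszawie", "created_at": "2019-10-01T10:00:00+00:00"},
    {"text": "partia x traci poparcie", "created_at": "2019-11-05T12:00:00+00:00"},
]
october = select_tweets(
    sample,
    keywords=["Partia X"],
    start=datetime(2019, 10, 1, tzinfo=timezone.utc),
    end=datetime(2019, 11, 1, tzinfo=timezone.utc),
)
```

Here `october` holds only the first sample tweet: the second lacks the keyword and the third falls outside the interval. Matching is case-insensitive so that differently capitalized spellings of a party name are not missed.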

Preparing Data

We used JSON and Pandas to transform the extracted tweet objects. To prepare the tweets for further analysis, we set up a process that excludes words with no real semantic value (prepositions, interjections, etc.) and separates out references to other Twitter accounts.
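A cleaning step of this kind might look like the sketch below. It is an illustration under stated assumptions, not the project's implementation: the stop-word list is a tiny hypothetical sample, the account name is invented, and `prepare_tweet` is a name introduced here.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real one would be far longer.
STOP_WORDS = {"i", "w", "na", "z", "do", "o", "że", "ach", "oj"}

MENTION_RE = re.compile(r"@(\w+)")
TOKEN_RE = re.compile(r"\w+")


def prepare_tweet(text):
    """Split a tweet into mentioned accounts and cleaned lowercase tokens."""
    mentions = MENTION_RE.findall(text)
    without_mentions = MENTION_RE.sub(" ", text)
    tokens = [
        token
        for token in TOKEN_RE.findall(without_mentions.lower())
        if token not in STOP_WORDS
    ]
    return mentions, tokens


mentions, tokens = prepare_tweet("Ach, @jan_kowalski mówi o podatkach w Sejmie")
```

For this input, `mentions` is `["jan_kowalski"]` and `tokens` is `["mówi", "podatkach", "sejmie"]`. Keeping mentions in a separate list lets them be counted and scored independently of ordinary words.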

Analyzing Data

To ensure effective analysis of the remaining text, two dictionaries were used: the National Corpus of Polish in Google’s word2vec format, and PLWordnet. The first enables Natural Language Processing (NLP) through vector representations of Polish words, derived from word positions in vast amounts of text. The second includes dictionaries of Polish words with positive and negative connotations.

  • The National Corpus of Polish dictionary was read with the Gensim library to obtain a word2vec model.
  • The PLWordnet dictionary is downloadable as an XML file, which was parsed with the ElementTree XML API and filtered with regular expressions.
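The XML parsing and regular-expression filtering can be sketched as follows. The XML below is a simplified, hypothetical structure invented for this example; it does not reproduce the real PLWordnet schema, and the element names, attributes, the `polarity:` marker, and `load_polarity` are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; the real PLWordnet schema is not reproduced here.
SAMPLE_XML = """
<lexicon>
  <unit name="radość" pos="noun" desc="polarity: + strong" />
  <unit name="klęska" pos="noun" desc="polarity: - strong" />
  <unit name="stół" pos="noun" desc="" />
</lexicon>
"""

POLARITY_RE = re.compile(r"polarity:\s*([+-])")


def load_polarity(xml_text):
    """Map each word to +1 or -1 when its description carries a polarity mark."""
    scores = {}
    for unit in ET.fromstring(xml_text).iter("unit"):
        match = POLARITY_RE.search(unit.get("desc", ""))
        if match:
            scores[unit.get("name")] = 1 if match.group(1) == "+" else -1
    return scores


polarity = load_polarity(SAMPLE_XML)
```

Words without a polarity mark, such as the neutral third entry, are simply left out of the result. The word2vec side is not shown; with Gensim, files in that format are commonly loaded through `KeyedVectors.load_word2vec_format`, although the source does not state which call was used.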

To reveal clusters within the Polish electorate, we added cluster analysis of the tweets. To provide a clear representation of the analyzed data, we added 2D and 3D visualization of the clusters, based on the PCA (Principal Component Analysis) dimensionality reduction technique.
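The dimensionality reduction behind such 2D and 3D views can be sketched with plain NumPy. This is a minimal illustration that computes PCA through a singular value decomposition rather than through Scikit-learn; the random vectors stand in for real tweet vectors, and `pca_project` is a name introduced here. The clustering step itself is not shown.

```python
import numpy as np


def pca_project(vectors, n_components=2):
    """Project row vectors onto their first principal components via SVD."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical tweet vectors: 6 tweets in a 5-dimensional embedding space.
rng = np.random.default_rng(0)
tweet_vectors = rng.normal(size=(6, 5))
points_2d = pca_project(tweet_vectors, n_components=2)
points_3d = pca_project(tweet_vectors, n_components=3)
```

The resulting arrays have one row per tweet and two or three columns, so they can be passed directly to a scatter plot, with the cluster label of each tweet used as the point color.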

Technologies

  • Python
  • Keras
  • Pandas
  • NumPy
  • Tweepy
  • JSON
  • Gensim
  • Morfeusz
  • Scikit-learn
  • Matplotlib

Features

  • Tweets extraction by keywords, time intervals, etc.
  • Tweet object transformation into JSON and Pandas data frames
  • Generation of analysis outputs in .csv and .xls formats
  • Text cleaning: removal of words without semantic weight (prepositions, interjections, etc.) and stop words; text tokenization
  • Natural Language Processing
  • XML file parsing and string filtering with regular expressions
  • Text-to-vector transformation
  • Tweets cluster analysis
  • Dimensionality reduction with Principal Component Analysis
  • Data visualization
  • Identification of the most frequently used words, reduced to their base form
  • Identification of each word’s part of speech
  • Calculation of frequency of occurrence in tweets and average sentiment scores for all verbs and nouns (common and proper nouns separately) and for Twitter accounts mentioned in tweet texts (e.g., Twitter accounts of politicians)
  • Identification of the Twitter audience’s positive or negative attitude towards a party, politician, event, etc.
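The frequency and average-sentiment calculation from the list above can be sketched as follows. This is an illustration only: it assumes each tweet has already been reduced to a token list and given a single numeric sentiment score, it counts a word once per tweet (one possible reading of "frequency of occurrence in tweets"), and `word_statistics` and the sample data are hypothetical.

```python
from collections import defaultdict


def word_statistics(scored_tweets):
    """Return {token: (tweet_count, average_sentiment)} over scored tweets."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for tokens, score in scored_tweets:
        # Count each token once per tweet so repeated words do not dominate.
        for token in set(tokens):
            counts[token] += 1
            totals[token] += score
    return {token: (counts[token], totals[token] / counts[token]) for token in counts}


# Hypothetical input: token lists paired with a per-tweet sentiment score.
scored = [
    (["podatki", "@partia_x"], -1.0),
    (["reforma", "@partia_x"], 1.0),
    (["podatki"], -0.5),
]
stats = word_statistics(scored)
```

In this sample, `stats["podatki"]` is `(2, -0.75)` and `stats["@partia_x"]` is `(2, 0.0)`. Mentioned accounts are treated like any other token here; splitting the output by part of speech would need a morphological analyzer, which this sketch leaves out.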

Results

The Elinext team successfully created a software solution that quickly analyzes tweets against given criteria and provides the client with insights based on sentiment analysis. With the help of our software, the Polish analytical agency can understand the public attitude towards political parties, their leaders and members, their speeches, and particular events. The results show which actions or words shape public attitude and which words or phrases used by Twitter users are linked to a party or its members, so that appropriate measures can be taken to improve the party’s image. Although built for political analysis, the software can also serve marketers, retailers, sociologists, and other professionals who work with people’s opinions.

