The Web is evolving through an era where the opinions of users are getting increasingly important and valuable. The distillation of knowledge from the huge amount of unstructured information on the Web can be a key factor for tasks such as social media marketing, branding, product positioning, and corporate reputation management. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions involves a deep understanding of natural language text by machines, from which we are still very far.
Singapore Symposium on Sentiment Analysis (S3A) is a biennial event aiming to bridge such a gap by exploring novel approaches to opinion mining and sentiment analysis that enable a more efficient passage from (unstructured) textual information to (structured) machine-processable data. S3A aims to provide a national forum for Singapore-based researchers working in the field of sentiment analysis and related topics to share information on their latest investigations and their applications both in academic research areas and industrial sectors.
The broader context of the symposium encompasses AI, linguistics, psychology, sociology, and ethics. Topics of interest include but are not limited to:
• Sentiment identification & classification
• Opinion and sentiment summarization & visualization
• Social network analysis
• Social media marketing
• Culture-dependent sentiment analysis
• Personality detection
• Aspect extraction for opinion mining
• Linguistic patterns for sentiment analysis
• Learning word dependencies in text
• Statistical learning theory for big social data analysis
• Deep learning for sarcasm detection
• Sentic computing
• Large commonsense graphs
• Conceptual primitives for sentiment analysis
• Multimodal emotion recognition and sentiment analysis
• Multi-domain & cross-domain evaluation
• Opinion spam detection
S3A'17 (17th February 2017, NTU)
Exact location: LT14 in North Spine (level 4, opposite SCSE)
10.00 – 10.15: Welcoming and introduction by Erik Cambria
10.15 – 11.00: Andrew Ortony (Northwestern University)
The Cognitive Structure of Emotions
What causes us to experience emotions? What makes emotions vary in intensity? How are different emotions related to one another and what are the information processing mechanisms and structures that underlie their elicitation and intensification? In his talk, Prof Ortony will address such questions by presenting a computationally tractable account of the cognitive antecedents of emotions based on three aspects of the world to which agents can react emotionally: events of concern to them, the actions of those they consider responsible for such events, and objects qua objects. He will show how these three classes of reactions lead to three classes of emotions, each based on evaluations in terms of different kinds of knowledge representations. He will also discuss various factors that influence the intensity of emotions while emphasizing the distinction between emotions themselves and the language we use to talk about them.
11.00 – 11.30: Bee Chin Ng (NTU School of Humanities and Social Sciences)
Feeling the Language You Speak
The view that language mediates the world of emotion has been robustly supported in both observations and empirical investigations. The issue of how bilinguals or multilinguals negotiate the emotional worlds of the different languages they speak is beginning to receive attention, though the bulk of the studies focus mainly on crosslinguistic comparisons or on monolinguals. There is sufficient evidence, however, to point to bilinguals shifting their sociocultural views and expectations when the language they use changes. This is hardly surprising, but what remains to be explored is precisely how these views change and whether these changes can be predicted by our current knowledge of emotion research about existing languages. In this talk I will present the findings from my current study of an emotion corpus in Mandarin Chinese as well as a crosslinguistic comparison of four emotion domains (Anger, Pride, Guilt and Shame). The findings indicate strong language effects: despite fairly homogeneous cultural experience, language use and exposure play a big role in shaping Singaporean bilinguals’ use and understanding of emotion words. This key aspect of the findings challenges the common practice of treating bilinguals who share the same language pair as a homogeneous group.
11.30 – 12.00: Andrea Nanetti (NTU School of Art, Design and Media)
One Belt, One Road, One Sentiment?
One Belt, One Road, One Sentiment? is a project in collaboration with Prof Andrea Nanetti, from NTU ADM, which aims to visualize the sentiment of the world towards President Xi's One Belt One Road initiative. The initiative is a $4-trillion development strategy and framework that focuses on connectivity and cooperation among 60 countries, primarily between China and the rest of Eurasia, and consists of two main components: the land-based Silk Road Economic Belt and the ocean-going Maritime Silk Road. One Belt, One Road, One Sentiment? aims to collect and analyze the reactions of the different countries involved in President Xi's initiative in real time. Many economies, in fact, are affected by the initiative, which has been welcomed by some countries but opposed by others, e.g., the supporters of trading arrangements such as the Trans-Pacific Partnership and the Transatlantic Trade and Investment Partnership. The project employs SenticNet technologies to collect and analyze news and social media in many different languages from across the globe and, hence, visualize the sentiment of the world towards the One Belt One Road initiative in real time in a 3D dome.
12.00 – 12.30: Jing Jiang (SMU School of Information Systems)
Learning Sentence Embeddings for Cross-Domain Sentiment Classification
Recent years have witnessed the promising results of applying neural network models in many natural language processing tasks, including sentiment analysis and opinion mining. However, as with any supervised learning method, neural network models cannot work well when the training data comes from a different domain than the test data. In this talk, I will present two recent pieces of work that borrow labels of auxiliary tasks to improve the cross-domain performance for sentiment classification and opinion target extraction. In both studies the auxiliary labels are used to improve the learning of the hidden representations of text in a multi-layer neural network.
12.30 – 13.30: Lunch break
13.30 – 14.00: Vibeke Sorensen (NTU School of Art, Design and Media)
Mood of Singapore
Mood of Singapore aims to visualize the emotions of the Little Red Dot in real time. Singapore-geolocated social data is collected and processed by SenticNet technologies and visualized according to the emotion-color mapping of the Hourglass of Emotions. The results of such mapping are displayed through an interactive sculpture, a dynamic architectural installation that has as its center-piece a large ‘arch’ or ‘doorway’ that emits colored light and animates in reflection of the live emotions expressed by people based in Singapore communicating through networks such as Twitter. It rethinks the term ‘public art’ in the context of social transmodal transmedia. The ‘arch’ or ‘doorway’ is iconic and references developmental transformation, the metaphoric passing from one state to another, of growth and change that is analogous to the transformative effect that communications technologies have upon our collective human condition. The arch also signifies human transformation of the environment, today both physical and digital, as this iconic form has been used across different cultures in Singapore. One installation of Mood of Singapore is currently operational at NTU Experimental Medicine Building.
14.00 – 14.30: Erik Cambria (NTU School of Computer Science and Engineering)
In this talk, I am going to reveal some of the magic behind Mood of Singapore and One Belt, One Road, One Sentiment? by presenting the latest version of SenticNet, a commonsense knowledge base for concept-level sentiment analysis. SenticNet encodes affective information in terms of semantics and sentics, i.e., the denotative and connotative information commonly associated with real-world objects, actions, events, and people. SenticNet steps away from blindly using keywords and word co-occurrence counts, and instead relies on the implicit meaning associated with commonsense concepts. Superior to purely syntactic techniques, SenticNet can detect subtly expressed sentiments by enabling the analysis of multiword expressions that do not explicitly convey emotion, but are instead related to concepts that do so.
14.30 – 15.00: Chris Khoo (NTU Wee Kim Wee School of Communication and Information)
Lexicon-Based Sentiment Analysis: Comparative Evaluation of Six Sentiment Lexicons
This talk describes a general-purpose sentiment lexicon called WKWSCI Sentiment Lexicon, and compares it with five existing lexicons: Hu & Liu Opinion Lexicon, MPQA Subjectivity Lexicon, General Inquirer, NRC Word-Sentiment Association Lexicon and SO-CAL lexicon. The effectiveness of the sentiment lexicons for sentiment categorization at the document-level and sentence-level was evaluated using an Amazon product review dataset and a news headlines dataset. WKWSCI, MPQA, Hu & Liu and SO-CAL lexicons are equally good for product review sentiment categorization, obtaining accuracy rates of 75% to 77% when appropriate weights are used for different categories of sentiment words. However, when a training corpus is not available, Hu & Liu obtained the best accuracy with a simple-minded approach of counting positive and negative words, for both document-level and sentence-level sentiment categorization. The WKWSCI lexicon obtained the best accuracy of 69% on the news headlines sentiment categorization task, and the sentiment strength values obtained a Pearson correlation of 0.57 with human assigned sentiment values.
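The word-counting baseline mentioned in this abstract can be illustrated with a minimal Python sketch. The word lists, the function names `count_polarity` and `classify`, and the tie-breaking rule (equal counts give "neutral") are all assumptions made for illustration; they are not taken from the WKWSCI, Hu & Liu, or any other lexicon named above, and the sketch does not reproduce the weighting scheme used in the evaluation.

```python
import re

# Tiny illustrative word lists (hypothetical; NOT drawn from any real lexicon).
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def count_polarity(text):
    """Return (positive_count, negative_count) for the words in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos, neg


def classify(text):
    """Label text by whichever polarity has more matching words."""
    pos, neg = count_polarity(text)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # assumption: ties and no matches count as neutral


if __name__ == "__main__":
    print(classify("Great phone, excellent battery, but a poor camera."))
```

The same functions can be applied to a whole document or to a single sentence, which mirrors the two levels of categorization mentioned in the abstract.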
15.00 – 15.30: Gao Cong (NTU School of Computer Science and Engineering)
A Unified Model for User Preference Analysis in Geo-tagged Reviews
Massive amounts of geo-textual data that contain both geospatial and textual content are being generated at an unprecedented scale from social media websites. Examples of user-generated geo-textual content include geo-tagged micro-blogs, photos with both tags and geo-locations in social photo sharing websites, as well as points of interest (POIs) and check-in information in location-based social networks. This talk presents a study of user preferences for POIs based on geo-tagged reviews. We study two types of user preferences for POIs: topical-region preference and category-aware topical-aspect preference. We propose a unified probabilistic model to capture these two preferences simultaneously. In addition, our model is capable of capturing the interaction of different factors, including topical aspect, sentiment, and spatial information. The model can be used in a number of applications, such as POI recommendation and user recommendation, among others. In addition, the model enables us to investigate whether people like an aspect of a POI or whether people like a topical aspect of some type of POIs (e.g., bars) in a region, which offers explanations for recommendations.
15.30 – 16.00: Coffee break
16.00 – 16.20: Aixin Sun (NTU School of Computer Science and Engineering)
Domain-Specific Named Entity Recognition from Social Media
Named entity recognition (NER) is an important task in text analysis. NER from social media text is even more challenging because of the informal writing style and the lack of context in interpreting the content. In this talk, I will report two case studies on domain-specific NER from social media. One study is to recognize mobile phone names from online forums and the other is to recognize fine-grained locations (e.g., restaurants, landmarks, and shopping malls) from tweets. For both tasks, we propose to build a dictionary in user language with noisy terms that might be a named entity or a part of a named entity. A classifier is then built to recognize the named entities and link them to their formal names. Our experiments on manually annotated data show promising results on both tasks.
16.20 – 16.40: Danyuan Ho (NTU Temasek Laboratories)
Singlish SenticNet is a concept-level resource for sentiment analysis in Singlish that provides the semantics and sentics (denotative and connotative information) associated with more than 5000 words and multiword expressions. These concepts are crowdsourced (e.g., through games) and encoded redundantly at three levels, namely: as a semantic network, as a matrix and as a vector space. Each representation is useful for a different kind of reasoning: the semantic network specifies the relationships between concepts and, hence, it is useful for tasks such as question answering; the matrix allows for the inference of new pieces of commonsense knowledge based on shared semantic features; finally, the vector space is a powerful tool for analogical reasoning. Singlish SenticNet is generated by the joint use of all three representations: this superior kind of reasoning, termed sentic panalogy, ensures that both the accuracy and efficiency of Singlish SenticNet are high by making the most of each representation at different times. In problem-solving situations, in fact, several analogous representations of the same problem should be maintained in parallel while trying to solve it so that, when problem-solving begins to fail while using one representation, the system can switch to one of the others.
16.40 – 17.00: Stefan Winkler (ADSC)
Fine-grained Emotion Analysis From Facial Expressions
Emotions are often conveyed by facial expressions. Contrary to most existing approaches in computer vision, we avoid the classification of emotions into a few predefined categories, like happy, sad, or surprised, and instead follow a dimensional paradigm as represented by the circumplex model of emotions. We cast the problem of emotion estimation as a regression problem. Based on the tracking of facial landmark points and relevant geometrical features, we directly estimate arousal, valence, and intensity of emotion. Our model is trained using a large database of face images from psychophysical validation studies. This approach more accurately encompasses the wide range of facial expressions, and is able to capture even subtle variations in emotion and intensity. We discuss the benefits and challenges of our method, and also present some of its applications.
17.00 – 17.20: Iti Chaturvedi (NTU School of Computer Science and Engineering)
Multiple Kernel Learning for Multimodal Sentiment Analysis
Technology has enabled anyone with an Internet connection to easily create and share their ideas, opinions and content with millions of other people around the world. Much of the content being posted and consumed online is multimodal. With billions of phones, tablets and PCs shipping today with built-in cameras and a host of new video-equipped wearables like Google Glass on the horizon, the amount of video on the Internet will only continue to increase. It has become increasingly difficult for researchers to keep up with this deluge of multimodal content, let alone organize or make sense of it. Mining useful knowledge from video is a critical need that will grow exponentially, in pace with the global growth of content. This is particularly important in sentiment analysis, as both service and product reviews are gradually shifting from unimodal to multimodal. We present a novel method to extract features from visual and textual modalities using deep convolutional neural networks. By feeding such features to a multiple kernel learning classifier, we significantly outperform the state of the art of multimodal sentiment analysis on different datasets.
17.20 – 18.00: Group discussion and final remarks
S3A'15 (6th February 2015, NTU)
10.00 – 10.15: Welcoming and introduction by Erik Cambria and Francis Bond
10.15 – 11.00: Amit Sheth (Ohio Center of Excellence in Knowledge-enabled Computing)
Citizen Sensor Data Mining, Social Media Analytics and Applications
With the rapid rise in the popularity of social media (1B+ Facebook users, 200M+ Twitter users), and near ubiquitous mobile access (4+ billion actively-used mobile phones), the sharing of observations and opinions has become commonplace (500M+ tweets a day). This has given us unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it for brand tracking and management, crisis coordination, organizing revolutions or promoting social development in underdeveloped and developing countries. I will review: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) how we built Twitris, a comprehensive social media analytics (social intelligence) platform. I will describe the analysis capabilities along three dimensions: spatio-temporal-thematic, people-content-network, and sentiment-emotion-intent. I will couple technical insights with identification of computational techniques and real-world examples using live demos of Twitris (http://analysis.knoesis.org).
11.00 – 11.45: Tomoko Ohkuma (Fuji-Xerox)
Sentiment Analysis and User Profiling for SNS Text
The NLP team in the Communication Technology Laboratory is working on research and development of information extraction from SNS text. In this presentation, we introduce research activities on sentiment analysis and user profiling for applications like social listening, reputation management, and marketing. Topics that will be presented are 1) a report on SemEval-2014, 2) sentiment analysis using WSD, 3) targeted sentiment using topic modeling, 4) user gender inference using text and image processing. At the end of this presentation, we will talk about a new joint research project between NTU and Fuji Xerox that has just started this February.
11.45 – 12.15: Waifong Boh (NTU Nanyang Business School)
A Temporal Study of the Effects of Online Opinions: Information Sources Matter
This study examines when and why online comments from different sources and platforms influence a movie's box office receipts over time. We tracked over 1,500 sources of online expert and consumer reviews for cinematic movies released for an entire year and continuously monitored major social media sites (e.g. Twitter and Plurk) for comments. We text-mined the comments to elucidate the sentiments and analyzed the data. Premised on the argument that greater uncertainty exists at the beginning of a movie's release, we hypothesized and found that expert reviews, and the valence and volume of comments from pull-based platforms like forums have a significant influence on early box office receipts. In contrast, the valence and volume of comments from push-based platforms like microblogs have a significant influence on later box office receipts, as they serve a reminder rather than an informational role with the decreased uncertainty in these later stages. Our research demonstrates that online opinions are not always persuasive and useful, and our findings provide insights into when consumers are likely to pay attention to which types of online opinions.
12.15 – 12.45: Feida Zhu (SMU School of Information Systems)
Social Media Mining and Analysis for Financial Innovation
The recent blossoming of social network services has provided everyone with an unprecedented level of ease and fun in sharing information of all sorts. These public social data therefore reveal a surprisingly large amount of information about an individual which is otherwise unavailable. The business, consumer and social insights attainable from this big and dynamic social data are critically important and immensely valuable in a wide range of applications for both private and public sectors. In particular, there has been a growing interest in harnessing social media data for financial innovation. In this talk, we will explore some recent advances along this direction including personal credit scoring, risk management and customer acquisition.
12.45 – 14.00: Lunch break
14.00 – 14.30: Chris Khoo (NTU Wee Kim Wee School of Communication and Information)
Comparison of Lexical Resources for Sentiment Analysis
This work sets out a detailed comparison of sentiment lexica (General Inquirer, MPQA and Hu & Liu) with the WKWSCI lexicon. The WKWSCI lexicon contains human-annotated words with semantic orientation (polarity and strength). The presentation will provide an overview of the coverage of the WKWSCI lexicon, and its overlap and consistency with other lexicons. We also show lexicon performance on a product reviews dataset using a bag-of-words approach.
14.30 – 15.00: Elvis Albertus Bin Toni (NTU School of Humanities and Social Sciences)
Linguistic Expression of Emotions in Lamaholot Language
This study observed syntactic differences across dialects, metaphors, and borrowing from and/or mixing with other languages in the linguistic expression of emotions in the Lamaholot language. It reports several findings. First, there are two distinctive syntactic features, i.e., the presence of a pronoun subject in the expression of emotions and the use of a single combination of morphemes, across the three investigated dialects (Nusa Tadon, Lewo Tobi, and Lewolema). Second, metaphor as a vehicle for the expression of emotion is attested in the three dialects. Third, ‘One-k’ (my heart) as a feature of the expression of emotion in Lamaholot is shared among the dialects. Finally, borrowing from and/or mixing with Bahasa Indonesia when expressing emotion is common.
15.00 – 15.30: Iti Chaturvedi (NTU School of Computer Science and Engineering)
Deep Recurrent Neural Networks for Sentiment Analysis
The rise in social media such as blogs and networking websites has resulted in a surge of research in sentiment classification, which aims to determine the judgment of a writer with respect to a given topic based on a given textual comment. The objective is to classify the sentiment polarity of a tweet as positive, negative, or neutral. We propose the use of a deep neural network to automatically extract sentiment-specific word embeddings from tweets. To capture loops and higher-order dependencies in a sequence of words we use Gaussian Bayesian networks. Low-dimensional statistically significant word-structures called motifs are extracted from a variety of sources of data.
15.30 – 16.00: Francis Bond (NTU School of Humanities and Social Sciences)
Multi-Lingual Semantic Processing
With physical barriers to information access decreasing, lack of understanding becomes the greatest impediment to communication. Research on deep linguistic analysis allows us to abstract away from language-particular syntactic phenomena to a uniform panlingual semantic representation. By linking this to WordNet, we can take advantage of a wide variety of linked open data, including sentiment, and apply it to hundreds of languages.
16.00 – 16.30: Erik Cambria (NTU School of Computer Science and Engineering)
Sentic patterns merge linguistics, commonsense computing, and machine learning for improving the accuracy of sentiment-analysis tasks such as polarity detection. Sentic patterns allow sentiments to flow from concept to concept based on the dependency relation of the input sentence, like in an electronic circuit where sentiment words are sources while other words are elements, e.g., VERY is an amplifier, NOT is a logical complement, RATHER is a resistor, BUT is an OR-like element that gives preference to one of its inputs. This way, sentic patterns achieve a better understanding of the contextual role of each concept within the sentence and, hence, obtain a polarity detection accuracy that outperforms state-of-the-art statistical methods.
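The circuit analogy in this abstract can be made concrete with a toy Python sketch. It is a deliberate simplification and not the sentic patterns method itself: it scans tokens left to right instead of following dependency relations, the polarity scores in `SOURCES` and the gain factors (1.5 for VERY, 0.5 for RATHER) are hypothetical, and the rule that BUT prefers the clause to its right is an assumption made for illustration.

```python
# Hypothetical polarity scores for a few "source" words (illustration only).
SOURCES = {"good": 0.6, "nice": 0.5, "bad": -0.6, "slow": -0.4}


def clause_polarity(tokens):
    """Accumulate polarity over a clause, applying modifiers to the next source."""
    total = 0.0
    gain = 1.0
    for tok in tokens:
        if tok == "very":        # amplifier (assumed factor)
            gain *= 1.5
        elif tok == "rather":    # resistor (assumed factor)
            gain *= 0.5
        elif tok == "not":       # logical complement: flip the sign
            gain *= -1.0
        elif tok in SOURCES:     # sentiment source
            total += gain * SOURCES[tok]
            gain = 1.0           # modifiers apply to one source only
    return max(-1.0, min(1.0, total))


def sentence_polarity(sentence):
    """Score a sentence; BUT prefers the clause on its right (assumption)."""
    tokens = sentence.lower().replace(",", " ").split()
    if "but" in tokens:
        i = tokens.index("but")
        right = clause_polarity(tokens[i + 1:])
        if right != 0.0:
            return right
        return clause_polarity(tokens[:i])
    return clause_polarity(tokens)


if __name__ == "__main__":
    for s in ["not good", "very good", "the screen is nice but the phone is very slow"]:
        print(s, "->", sentence_polarity(s))
```

A real implementation would propagate sentiment along the dependency tree of the sentence, as the abstract describes, so that each modifier attaches to the concept it actually governs instead of the next source word in linear order.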
16.30 – 17.00: Group discussion and final remarks
S3A'13 (1st November 2013, NTU)
Location: HSS Seminar Room 3
13.00 – 13.10: Welcoming and introduction
13.10 – 13.30: Grégoire Winterstein (Hong Kong Institute of Education)
Argumentative Operators and Sentiment Analysis
I will provide a brief characterization of the notion of argumentation as it is understood in psychology and linguistics. I will then proceed to show how some linguistic items can best be described in argumentative terms. I will focus on the contributions of 'only', and 'almost'. In a second part I will underline the possible uses of argumentative theories for sentiment analysis and the insights argumentative theories can gather from the output of sentiment analysis models.
13.30 – 13.50: Hai Zhen (NTU School of Computer Science and Engineering)
Product Review Mining
My talk will focus on product review mining, as briefly summarized below:
1. Introduction to review mining (opinion mining, sentiment analysis): background, motivation, introduction
2. Review mining at document (review), sentence, or phrase level
3. Feature-level review mining
3.1 Feature extraction
3.1.1 Explicit feature
3.1.2 Implicit feature
3.2 Opinion word identification and sentiment polarity classification
3.3 Summarization
4. Aspect-based review mining (mainly discussing topic models)
4.1 Aspect detection
4.2 Sentiment prediction
5. Review helpfulness prediction and review selection
6. Experiments
7. Conclusion
13.50 – 14.10: Lin Qiu (NTU School of Humanities and Social Sciences)
Personality Analysis over Twitter
Microblogging services such as Twitter have become increasingly popular in recent years. However, little is known about how personality is manifested and perceived in microblogs. In this study, we measured the Big Five personality traits of 142 participants and collected their tweets over a 1-month period. Extraversion, agreeableness, openness, and neuroticism were associated with specific linguistic markers, suggesting that personality manifests in microblogs. Meanwhile, eight observers rated the participants’ personality on the basis of their tweets. Results showed that observers relied on specific linguistic cues when making judgments, and could only judge agreeableness and neuroticism accurately. This study provides new empirical evidence of personality expression in naturalistic settings, and points to the potential of utilizing social media for personality research.
14.10 – 14.30: Chris Khoo (NTU Wee Kim Wee School of Communication and Information)
Sentiment Analysis of Movie Reviews, Drug Reviews and Political News
The talk summarizes 3 studies on the sentiment analysis of movie reviews, drug reviews and political news. The first study analyzed the differences in sentiment expressions used in movie reviews from four Web genres—blog postings, discussion board threads, user reviews, and reviews by movie critics. Sentiment analysis of movie reviews was performed at the clause level to identify the sentiment orientation and strength towards different aspects of a movie. A method was developed to compute the overall sentiment of a clause based on the sentiment scores of individual words, taken from sentiment lexicons. A visual interface was developed to explore the extracted sentiments. More recently, a similar sentiment analysis approach was applied to drug reviews. The third study was a case study of applying the Appraisal Theory developed by linguists to analyze political news articles.
14.30 – 14.50: Coffee break
14.50 – 15.10: Bai Lin (NTU School of Humanities and Social Sciences)
Communicating Emotions across Cultures
In our increasingly interconnected world, how to communicate across different cultures has become more critical. However, successful communication is always hindered by differences between languages and cultures, and such difficulties become even more obvious when it comes to more personal and emotional topics. How do people from diverse cultural backgrounds communicate their emotions? In what ways does an expression of emotion vary across cultures? How do bilinguals meet the challenges of the cultural or linguistic specificities of their two languages? Can this cultural knowledge of emotion expression be taught?
15.10 – 15.30: Guang-Bin Huang (NTU School of Electrical & Electronic Engineering)
Representational Learning with Extreme Learning Machine for Big Data
Neural networks (NN) and support vector machines (SVM) have played key roles in machine learning and data analysis in the past 2-3 decades. However, it is known that these popular learning techniques face some challenging issues such as intensive human intervention, slow learning speed, and poor learning scalability. Extreme Learning Machines (ELM) not only learn up to tens of thousands of times faster than NN and SVMs, but also provide a unified implementation for regression, binary and multi-class applications. This talk will give a brief introduction to ELM history and some of its successful applications. This talk will further address three issues: i) why NN and SVM/LS-SVM may only produce suboptimal solutions compared to ELM; ii) why ELM may outperform Deep Learning in both learning accuracy and learning speed; and iii) why ELM could be a biologically inspired learning technique and why ELM is closer to animal brains.
15.30 – 15.50: Francis Bond (NTU School of Humanities and Social Sciences)
Uniform Cross-lingual Sentiment analysis with WordNets
Semantically annotated corpora play an important role in natural language processing. This talk presents the results of a pilot study on building a sense-tagged parallel corpus, part of the ongoing construction of aligned corpora for four languages (English, Chinese, Japanese, and Indonesian) in four domains (story, essay, news, and tourism) from the NTU-Multilingual Corpus. Each subcorpus is first sense-tagged using a WordNet and then these synsets are linked. Upon the completion of this project, all annotated corpora will be made freely available. The multilingual corpora are designed to not only provide data for NLP tasks like machine translation, but also to contribute to the study of translation shift and bilingual lexicography as well as the improvement of monolingual WordNets.
15.50 – 16.10: Erik Cambria (NUS Temasek Labs)
Jumping NLP Curves
Natural language processing (NLP) is a theory-motivated range of computational techniques for the automatic analysis and representation of human language. NLP research has evolved from the era of punch cards and batch processing (in which the analysis of a sentence could take up to 7 minutes) to the era of Google and the likes of it (in which millions of webpages can be processed in less than a second). This presentation draws on recent developments in NLP research to look at the past, present, and future of NLP technology in a new light. Borrowing the paradigm of ‘jumping curves’ from the field of business management and marketing prediction, this talk reinterprets the evolution of NLP research as the intersection of three overlapping curves, namely the Syntactics, Semantics, and Pragmatics Curves, which will eventually lead NLP research to evolve into natural language understanding.
16.10 – 16.30: Group discussion and final remarks