Semantic Web - The Voice of Semantic Web Business

Everything I Need To Know About Sentiment Analysis...

...I learned at the Sentiment Analysis Symposium this week. Well, okay, not everything. There is a lot going on in this burgeoning space, and the conference did a good job of grounding the issues. Among the more interesting discussions and insights:

● Conference organizer Seth Grimes explained that two humans will agree on expressed sentiment in only 82 percent of cases. Expect machines to do better? No, and yes. Estimates of sentiment analysis accuracy range from 70 to above 90 percent. Seventy percent wouldn’t have been high enough for global public relations firm Porter Novelli to help one client conclude that it shouldn’t walk away from its multi-million dollar investment in Tiger Woods as a spokesperson post-scandal, says Brad McCormick, Executive Vice President and Director of Digital at the company. But getting to those 90-percent-plus levels means you need a huge volume of data – about 150 on-topic conversations per day, every day – and rarely is there that level of conversation around most topics (Ed. note: sex, drugs and rock and roll excepted).

Other speakers observed that getting much beyond 80 percent accuracy, especially when the content set is broad, likely requires a huge investment, and in most cases the returns diminish.


The point isn’t to get 100 percent accuracy so much as it is to identify outliers, says Greg Radner, global head of PR services, Thomson Reuters. “I don’t think it’s worth the effort to go beyond 80 percent because this is directional anyway,” he noted. Bradley Honan, senior vice president of StrategyOne, isn’t bothered by a 70 percent accurate take on sentiment – “if we figure 30 to 40 percent of people are actively engaged putting content online using social media, then we’re talking about 70 percent of that 30 or 40 percent,” and 70 percent of a small pie of what are probably the most partisan (positive or negative) individuals anyway is pretty good. As a method of capturing the sentiment of those to the right and left on the intensity curve, whose conversations can reverberate and influence, it’s a good arrow in a quiver that also has to include surveys, polling and so on to really understand the customer base, he said.
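Honan's back-of-the-envelope figure can be worked through in a few lines. This is only a sketch of the arithmetic as quoted above; the function name is made up for illustration, and the inputs are the 70 percent accuracy and the 30 to 40 percent engagement range from his remarks.

```python
def correctly_captured_share(engaged_share, accuracy):
    """Share of the whole audience that is both posting online and read correctly.

    Hypothetical helper: it simply multiplies the two fractions from the quote.
    """
    return engaged_share * accuracy

accuracy = 0.70  # the 70 percent accuracy figure from the quote
for engaged_share in (0.30, 0.40):  # the 30 to 40 percent engagement range
    share = correctly_captured_share(engaged_share, accuracy)
    print(f"{engaged_share:.0%} engaged x {accuracy:.0%} accurate = {share:.0%} of the audience")
```

In other words, the "small pie" works out to roughly a fifth to a quarter of the total audience, which is why he treats it as one input alongside surveys and polling.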

● What the heck do accuracy percentages mean anyway? From the audience came questions of how to interpret these claims better. If I understand Claire Cardie of Cornell University, it’s about precision, recall and F measure for document sentiment analytics systems. If sentiment analytics technology gets it right on positive, negative or neutral on 3 of 4 documents, its precision is .75. And that can change based on the number of documents it tries to label – for instance, if it skips the one document it got wrong, its precision goes up to 3 out of 3, so a system can reach a high precision level if it labels only what it is very certain about. Recall is the number of correct answers divided by the number of documents that have to be labeled – no skipping allowed – so the recall in this case is 3 out of 4. The F measure is the harmonic mean of precision and recall, a kind of average that gives higher scores when the precision and recall results are closer together. And how well this all works depends on what is being analyzed, anyway (news is harder to analyze than movie or product reviews, for instance, and shorter documents are harder than longer ones). So, to take us back to where we began with all this, a 75 to 85 percent F measure would be considered good in research, “because the comparison is not to a system getting 100 percent precision or recall but compared to the alternative of a person making that decision on each document.”
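To make the 3-out-of-4 example concrete, here is a minimal sketch of the three measures as described above. It assumes the F measure is the balanced harmonic mean (commonly called F1); the function and parameter names are illustrative, not taken from any system shown at the symposium.

```python
def precision_recall_f(correct, attempted, total):
    """Return (precision, recall, F measure) for one labeling run.

    correct   -- documents the system labeled correctly
    attempted -- documents the system chose to label (it may skip some)
    total     -- documents that had to be labeled, no skipping allowed
    """
    precision = correct / attempted if attempted else 0.0
    recall = correct / total if total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced harmonic mean: pulled toward the lower of the two values.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# The system labels all four documents and gets three right.
print(precision_recall_f(correct=3, attempted=4, total=4))

# The system skips the document it was unsure about:
# precision rises to 3 out of 3, recall stays at 3 out of 4.
print(precision_recall_f(correct=3, attempted=3, total=4))
```

In the second run precision is perfect, but the F measure only reaches about 0.86, because the harmonic mean is held back by the unchanged recall.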

● SAP and SAS have the reach to be well-positioned to help close the loop between sentiment analytics on internal structured data and on external unstructured data. (The point was made offline by a gentleman from a smaller startup.) For instance, Bernard Chung, Director of CRM Product Marketing, SAP Labs, LLC, showed how sentiment analytics can be used to enhance enterprise business processes. The vision is to marry the traditional enterprise customer data an organization may have with the social media through which customers express their feelings, gaining a holistic picture of the customer, and then to extend traditional business processes so that social media becomes part of the customer interaction strategy and improves the customer's experience. He showed SAP’s Twitter service solution for service managers: using the Twitter API, service managers could pull in relevant tweets, supported by SAP’s Business Objects’ text analytics solution, filter them by sentiment scores, create a service ticket from the enterprise environment for customers having problems, and also send a Twitter direct message telling the customer that SAP is aware of the issue and will be addressing it, including the customer service ticket number.
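The service workflow described above (pull tweets, filter by sentiment score, open a ticket, reply with the ticket number) can be sketched roughly as follows. Everything here is hypothetical: the `Tweet` class, the score range, the threshold and the ticket numbering are stand-ins for illustration and do not reflect the SAP product or the Twitter API.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Tweet:
    user: str
    text: str
    sentiment: float  # hypothetical score in [-1, 1]; negative means unhappy

_ticket_ids = count(1001)  # stand-in for a real ticketing system

def triage(tweets, threshold=-0.5):
    """Open a ticket and draft a direct message for each strongly negative tweet."""
    actions = []
    for tweet in tweets:
        if tweet.sentiment > threshold:
            continue  # not negative enough to need service follow-up
        ticket = f"T-{next(_ticket_ids)}"
        message = (
            f"@{tweet.user} We are aware of the issue and will be addressing it. "
            f"Your service ticket number is {ticket}."
        )
        actions.append((ticket, tweet.user, message))
    return actions

sample = [
    Tweet("alice", "Order arrived broken, again.", -0.8),
    Tweet("bob", "Love the new release!", 0.9),
]
for ticket, user, message in triage(sample):
    print(ticket, user, message)
```

Only the negative tweet produces a ticket and a drafted message; in a real deployment the scoring, ticket creation and message delivery would each be handled by the corresponding enterprise system.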

● Lots of attention, obviously, on sentiment analysis in Tweets – not surprising, given how open Twitter is with the data and how easy it is to collect compared with blogs and news articles, which have to be scraped and stripped. The audience got to see a young Stanford University student’s Twitter Sentiment solution (in an interesting if somewhat inexperienced showing) and a quick and more polished look at Tweetsentiments.com, an NLP and machine learning based solution available in a full international edition that can calculate sentiments in real time by country, as well as by particular user or topic, and comes with an API for integrating into your own application.

● What do the Feds have to do with your sentiment analytics efforts? Turns out that if you’re in pharma, you might not actually want to know at the individual level who’s having a side effect from which drug you produce. While there is still a little vagueness about how this all plays out in the social media world, federal regulations state that pharma companies that hear about a side effect are obligated to report it if they have a recognized patient with a recognized drug with a recognized side effect in a recognized location. That last one might have left a little breathing room, but now that Twitter lets you add locations to your Tweets, that squeezes things a bit more from that source. In the olden days (last year or so), apparently only 1 in 500 adverse events was reportable, but that will go up with Twitter and location information. “So a lot of them don’t get into individual sentiment analysis around this,” says Sally Church, EVP of Icarus Consultants. Companies that mine sentiment on behalf of pharma have to go the aggregate route – if the pharma company itself doesn’t see the individual tweets, and the individual sentiments in them, there’s no problem. “You have to report it in the aggregate because these kinds of things are a little tricky.”

Copyright 2010 WebMediaBrands Inc. All rights reserved.