Everything I Need To Know About Sentiment Analysis...
● Conference organizer Seth Grimes explained that two humans will agree on expressed sentiment in only 82 percent of cases. Expect machines to do better? No, and yes. Estimates of sentiment analysis accuracy range from 70 to above 90 percent. Seventy percent wouldn’t have been high enough for global public relations firm Porter Novelli to help one client conclude that it shouldn’t walk away from its multi-million dollar investment in Tiger Woods as a spokesperson post-scandal, says Brad McCormick, Executive Vice President and Director of Digital at the company. But getting to those 90-percent-plus levels means you need a huge volume of data – about 150 on-topic conversations per day, every day – and rarely is there that level of conversation around most topics (Ed. note: sex, drugs and rock and roll excepted). Other observations were that getting much beyond 80 percent accuracy, especially when the content set is broad, likely requires a huge investment, and in most cases the returns diminish. The point isn’t to get 100 percent accuracy so much as it is to identify outliers, says Greg Radner, global head of PR services, Thomson Reuters. “I don’t think it’s worth the effort to go beyond 80 percent because this is directional anyway,” he noted. Bradley Honan, senior vice president of StrategyOne, isn’t bothered by a 70 percent accurate take on sentiment – “if we figure 30 to 40 percent of people are actively engaged putting content online using social media, then we’re talking about 70 percent of that 30 or 40 percent,” and 70 percent of a small pie of what are probably the most partisan (positive or negative) individuals anyway is pretty good. As a method of capturing the sentiment of those to the right and left on the intensity curve, whose conversations can reverberate and influence, it’s a good arrow in a quiver that also has to include surveys, polling and so on to really understand the customer base, he said.
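To make Honan’s arithmetic concrete, here is a minimal sketch in Python. It assumes his figures are meant to be multiplied straight through (accuracy times the share of people posting online), which is my reading rather than something he spelled out; the function name is made up for illustration.

```python
def share_read_correctly(accuracy_pct: float, engaged_pct: float) -> float:
    """Percent of all people whose sentiment is captured correctly.

    Assumption: the accuracy figure applies evenly across everyone
    who is actively posting content online.
    """
    return accuracy_pct * engaged_pct / 100


# Honan's figures: 70 percent accuracy, 30 to 40 percent of people engaged.
low = share_read_correctly(70, 30)
high = share_read_correctly(70, 40)
print(f"{low:.0f} to {high:.0f} percent of all people")
```

On those assumptions the tool is reading roughly 21 to 28 percent of the whole population correctly, which is why he treats it as one input alongside surveys and polling rather than a full picture.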
● What the heck do accuracy percentages mean anyway? From the audience came questions about how to interpret these claims better. If I understand Claire Cardie of Cornell University, it’s about precision, recall and F measure for document sentiment analytics systems. If sentiment analytics technology gets it right on positive, negative or neutral for 3 of 4 documents, its precision is .75. And that can change based on the number of documents it tries to label – for instance, if it leaves out the one it mistook, its precision goes up, so a system can reach a high precision level by labeling only what it is very certain about. Recall is the number of correct answers divided by the number of documents that have to be labeled – no skipping allowed – so the recall in this case is 3 out of 4. The F measure is an average of precision and recall (the harmonic mean) that gives higher scores when the two results are closer together. And how well this all works depends on what is being analyzed, anyway (news is harder to analyze than movie or product reviews, for instance, and shorter documents harder than longer ones). So, to take us back to where we began with all this, a 75 to 85 percent F measure would be considered good in research, “because the comparison is not to a system getting 100 percent precision or recall but compared to the alternative of a person making that decision on each document.”
● SAP and SAS have the reach to be well-positioned to help close the loop internally between what you can do with sentiment analytics around internal structured and external unstructured data. (The point was made offline by a gentleman from a smaller startup.) For instance, Bernard Chung, Director of CRM Product Marketing at SAP Labs, LLC, showed how sentiment analytics can be used to enhance enterprise business processes.
The vision is to marry the traditional enterprise customer data an organization may have with the social media through which customers express their feelings, to gain a holistic picture of the customer, and then to extend traditional business processes to incorporate social media into the customer interaction strategy and optimize the customer experience. He showed SAP’s Twitter service solution for service managers: using the Twitter API, service managers could pull in relevant tweets, supported by SAP’s Business Objects text analytics solution, filter them by sentiment scores, create a service ticket from the enterprise environment for customers having problems, and also send a Twitter direct message telling the customer that SAP is aware of the issue and will be addressing it, including the customer service ticket number.
● Lots of attention, obviously, on sentiment analysis in tweets – not surprising, given how open Twitter is with its data and how easy that data is to collect vs. blogs and news articles, which have to be scraped and stripped. The audience got to see a young Stanford University student’s Twitter Sentiment solution (in an interesting if somewhat inexperienced showing) and a quick, more polished look at Tweetsentiments.com, an NLP- and machine-learning-based solution available in a full international edition that can calculate sentiment in real time by country, as well as by particular user or topic, and comes with an API for integrating into your own application.
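For readers who want to check the precision, recall and F measure arithmetic from the accuracy discussion above, here is a minimal sketch in Python. It uses the four-document example from the session; the function and variable names are my own, and the F measure is computed as the standard balanced harmonic mean, which I am assuming is the variant Cardie meant.

```python
def precision(correct: int, labeled: int) -> float:
    """Correct labels divided by the documents the system chose to label."""
    return correct / labeled if labeled else 0.0


def recall(correct: int, total: int) -> float:
    """Correct labels divided by all documents that needed a label."""
    return correct / total if total else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (balanced F measure)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


TOTAL_DOCS = 4

# Case 1: the system labels all four documents and gets three right.
p_all, r_all = precision(3, 4), recall(3, TOTAL_DOCS)

# Case 2: the system skips the one document it would have gotten wrong.
p_skip, r_skip = precision(3, 3), recall(3, TOTAL_DOCS)

for name, p, r in [("labels all", p_all, r_all), ("skips one", p_skip, r_skip)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} F={f_measure(p, r):.2f}")
```

Skipping the doubtful document lifts precision from .75 to 1.0, but recall stays at .75, so the F measure only moves from .75 to about .86. That is the sense in which a vendor’s precision figure alone can flatter a system.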