Michael Wu, Ph.D. is Lithium's Principal Scientist of Analytics, digging into the complex dynamics of social interaction and group behavior in online communities and social networks.
Michael was voted a 2010 Influential Leader by CRM Magazine for his work on predictive social analytics and its application to Social CRM. He's a regular blogger on the Lithosphere's Building Community blog and previously wrote in the Analytic Science blog. You can follow him on Twitter or Google+.
Hello and welcome back. It’s been 3 weeks since I last blogged. I’ve been very busy with the other half of my life (i.e. the engineer/architect side)--busy developing a framework for social analytics and doing some architecture work for that project. And I’m still working on it, so this will be a short piece.
The Engagement Side of My Work
As the end of the year approaches, I’m reflecting on my blogging and the more external-facing side of my work (the side of my life that most of you know). Come to think of it, although I’d written a few blog posts earlier, I never had my own blog until about two and a half years ago. My first official blog article (Introducing Analytic Science at Lithium) was published on May 4th, 2009. Although I don’t think of myself as very prolific, I have produced a modest 87 articles + 376 comments, and received 467 kudos (see the statistics in my profile) over the past 2.5 years.
Last time I wrapped up Chapter 1 on Gamification. So 3 chapters have now been compiled:
The computation: Compute the edge weight based on the communication frequency between each pair of tweeters, and the eigenvector centrality for each tweeter based on the communication graph
The result: see figure below
This visualization is intuitive and beautiful (and yes, as a data scientist, I think patterns in data are very beautiful). I mapped the communication frequency to the color and width of the edges: Thicker red edges represent higher frequency of mentions and retweets. I mapped the size of the avatar to the eigenvector centrality (which is a measure of authority in a network much like Google’s PageRank): The bigger the avatar, the more authoritative the tweeter with respect to the webcast (#LithCast). Pretty intuitive, right?
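For readers who want to try the computation themselves, here is a minimal sketch in Python. It is not how the graph above was produced; it only illustrates the two steps: counting communication frequency per pair of tweeters, then computing eigenvector centrality by power iteration. The tweeter names, the sample interactions, and the function names edge_weights and eigenvector_centrality are all made up for illustration, and I am assuming the graph is treated as undirected.

```python
from collections import Counter
from math import sqrt

def edge_weights(interactions):
    """Count mentions/retweets between each unordered pair of tweeters."""
    weights = Counter()
    for source, target in interactions:
        if source != target:  # ignore self-mentions
            weights[frozenset((source, target))] += 1
    return weights

def eigenvector_centrality(weights, iterations=200, tol=1e-9):
    """Power iteration on the weighted, undirected communication graph.

    Each step adds a node's own score (i.e. iterates with A + I), which
    keeps the same leading eigenvector but avoids oscillation on
    bipartite graphs such as a star.
    """
    nodes = sorted({n for pair in weights for n in pair})
    if not nodes:
        return {}
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = dict(score)
        for pair, w in weights.items():
            a, b = tuple(pair)
            new[a] += w * score[b]
            new[b] += w * score[a]
        norm = sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score

# Hypothetical sample: (who tweeted, who was mentioned or retweeted)
interactions = [
    ("alice", "bob"), ("alice", "bob"), ("alice", "bob"),
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
]

weights = edge_weights(interactions)
centrality = eigenvector_centrality(weights)

# Most authoritative tweeters first
for name in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{name}: {centrality[name]:.3f}")
```

In a visualization, the values in weights would drive the edge width and color, and the values in centrality would drive the avatar size.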
Now, when interpreting this graph, it’s important to put the data in context. Does a bigger avatar mean more influential? Yes, but only with respect to this particular webcast. Can we say anything about the big avatar’s influence on other topics? For example, can we conclude that Paul Greenberg is an influencer in social CRM? The answer is no, not based on the available data. Although we know Paul is an influencer in social CRM, we cannot draw that conclusion from these data. If the raw data are all the information we have, then we simply don’t know if Paul is influential on social CRM, because we did not explicitly collect any Twitter data on social CRM. The only reason we know Paul is a social CRM influencer is that we have other information beyond the raw data I collected for this analysis.
One must be very careful when drawing conclusions from data. They’d better be based on just the data! Alright, I know someone will probably ask what tool(s) I used to produce this graph.... It was produced with NodeXL, created by Marc Smith, the Chief Social Scientist of Connected Action Consulting Group and an old friend of mine.
OK, I told you this would be a short piece. Stay tuned for more data analysis posts. And let me know if you like the idea of putting more emphasis on data and analysis for 2012. And if there is a topic that you’d like me to cover, feel free to let me know as well. Discussions are always welcome here. See you next time.