Enterprise on the Surface
"Exploring interactions of organizations, individuals and ideas on the outer edge of the enterprise."
Displaying articles for: March 2009
We talk a lot here about the importance of finding and engaging your super users to drive community success. But it's always nice to see it in action on one of our customers' communities - like our friends at Verizon, who just today issued a press release about the success of their Lithium-powered online Verizon Community Forum in enhancing their overall customer experience, as well as providing an excellent resource to help Verizon improve their product offerings. From the release:
"According to Mark Studness, director of e-commerce at Verizon, the Community Forums have been well-received since rolling out last July, generating more than 10 million page views.
'The Community Forums have spurred interaction among customers because people today expect to be able to find answers to their technical questions online,' said Studness. 'The feedback we've already received shows that our customers value the personalized peer-to-peer advice and feedback they receive from fellow users.'"
Aside from Mr. Studness' super cool name, 10 million pageviews for a new community in 8 months and growing is an excellent metric. And if you read the full release (also up on the Lithium site), there is an excellent profile of one of their community's super users, Justin, and what keeps him coming back. But I thought I'd take it one step further on this blog to see what kind of content people are getting with all those pageviews:
- 19127 posts in the community since its launch. That's about 80 posts a day for 8 months of consistent posting - in fact, today's rates of content generation are probably much higher as the community has grown.
- 1309 posts were from the top two members on the Kudos leaderboard for the community: Justin (522 posts, named in the press release) and TimSykes (787 posts) combined for an amazing 7% of the total posts in the community. It's even more amazing when you realize that TimSykes didn't even register until October.
- 318 posts marked as accepted solutions. This is one of my favorite numbers: a clear indication of the value of the content being created on the community, and also a conservative metric, because the number of accepted solutions is typically an order of magnitude lower than the number of answered questions in the community.
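For readers who like to check the arithmetic, here is a minimal sketch that recomputes the ratios from the figures quoted above. The only assumption is that "8 months" is approximated as 240 days; the variable names are illustrative.

```python
# Figures quoted in the list above.
total_posts = 19127
justin_posts = 522
timsykes_posts = 787
accepted_solutions = 318

# Assumption: "8 months" is approximated as 8 * 30 = 240 days.
days_live = 8 * 30

posts_per_day = total_posts / days_live
top_two_posts = justin_posts + timsykes_posts
top_two_share = top_two_posts / total_posts
accepted_share = accepted_solutions / total_posts

print(f"posts per day: {posts_per_day:.1f}")
print(f"top two members: {top_two_posts} posts, {top_two_share:.1%} of all posts")
print(f"accepted solutions: {accepted_share:.1%} of all posts")
```

The rounded results line up with the "about 80 posts a day" and "7%" figures in the text.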
By finding and engaging those all-important super-users, Verizon's community is flourishing, which helps to explain why Verizon is seeing significant operational efficiency and cost savings as a result of deflected calls and a growing knowledge base of peer content.
Do you have a super user story you'd like to share? Let me know!
photo by inju
We've had a lot of math on my blog recently, so I thought I'd take a break and talk about some of the more touchy-feely aspects of community today.
Do you love your customers? What makes you think they don't love you?
Ever since the Cluetrain, a lot has been said about the new power people have to be heard with social media. However, it seems that most companies believe customers will use this power to do them evil rather than good. After all, the #1 concern I hear from customers considering building a community is some version of "how do we keep people from saying bad things about us on our site"?
I find it equally odd that the common retort I hear is a riff on "Well, they're going to say bad things about you anyway, so why not let them say it where you can see them?"
Why are we so convinced that our customers hate us? Is this what all those customer surveys, Net Promoter scores and market research have told us over the years? Are there hundreds or thousands of people who have been just chomping at the bit for us to open our doors so they can yell at us? Then why in the world is anybody actually buying our products, much less buying them again and again?
I think this crisis of organizational self-confidence needs a quick dose of Stuart Smalley: "I'm good enough, I'm smart enough, and doggone it, people like me!"
I'm treating the issue kind of lightly here, but there does seem to be a lot of irrationality in how companies perceive online conversations with customers. Perhaps it stems from the venerable old adages that 'no news is good news', and 'you only hear from people when things go wrong'. We deal with so many fires and issues in our daily lives, isolated from customers inside our corporate brand, that we think that is all there is. But unless you are a monopoly or fascist state, the reason you are still in business is that customers generally think they get good value for your products and services. When companies actually do invite their customers to tell them what they think, they are often pleasantly surprised by the quality of the responses they receive.
I'm not saying people won't complain about your products and services in your blogs or forums, or that online attacks on your brand don't happen. But I am saying that they happen a lot less than companies expect, and careful preparation in advance will both prevent the worst and enable you to respond quickly to address issues before they become crises (for a quick primer in preparing for negativity in public see this excerpt from Andy Sernovitz's book, Word of Mouth Marketing: How Smart Companies Get People Talking).
Think for a moment of the brands you use in your own life. What would you like to say to them if they asked? What would you tell your peers about them?
Now think of the brands you despise. Is a part of your anger their unwillingness to listen respectfully to your needs? If they actually did pay attention, would that soften your opinion?
Photo by aussiegall
Welcome back once more to Michael Wu, here for the penultimate installment in his series describing how the new Community Health Index was developed:
This is my fourth blog post in the series that describes the development of the community health index. Previous blog posts can be found here:
- From the Brain to Community Analytics
- Criteria for Creating the Community Health Index
- Crunching Numbers for the Community Health Index
Last time, I crunched some numbers and talked about some of the mathematical challenges that I have overcome. Now, it is time to interpret the results.
Running the regression analysis is the easy part. Although it is fairly technical to set up the nonlinear regression equation, it is mechanical in the sense that anyone with a background in math and statistics can do it. The remaining part of the analysis involves interpreting the results to derive meaning and insights. This is often the most challenging aspect of any statistical analysis because it is more an art than a science; yet it must have all the rigor, objectivity and accuracy of science. For example, I would have to decide which predictor variable to remove among those with similar predictive power. When a set of variables is found not to be predictive, is it a failure of the model to harness their predictive power, or is it the case that these variables are truly independent of the response, in this case health? Interpretability of the final model becomes important, and looking at numbers alone is no longer sufficient. In statistics this process is called variable selection.
After eliminating the predictor variables that are not consistently predictive of health, we have only answered the question of which variables are predictive. But we still don't know how these variables are predicting health. For example, suppose we know that post count is predictive of health; will the health level increase by 10% if the post count is increased by 10%? Or will the health level increase by 30% if we observe a 10% increase in post count? Or perhaps the health level depends more strongly on post count initially, but becomes less dependent as the post count increases. To answer these questions, we must analyze the nonlinear relationship between the variables that we decide to keep. Not to complicate things, but it is often necessary to repeat the process of variable selection and nonlinear analysis for different subsets of variables and different nonlinearities, and to perform them in different orders.
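One way to make the "10% more posts, how much more health?" question concrete is to fit a straight line in log-log space, where the slope can be read as an elasticity. The sketch below is only an illustration under stated assumptions: the data are hypothetical (health is made to follow a square-root law in post count), the helper `fit_line` is written for this example, and nothing here is claimed to be the model actually used for the Community Health Index.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: health follows a square-root law in post count.
post_counts = [100, 200, 400, 800, 1600]
health = [2.0 * p ** 0.5 for p in post_counts]

# In log-log space a power law becomes a straight line; its slope is the
# elasticity (percent change in health per percent change in posts).
_, elasticity = fit_line(
    [math.log(p) for p in post_counts],
    [math.log(h) for h in health],
)

# Effect on health of a 10% rise in post count under the fitted power law.
health_gain = 1.10 ** elasticity - 1

print(f"elasticity: {elasticity:.2f}")
print(f"10% more posts -> {health_gain:.1%} more health")
```

An elasticity of 1 would mean health rises in step with posts, a value above 1 would mean it rises faster, and a value below 1 (as in this made-up data) corresponds to the diminishing dependence described above.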
We are almost done! Next week we'll bring this all together into the new Community Health Index! If you have any questions I'd be more than happy to address them in the comments, or feel free to ask me on Twitter at mich8elwu.
Photo by Thomas Claveirole
Welcome back Michael Wu! Here is his third installment in a series describing how the new Community Health Index was developed:
To begin the analysis of the previously collected data set, I gathered the non-metric data from various sources by talking to the moderators, the customer success managers (CSMs), and our best practice advocates, which included Joe Cothrel and his team. As I mentioned earlier, these data are extremely important because they serve as the ground truth for our prediction problem. It is through the eyes of the moderators and the CSMs who monitor and interact with the community every day that we know how healthy a community is. Tabulating these non-metric data gives us a time series of the health level for each community. Since all the recorded metrics are already in the form of time series, now we can turn to statistics and begin the number crunching.
The idea is very simple. We know the health level of the community from the non-metric data; now we simply want to know which of the 20 metrics that are commonly available can best predict community health. This can be achieved by running a sequence of linear and nonlinear regression analyses using the 20 metrics as the predictor variables and the tabulated non-metric data as the response variable.
This, however, is not trivial. Some of the issues that must be dealt with include the correlation among the predictor variables, the nonlinearity between the predictors and the response, and the nonstationarity of the time series data.
That's quite a mouthful, so here is a bit of explanation about what I mean by that:
The problem of correlations among the predictors is known as multicollinearity. If some of the predictor variables are highly correlated, it is very difficult to determine which predictor actually causes the response. Computationally, this shows up as large regression coefficients that may jump randomly between the correlated predictors. These jumps are highly sensitive to the data, making it difficult to determine which of the correlated predictors is most predictive. This is a very prominent problem in community data, as many of the metrics are highly correlated. For example, if a community has a lot of traffic, it tends to gain more members and achieve a higher level of activity. I have used partial least squares and boosting to try to overcome this problem.
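A quick way to see how severe multicollinearity can be is to compute the correlation between two predictors and the resulting variance inflation factor (VIF). The sketch below is a standard diagnostic, not the partial least squares or boosting approach mentioned above, and the traffic and member numbers are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weekly metrics: members grow almost in lockstep with traffic.
traffic = [100, 150, 200, 250, 300, 350]
members = [52, 74, 101, 126, 149, 176]

r = pearson_r(traffic, members)

# With exactly two predictors, the variance inflation factor is 1 / (1 - r^2).
# A common rule of thumb treats values above roughly 10 as a warning sign.
vif = 1.0 / (1.0 - r ** 2)

print(f"correlation: {r:.4f}")
print(f"variance inflation factor: {vif:.0f}")
```

A very large VIF means the data can barely distinguish the two predictors, which is why their regression coefficients become unstable.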
Nonlinearity means that the predictors and the response may not be related in a linear fashion. That means a fixed change in a predictor doesn't always lead to the same change in the response. It also depends on the history of the predictor as well as the interactions with other predictors. There is no out-of-the-box solution for nonlinearity. I just have to try some nonlinearities, plot the data, look at them, reformulate the model, and see which one fits and predicts best.
Finally, nonstationarity means that the system's behavior, in this case the community's, depends on the absolute time. This makes prediction of any time series data very difficult. In layman's terms, it means that any statistical pattern that we have learned may change from one time to another (this is what is meant by dependence on absolute time). In other words, knowing the history does not predict the future. For example, if we want to accurately predict the stock market price, any pattern we learn from the history had better continue in the future. If there is a trend (or seasonality) in the history, the exact same trend (or seasonality) should persist in order for us to predict the future. If the trend changes in the future, then following the historical trend will lead to a wrong prediction. This is a very prevalent problem in communities, because communities are constantly changing due to management decisions, product launches, marketing efforts, etc. There is also no way to predict a completely nonstationary system, as seen by the fact that no one can predict the stock market. We can only make some assumption about how nonstationary our system is, proceed, and hope for the best. To deal with this problem, researchers typically assume one of several weaker forms of stationarity, and I have assumed wide-sense stationarity in the analysis of our community data.
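The effect of a changing trend can be shown with a toy forecast. In this sketch, a straight line is fitted to a hypothetical history of weekly post counts and extrapolated into weeks where the trend has reversed. The numbers and the helper `fit_line` are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical weekly post counts: steady growth for six weeks...
history = [10, 12, 14, 16, 18, 20]
# ...then the trend reverses (say, after a product or policy change).
actual_future = [19, 17, 15]

weeks = list(range(len(history)))
intercept, slope = fit_line(weeks, history)

# Extrapolate the historical trend into the following weeks.
future_weeks = range(len(history), len(history) + len(actual_future))
forecast = [intercept + slope * w for w in future_weeks]

errors = [abs(f - a) for f, a in zip(forecast, actual_future)]
mean_abs_error = sum(errors) / len(errors)

print("forecast:", forecast)
print("actual:  ", actual_future)
print(f"mean absolute error: {mean_abs_error:.1f}")
```

The line fits the history perfectly, yet the forecast error grows every week once the underlying pattern changes.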
That is a lot to digest! If you have any questions I'd be more than happy to address them in the comments, or feel free to ask me on Twitter at mich8elwu.
Next time: Interpreting the results!
Photo by lrargerich
Note: edited to correct a typo I added to Michael's post by mistake.
Michael Wu joins us again for the second installment describing how the new Community Health Index was developed:
I wrote previously about how I came to start the development of the Community Health Index (CHI), through my background in the science of the brain and through Lithium's extensive data set of online communities. Picking up the task, I will start by defining what it means when we talk about community health.
The performance of any enterprise community has two dimensions:
- meeting the needs of members (customers), and
- meeting the needs of the business (enterprise).
Community health addresses the first dimension, and it measures how well the community meets the needs of its members. It is very important, because without customer satisfaction, there is no business success.
With this understanding of community health, I set two basic criteria to narrow down the data we must plow through. Otherwise, the most complete picture of community health would be a composite of all the data about the community. First, because it is our objective to make the community health index universal, we must use basic data that every community has. This eliminated many of the metrics that only Lithium keeps, bringing the number down to about 20 (I actually analyzed more than 20, but only about 20 are universally available). Among these are the usual metrics plus some less common ones such as percent of unanswered threads, average thread depth, average number of unique participants in a thread, average post length, etc. Although these metrics might not be recorded explicitly by every community platform, they can be easily computed by aggregating and summarizing the records of all the messages and user data that every community must have.
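As a rough sketch of how such metrics could be derived from raw message records, the example below aggregates a small, hypothetical list of messages. The record layout and the metric definitions (a thread is "unanswered" if it has a single post, and "depth" is the number of posts in a thread) are assumptions for illustration, not Lithium's definitions.

```python
from collections import defaultdict

# Hypothetical message records: (thread_id, author, text).
messages = [
    ("t1", "ann", "How do I reset my router?"),
    ("t1", "bob", "Hold the reset button for ten seconds."),
    ("t1", "ann", "Thanks, that worked."),
    ("t2", "cat", "Is there an outage today?"),
    ("t3", "dan", "Billing question"),
    ("t3", "bob", "Check the FAQ."),
]

# Group posts by thread.
threads = defaultdict(list)
for thread_id, author, text in messages:
    threads[thread_id].append((author, text))

thread_count = len(threads)

# Assumed definitions: "unanswered" = a single post, "depth" = posts per thread.
unanswered = sum(1 for posts in threads.values() if len(posts) == 1)
percent_unanswered = 100.0 * unanswered / thread_count
avg_thread_depth = len(messages) / thread_count
avg_unique_participants = sum(
    len({author for author, _ in posts}) for posts in threads.values()
) / thread_count
avg_post_length = sum(len(text) for _, _, text in messages) / len(messages)

print(f"percent unanswered threads: {percent_unanswered:.1f}")
print(f"average thread depth: {avg_thread_depth:.2f}")
print(f"average unique participants: {avg_unique_participants:.2f}")
print(f"average post length (characters): {avg_post_length:.1f}")
```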
After establishing the initial data set, the second criterion we applied is known as Occam's razor. The goal is to come up with a minimum set of data that gives the greatest predictive power. This is a challenging problem in statistics, known as the bias-variance tradeoff. In plain English, it means that there is a tradeoff between the complexity of the model and the predictive power of the model. Although complex models that use many variables will always have greater explanatory power for the available data, their predictive power for unseen future data degrades. On the other hand, simpler models with few variables may not explain the current data as well, but they are more predictive of future trends. Why is that? That is just the nature of uncertainty and how it works, much like why gravity always attracts.
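The tradeoff can be illustrated with a toy comparison between a simple model (a straight line) and a maximally complex one (a polynomial that passes through every training point). The data are hypothetical: the true relation is y = x, observed with alternating noise of plus or minus 0.5. This is a sketch of the general idea only, not the models used in the CHI analysis.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes through every (xs, ys) point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total

def mse(predictions, actuals):
    """Mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

# Hypothetical training data: true relation y = x with alternating noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.5, 0.5, 2.5, 2.5, 4.5, 4.5]

# One unseen "future" point that follows the true relation.
test_x, test_y = 6, 6.0

intercept, slope = fit_line(train_x, train_y)
simple_train = mse([intercept + slope * x for x in train_x], train_y)
simple_test = (intercept + slope * test_x - test_y) ** 2

complex_train = mse(
    [lagrange_predict(train_x, train_y, x) for x in train_x], train_y
)
complex_test = (lagrange_predict(train_x, train_y, test_x) - test_y) ** 2

print(f"simple model:  train error {simple_train:.3f}, test error {simple_test:.3f}")
print(f"complex model: train error {complex_train:.3f}, test error {complex_test:.3f}")
```

The complex model explains the training data perfectly because it has also memorized the noise, and that noise is exactly what throws its prediction off on the unseen point.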
Next time we'll start the journey through the Lithium community data set. And I'll turn the number crunching crank to identify areas with the greatest predictive power!
For updates and discussion between Michael's posts, leave your comments here or you can follow Michael on Twitter at mich8elwu.
Photo by xmatt
If you've looked into online communities in any way, chances are you've heard of 90-9-1, also called the 1% Rule of Participation Inequality. What it describes is that about 90% of visitors will rarely contribute content to online communities at all, 9% will post infrequently and a small proportion of members, the 1%, will tend to post the majority of all content in the community. At Lithium, we call that 1% the Super Users of your community.
But so what? Why does this matter so much that nearly everyone in the social media world feels compelled to talk about it?
Some folks seem to regard this as a challenge or opportunity - if we can just figure out the right magical formula, they say, we can unlock all that potential activity from those 90%ers to make our communities successful. Like turning lead into gold, but perhaps just as hard to do.
Others try to use 90-9-1 as a benchmark or average by which to measure their success, and spend a lot of time and effort raising their 'scores' one or two percentage points closer to the mark or above it, even though studies have indicated that this ratio tends to vary by both scale and modality.
And finally there are those who seem to take it as an excuse to avoid online communities altogether, and perhaps marginalize them as the fringe that only represents the minority view. This view forgets or purposely ignores the other 90-99% who are paying attention to what's going on.
There is still a lot of work to be done to determine why 90-9-1 seems to occur over and over again and whether it can be influenced or altered in any way. But until that day, there are some ways this knowledge can actually help us to build more healthy and effective communities. Here are three things 90-9-1 means to you:
- If you want to increase quantity of activity in your community, it’s more effective to increase the total population who visit your site than to try to get current members to participate more (not that you shouldn't do both, but the former will typically be more effective than the latter).
- If you want to increase the overall quality of activity in your community, it is generally more effective to focus your efforts on those 1% who contribute the most.
- If you want to find out what the total reach is of your community, be sure to count the 90% or so who are spectators as well as the 10% who are posting.
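To see how these proportions play out in numbers, here is a small sketch using a hypothetical community of 1,000 visitors. The post counts per visitor are invented for illustration and are not drawn from any real community.

```python
# Hypothetical community of 1,000 visitors (numbers are illustrative only):
# 900 never post, 90 post twice each, and 10 post fifty times each.
posts_per_visitor = [0] * 900 + [2] * 90 + [50] * 10

visitors = len(posts_per_visitor)
total_posts = sum(posts_per_visitor)

lurkers = sum(1 for p in posts_per_visitor if p == 0)
posters = visitors - lurkers

# Share of all content written by the most active 1% of visitors.
top_count = max(1, visitors // 100)
top_posts = sum(sorted(posts_per_visitor, reverse=True)[:top_count])
top_share = top_posts / total_posts

print(f"total reach: {visitors} visitors ({lurkers} spectators, {posters} posters)")
print(f"top 1% wrote {top_share:.1%} of all posts")
```

Counting only the posters in this example would understate the community's reach by a factor of ten, which is the point of the third item above.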
Are you worrying about 90-9-1? Or are you using it to your advantage?
Another treat for you today: Michael Wu, resident scientist and chief number wrangler behind the Community Health Index, has agreed to drop by and tell the story about how this new open standard was developed. Enjoy part one of this special peek behind the scenes!
For the past six months, I have been engaged in a massive data analysis project at Lithium to develop an index that measures the health of online communities. I refer to this index as the community health index (CHI), which I like to denote with the Greek letter Χ. This project began shortly after I joined Lithium, having just received my Ph.D. in Biophysics at UC Berkeley. Although it was a dramatic transition from academia to industry, I thought that analyzing community data shouldn't be that difficult. After all, data are just numbers and the math and statistics required to gain insight from them are just equations and symbols, which are universal across all disciplines. I was in for quite a surprise.
I was a brain scientist during my academic years, and I focused on an esoteric area called computational visual neuroscience. Basically, that just means that I use a lot of math, statistics, and techniques in physics to model, study and ultimately understand how the brain processes visual information. Coming from this background, I see an obvious connection between a community and the brain: they are both complex networked dynamical systems.
- The brain is made up of approximately 100 billion neurons talking to each other through a language of their own (action potentials, which are impulses much like Morse code).
- Each neuron also networks with other neurons and forms connections that create local cliques of friends and buddies.
- The interactivity between the neurons is what makes the brain (viewed as a community of neurons) work. Without this interactivity the brain will wither and die of atrophy.
Although there are many more interesting analogies between the brain and a community, now that you see the connection, it is time for the surprise. To my astonishment, Lithium actually has a huge data set spanning the 10 years of its SaaS business operation. This is compounded by the fact that Lithium keeps about 240 different metrics that monitor every moving part of the community, and the metric list is growing as new features are being added. Moreover, there are copious non-metric data. These include moderator log files, notes from customer engagement, and annotations of PR or any event related to the customer. To my surprise, it turned out that these non-metric data accumulated over the years through active community management, moderation and customer engagement are most valuable and informative for the development of the community health index.
In later posts I'll describe my journey through this large and complex data set. But today I'd like to hear from you - what do you most want to know about the Community Health Index? What next steps would you like to see?
Photo by jepoirrier
Updated to fix the CHI symbol (Χ) display.
