About a year ago I wrote about the idea of using split-testing to refine the user experience in online games (it’s not really suitable for stand-alone retail games). Today I caught a brief article on Redline China talking about how the Chinese online game giant Zhengtu is using a very simple version of split-testing for its newest online game - Juren. (For those who aren’t familiar, Zhengtu’s biggest game - Zhengtu Online - is, I believe, the #2 MMO in China, behind Fantasy Westward Journey. Don’t feel too bad if you’re a Westerner and haven’t heard of them.)
Zhengtu is running two versions of Juren during its beta test, and will launch the game with the more popular one, along with selected favorite features from the other version. It’s the most basic, single-pass form of split-testing possible, but it’s an interesting step in the right direction. What I’d love to see is a multi-step process, since split-testing works best when you can fairly quickly pick a ‘winner’ and a ‘loser’ between two options, then put the winner up against another version, rinse, and repeat.
Figuring out how to do multi-stage split-testing in games feels like something of a holy grail for user acquisition. It’s hard, because it requires real, live users and because altering a game to set up new cases for split-testing is a lot more work than doing so with ad copy (where split-testing originated). But think of the potential for tuning your newbie experience when you can, in a matter of a day, objectively measure which experience is stickier, then take the winner and pit it against a new variation, and so on. It’s actually close to ideally suited for text MUDs, given their relatively low content creation costs, except that text MUD populations are not big enough to split in half without major social costs (though one could possibly separate off the newbie experience and then route people into “the game” once they’re done with the newbie stage).
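To make the "winner stays on" loop concrete, here is a minimal Python sketch. Everything in it is hypothetical: the function name, the variant names, and especially `run_round`, which stands in for whatever live test you would actually run to get a stick rate for each of two newbie experiences. It only shows the shape of the iteration, not how any real game does it.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical callback: runs one live split-test between two variants and
# returns (stick_rate_of_first, stick_rate_of_second), each in [0, 1].
RoundRunner = Callable[[str, str], Tuple[float, float]]


def run_champion_challenger(
    baseline: str,
    challengers: Iterable[str],
    run_round: RoundRunner,
) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Pit the current winner against each new variant in turn.

    Returns the final winner and a history of
    (champion, challenger, round_winner) tuples.
    """
    champion = baseline
    history: List[Tuple[str, str, str]] = []
    for challenger in challengers:
        champion_rate, challenger_rate = run_round(champion, challenger)
        # The challenger has to beat the champion outright; ties keep the
        # incumbent, so we never switch without evidence of improvement.
        round_winner = challenger if challenger_rate > champion_rate else champion
        history.append((champion, challenger, round_winner))
        champion = round_winner
    return champion, history


if __name__ == "__main__":
    # Made-up stick rates, standing in for real measurements.
    made_up_rates = {"tutorial_v1": 0.30, "tutorial_v2": 0.36, "tutorial_v3": 0.33}

    def fake_round(a: str, b: str) -> Tuple[float, float]:
        return made_up_rates[a], made_up_rates[b]

    winner, rounds = run_champion_challenger(
        "tutorial_v1", ["tutorial_v2", "tutorial_v3"], fake_round
    )
    print(winner, rounds)
```

Each round here is a fresh head-to-head test rather than a comparison against an old number, which is the point: the champion and challenger are measured on the same crop of new players.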
6 comments
October 10th, 2007 at 7:05 am
Pingback from Plaguelands » Blog Archive » A new take on MMOG Development
October 9th, 2007 at 3:18 pm
wowpanda
There is a chance that the smaller half might be pissed that the stuff they are familiar with is gone in the official launch, and go off to create private servers.
And sometimes the most popular version might not be the best one. For example, I started playing WOW because my friends from college were playing it, and I didn’t compare any other games with it.
Another thing is that not everything is 50/50. There might be too many choices to split-test (2^N combinations for N features), so they might end up throwing out some of the good stuff along with the discarded version, and adding it back upon user feedback.
October 9th, 2007 at 3:23 pm
Matt
Yep, those are all potential problems with just doing one pass at it. Split-testing is most effective when you can do it over and over and over.
October 9th, 2007 at 9:51 pm
BobSugar
On the project I’m working on, we’ve been doing a lot of user-interface split-testing, to determine the most accessible and fun controls for the game. I have noticed one very dangerous problem with split-testing (and any form of focus-testing, for that matter): it accords an unreasonably high weight to the initial experience.
We’ve been trying to use your example of quickly picking a ‘winner,’ and then quickly iterating to the next split-test, but this heads pretty rapidly towards an easily accessible, very shallow gameplay experience. Choices which aren’t easily understandable to players are not only often ignored, but are sometimes disliked initially.
I’ve been playing a lot of Halo 3 lately, and when I first started, I couldn’t see any perceptible strategic difference between the assault rifle, the submachine gun, and the spiker. In fact, if presented with all 3 in a focus-test, I probably would have commented that they were unnecessarily overlapping, and one or two should be cut. I think that’s a particularly dangerous sort of example, because not only are your testers not going to grasp the deeper gameplay features initially, but they might even dislike them (or prefer a shallower version you also present), giving you poor feedback.
The only way to avoid this is to increase the time spent under each split-test, which unfortunately fights the quick iteration ideal.
October 9th, 2007 at 9:55 pm
Matt
Yep, the reason that I specifically mentioned split-testing for a newbie experience in an online game is because you have a very measurable behavior: How long a player engages for. You don’t need to (and shouldn’t) ask them what they like because you’re looking for objective data. Does a change result in a higher stick rate for newbies? Then it’s almost certainly the right choice.
When it comes to the game experience as a whole, I agree it’d be very hard to do.
–matt
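As a rough illustration of the "does a change result in a higher stick rate" question above, here is a small Python sketch. It assumes each variant yields two counts: how many newbies saw it and how many stuck around past some cutoff you choose. It then applies a standard two-proportion z-test so that a small difference on a small sample isn't mistaken for a win. The function names and the numbers in the example are made up for illustration.

```python
import math


def stick_rate(retained: int, total: int) -> float:
    """Fraction of new players who were still playing at the cutoff."""
    if total <= 0:
        raise ValueError("total must be positive")
    return retained / total


def two_proportion_z(
    retained_a: int, total_a: int, retained_b: int, total_b: int
) -> float:
    """z statistic for stick rate of B minus stick rate of A (pooled variance)."""
    p_a = stick_rate(retained_a, total_a)
    p_b = stick_rate(retained_b, total_b)
    pooled = (retained_a + retained_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        # Everyone stayed or everyone left in both groups: nothing to compare.
        raise ValueError("no variation in the data")
    return (p_b - p_a) / std_err


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Made-up example: 300 of 1000 newbies stuck with A, 360 of 1000 with B.
    z = two_proportion_z(300, 1000, 360, 1000)
    print(f"z = {z:.2f}, p = {two_sided_p_value(z):.4f}")
```

The normal approximation needs reasonably large groups on both sides, which is one more reason this works better for big games than for a small MUD.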
October 9th, 2007 at 10:08 pm
BobSugar
Yeah - we’re finding almost the exact same thing. Split-testing and focus-testing are extremely useful for streamlining the initial experience (and we’re still using them for that) - but they range from useless to actively harmful when applied to anything further down the gameplay learning-curve.