The Data Story

Focus. Carefully move the atom veil to the targeted area... and release! You got a new high score! And you just contributed to research in quantum physics! But how? Have you ever wondered exactly what data ends up in the ScienceAtHome database?

Data collection is a huge part of any game's development, and even more so for games in the citizen science genre. To better understand the challenges of managing data, we sat down with part of our developer team: Lars, Kristian, Anders and Birk. Birk works on improving the data infrastructure and development workflows at ScienceAtHome. Kristian and Anders are our main Unity Developers and Lars is ScienceAtHome's Development Manager. Together, these guys work tirelessly to deliver both a great game experience to our players and data to our scientists.

From left to right: Kristian, Anders, Lars and Birk.


Before we get into the details, why is data collection so important to our project?

Birk: From the science point of view, the data we collect from our players plays a crucial role in coming up with new solutions.

What many scientific projects have in common is that they try to verify or reject a hypothesis. In some cases, a hypothesis can be proven using chalk on a blackboard. In others, we use modern computer simulations and repeated experiments, which can often help us scientists support existing ideas. However, running the same setup in various ways is often not enough to develop novel ideas. That is why we turn to the community for a helping hand: we have seen, over and over again, that letting everyone interact with our systems can provide valuable new insights.

Anders: Citizen science games require much more data than regular games. Most regular games are created with the purpose of bringing joy and entertainment to people (as well as the seemingly unavoidable ads); citizen science games, on the other hand, have a different agenda: crowdsourcing scientific research. For that reason, citizen science games like Zooniverse, FoldIt and Quantum Moves rely heavily on data collection.

What is more, data collection is a huge part of game development too. Telemetry is a common term in the game development industry used to describe the collection of user information, which can be further analyzed to find patterns in players' behavior. These patterns can reveal anything from errors occurring in the game, to challenges and levels proving too difficult to complete, to players playing the game in unintended ways the developers had not foreseen.


So, what KIND of data are we collecting?

Kristian: When you work with scientists, data collection evolves from optimizing player experiences and expanding interesting gameplay to highly detailed replications of player behavior. When we ask our in-house data-munchers what to record, we have come to expect a growling "EVERYTHING" as the answer. Everything is a lot.

Anders: In fact, you could say we are tracking two kinds of data: data necessary for our game development and data for citizen science.

For the first part, we use a tool called Unity Analytics, which is provided within the game engine we use (Unity3D). The information we gather from Unity Analytics is completely anonymous; it essentially collects and presents statistics on the kinds of devices and operating systems our players use. This gives us an idea of how far we can push the visual quality and animations of the game and what limitations we might have to expect.

Next, we add to the game what are termed event calls. These are messages the game sends back whenever a certain event has been triggered, for example, when a level has been completed. This allows us to estimate how far the players have made it in the game. For instance, if we see a sudden drop in completions at a particular level, it is a sign that the level is too difficult to complete and therefore needs to be reworked.
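To make the drop-off idea concrete, here is a minimal sketch in Python of how level-completed events could be turned into a completion funnel. It is an illustration only, not the team's actual pipeline: the event shape (a player identifier paired with a level number) and the function names `completion_funnel` and `steepest_drop` are assumptions made for this example.

```python
def completion_funnel(events):
    """Count the distinct players who completed each level.

    `events` is an iterable of (player_id, level) pairs; this shape is a
    hypothetical stand-in for whatever the real event calls carry.
    """
    players_per_level = {}
    for player_id, level in events:
        players_per_level.setdefault(level, set()).add(player_id)
    return {level: len(players)
            for level, players in sorted(players_per_level.items())}


def steepest_drop(funnel):
    """Return (level, loss) for the level that loses the largest share of
    the players who completed the level before it."""
    levels = list(funnel)
    worst_level, worst_loss = None, 0.0
    for prev, cur in zip(levels, levels[1:]):
        if funnel[prev] == 0:
            continue  # nobody completed the previous level; nothing to compare
        loss = 1 - funnel[cur] / funnel[prev]
        if loss > worst_loss:
            worst_level, worst_loss = cur, loss
    return worst_level, worst_loss


if __name__ == "__main__":
    events = [("a", 1), ("b", 1), ("c", 1), ("d", 1),
              ("a", 2), ("b", 2), ("c", 2),
              ("a", 3)]
    funnel = completion_funnel(events)
    print(funnel)
    print(steepest_drop(funnel))
```

In this made-up data, four players finish level 1, three finish level 2 and only one finishes level 3, so level 3 would be flagged as the candidate for rework.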

Using event calls for buttons helps us understand the player's flow through the game and can reveal confusion with the interface, for example, if a player spends too much time in particular menus or switches back and forth between them.

The last part of the regular game information is the score. Whenever a solution is delivered within the game, the fidelity, the quality of the solution and the time used to complete it are calculated to determine the score.

Now, the citizen science part of the game consists of constant tracking of the player's movement during play time. The researchers are interested in recreating the exact movements the players perform whenever they deliver a near-perfect solution. Therefore, every 0.02 seconds the position of the player's finger is tracked along with a time stamp.
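A sample every 0.02 seconds works out to 50 samples per second, so a ten-second attempt produces roughly 500 timestamped positions. The sketch below, in Python, shows one way such a path could be recorded and later replayed by interpolating between samples. Only the 0.02-second interval comes from the text; the (timestamp, x, y) sample layout, the function names and the use of linear interpolation are assumptions made for illustration.

```python
SAMPLE_INTERVAL = 0.02  # seconds between samples, i.e. 50 samples per second


def record_path(position_at, duration):
    """Sample `position_at(t) -> (x, y)` every SAMPLE_INTERVAL seconds.

    Returns a list of (timestamp, x, y) tuples covering 0..duration.
    `position_at` stands in for reading the finger position from the device.
    """
    steps = int(round(duration / SAMPLE_INTERVAL))
    samples = []
    for i in range(steps + 1):
        t = i * SAMPLE_INTERVAL
        x, y = position_at(t)
        samples.append((round(t, 2), x, y))
    return samples


def position_at_time(samples, t):
    """Replay helper: linearly interpolate the recorded position at time t."""
    if t <= samples[0][0]:
        return samples[0][1:]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return samples[-1][1:]  # past the end: hold the last recorded position


if __name__ == "__main__":
    # A made-up movement: the finger slides diagonally for one second.
    path = record_path(lambda t: (t, 2 * t), duration=1.0)
    print(len(path), "samples")
    print(position_at_time(path, 0.25))
```

Storing the timestamp with every sample, rather than relying on the sample index alone, means a replay stays correct even if a frame is dropped and the spacing between two samples is not exactly 0.02 seconds.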

Another important set of data concerns players' learning patterns, which gives us a better understanding of people's intuitive learning. For instance, a player failing a Quantum Moves level several times before reaching the acceptable threshold is very interesting for our scientists, and for that reason we encourage our users to log in when playing our games. Only by linking individual users to their solutions do we see a learning pattern emerge.
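As a rough illustration of why the login matters, the Python sketch below counts how many attempts each player needed before first reaching a threshold. Without a player identifier attached to each attempt, this grouping would not be possible. The attempt format, the function name `attempts_until_threshold` and the threshold value used in the example are all hypothetical.

```python
def attempts_until_threshold(attempts, threshold):
    """Count attempts per player up to and including the first success.

    `attempts` is a chronologically ordered iterable of (player_id, score)
    pairs. Players who never reach `threshold` are left out of the result.
    """
    tries = {}    # attempts seen so far, for players who have not yet succeeded
    reached = {}  # player_id -> attempts needed to first reach the threshold
    for player_id, score in attempts:
        if player_id in reached:
            continue  # only the first success counts for the learning curve
        tries[player_id] = tries.get(player_id, 0) + 1
        if score >= threshold:
            reached[player_id] = tries[player_id]
    return reached


if __name__ == "__main__":
    log = [("ann", 0.40), ("bob", 0.95), ("ann", 0.70),
           ("ann", 0.92), ("cat", 0.30), ("bob", 0.99)]
    print(attempts_until_threshold(log, threshold=0.9))
```

In this invented log, one player succeeds on the first try, another needs three attempts, and a third never reaches the threshold and is therefore absent from the result.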

At the end of the day, all of our data is collected in the pursuit of two things: improving the quality and user experience of our games, and pushing the boundaries of science. This requires not only the creativity and hard work of the developers and scientists but also the immense contribution of our players.

Now that we have a rough understanding of the "what" and the "why", it is time to invite our readers to explore the "how". Stay tuned for our next blog post, which will reveal the path of your generated data from your mobile phone to our lab and more!

