Joining the dots of online data

Everything we do online leaves a trace, contributing to a personal data footprint beyond our control. I developed a tool to help people explore the risks of sharing personal information online, supported by the EPSRC.

Examples of unintended outcomes from sharing personal data online 

Context

Some of our online traces are intentional - things like blogs, comments, photos, or likes. Many more are created unwittingly - from metadata to location histories, financial details, private addresses, family connections, and political views. Proprietary algorithms mean we do not know how our data will be interpreted. Changing social contexts mean that once-benign online statements can become problematic.

Our personal data footprint grows every moment and remains beyond our control and our understanding. When this creates problems, the consequences can be far-reaching.

Project overview

  • Designed an online activity to help people explore the risks around sharing personal data online
  • Facilitated workshops with participants from a wide range of ages and backgrounds
  • Thematically coded workshop transcripts
  • Delivered presentations, updates, and next-step planning for stakeholders
  • Published papers on human-data interaction at CHI Yokohama and contributed to papers written by project colleagues

My role

I joined this project after the underlying research was already completed. It revealed that people felt uneasy about how their data might be (mis)used, but they struggled to understand how innocuous online behaviour could translate to real-world risks. This was problematic because we felt that we were articulating a significant risk that people did not recognise as such.

I created an online game to help people better understand and explore these risks.

The game invited participants to explore a single day in the data footprint of a fictional character called Alex Smith. Each data pack contained several sources for participants to explore and critique.

A day in the life of Alex Smith - a persona-based game

There were obvious ethical issues with asking participants to share their own online histories. Instead, I created a fictional character named “Alex Smith” and imagined Alex’s online footprint over a single day.

Participants explored ‘packs’ of data from different sources - beginning with Alex’s own social media posts, then choosing between other sources including location tracking services, Alex’s friends and family, and biometric data.

As participants accumulated more knowledge about Alex, seemingly innocuous data sources could be combined to reveal much more than Alex intended - including their home address, political associations, finances, health, and relationships.

Participants cross-referenced data sources to find personal safety risks such as Alex’s home address and holiday plans

Adding ambiguity into the game

I tried to make each data source ambiguous to encourage participants to go beyond a surface-level interpretation and instead read into Alex’s online persona. In doing so, they reflected on their own approach to online information sharing, gave examples of problems they had encountered in the past, and discussed coping strategies to better manage online risks.
Discussing Alex’s online behaviour - here, a tweet suggesting they felt burnt out on their first day of a new job - helped participants to reflect on how they manage their own data footprints  

Insights

Participant feedback was excellent. They found the game very engaging and felt much more informed about online risks after playing. Each session was scheduled for 1 hour but typically ran far longer than this, with participants asking to see every data source and even returning to continue the game after the session. This may have been as much a result of social distancing as of the depth of the game itself.

I coded these lengthy workshop transcripts and shared them with the rest of the project team. There were several significant themes.

Participants were adept at identifying overt risks to personal safety and employer reputations.

They also identified risks I hadn’t considered, and spoke in depth about the nature of online trust and intrusion. The degree of ambiguity and personal interpretation in the game seemed to enable them to make and defend conclusions about Alex as a person, while recognising that they had only partial information.

Awareness of potential risks didn’t always translate into taking action to mitigate them.

Where it was possible to make their data more private, most participants took steps to do so while recognising that it was, at best, a partial solution. The panopticon of algorithmic surveillance was seen as the price of participating on social platforms and could only be avoided by opting out entirely.

There was a pervasive fear of context collapse.

This happens when an unintended audience interprets our actions in ways we didn’t expect. This was closely associated with changing contexts over time. For example, an employer seeing an old Facebook photo of a drunken night out was seen as a more immediate and consequential risk than their data being misused in a more abstract sense.

Some participants took steps to confuse their data footprint.

These included alt accounts, self-censorship, and fake information. These approaches were more common among participants who worked in public-facing and safeguarding jobs, where the negative consequences of context collapse are elevated. Younger participants strongly preferred platforms where content was deliberately ephemeral rather than persistent (while recognising that their data still existed “somewhere”).

Impact

Ultimately, there was a consensus among participants that the digital synopticon of peer surveillance (as discussed by Sacha Molitorisz) was of greater concern than algorithmic data harvesting. This was an unexpected outcome and extremely useful for the direction of the project.

Before starting this research, we expected that people wanted some kind of tool that allowed them to scrub their data footprint entirely. This research showed that people didn’t necessarily want to opt out entirely in this way, but instead wanted to be better informed about risks and to have more agile tools at their disposal. A digital privacy companion, for example, could assess personal, reputational, and social risks and offer ways to mitigate them before any data is shared. This approach could help people to make more nuanced decisions about what to share and in which contexts to do so.

Limitations

Although I was pleased with the process and outcome of this project, on reflection there were several ways to improve the exercise. Further value might be gained by:

  1. Including multiple voices and more content. As I designed the game, it inevitably reflected my own views towards social media and online information sharing. Having multiple authors could mitigate this and widen the scope of the game to address more data-sharing scenarios.
  2. Allowing people to play the game however they want. More insights could be gained if the game was unmoderated, multi-player, or a real-world rather than online exercise.
  3. Spending more time considering future risks, such as biometric data leaks. Although I included these to a small extent, the scope of the project required greater focus on immediate risks. There is clearly scope for a speculative design project to help us better understand the potential issues around control and ownership of our biometric data.

Links

  1. Everyday digital traces (contributor), Big Data & Society
  2. Data, socially distanced: cumulative data disclosure and data narratives during Covid (contributor), BILETA, Newcastle upon Tyne
  3. Four Speculative Design F(r)ictions: Designing for Personal Data Awareness, CHI Yokohama
  4. Where is the Human in HDI?, CHI Yokohama