How to develop a data project without panicking / losing your mind
Tips for companies and organizations eager to work on their data
Data tells us new things about ourselves. Sometimes they are witty, funny facts, but they still say a lot. For example, this interactive reveals which neighborhoods of Buenos Aires are pretending to be others. Other data projects are more serious and institutional: they come from governments themselves, which take responsibility for opening their data so that citizens can keep close track of their work. This is the case of the Government Commitments project launched by the City of Buenos Aires at the beginning of 2016 and, yes, produced by us at Sociopúblico.
Whether you are an NGO, an international organization, a company, a government or just someone with access to an interesting amount of data, you can start a data project. Okay, so how do you start? We’ll share four lessons we learned working with data so you don’t panic at the first attempt.
1. “Let go”. On how to build a dataset
As Marie Kondo says, order is magical. Letting go of what is not needed in a database gives us a clear, ordered dataset and saves us time when we try out different tools to visualize the data. It also reduces the margin for inexplicable errors later on.
- First things first: data alone is not a dataset. The first task is to turn it into one by unifying the file type. For example, transfer all the data into an Excel spreadsheet.
- But pay attention! Not every Excel file is a dataset. A dataset only has data: it does not look cute, and most likely it cannot be understood at first glance (it is made to be read by an application or program, not by humans). In a dataset there are no colors, no borders, no merged cells.
- It is very important to separate the information, and split any data that is not necessarily related. For example, instead of having two tables in the same Excel file, it is better to create a new file or a new sheet for each table.
- Now, this is between you and the table. Take out any extra metadata or information that is not directly related to the table. If possible, even remove titles and sources; you can save them on another sheet.
- Last tip: keep it simple. Review everything again and check whether any column can be split into two separate data columns. For example, if the same column includes both the sex and the age of a person, it is best to separate that data into two columns so that each can be processed and cross-referenced independently.
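The last tip can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names, the `sex_age` column and its "F 34" layout are made up for the example, and a real file would need its own parsing rule.

```python
import csv
import io

# Hypothetical raw export: one column mixes sex and age ("F 34").
RAW = """name,sex_age
Ana,F 34
Luis,M 41
"""

def split_sex_age(rows):
    """Return new rows where 'sex_age' is replaced by separate 'sex' and 'age' columns."""
    cleaned = []
    for row in rows:
        # Assumes the two values are separated by whitespace.
        sex, age = row["sex_age"].split()
        cleaned.append({"name": row["name"], "sex": sex, "age": int(age)})
    return cleaned

rows = list(csv.DictReader(io.StringIO(RAW)))
dataset = split_sex_age(rows)
```

Once the values sit in their own columns, a tool can filter by sex or average the ages without having to parse text first.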
2. Go beyond the numbers
Some data is quite complex to understand, but there is always a way to make data interesting and understandable for someone who doesn’t know anything about an issue.
One way is to tell stories born within the data.
How do you find these stories? It’s useful to talk with someone who doesn’t know anything about the subject but is interested: tell them what we researched, what we know, what the data is about. This mix of the informed perspective and the naive but interested one usually brings up findings and insights that can easily go unnoticed by someone who is too familiar with the data.
For each goal of the Government Commitments project, we paired the progress figures with a text that explains what the commitment is about, why it is important and what will be done to achieve it. In addition, we included videos, animated gifs and infographics to add context, so that you can understand what matters most about each commitment at a glance.
3. Many tools available to visualize data
A few years back, only specialized professionals could use the tools needed to create a nice visualization, but now new and easy-to-use tools appear every day. Today it is possible to produce a dataviz piece quickly and easily. Infogr.am or the free versions of Tableau and Highcharts are great allies for creating our own visualizations. In most cases, it’s just a matter of uploading one of the datasets we prepared and choosing the best way to display it. While this has lots of advantages, it does come with some disadvantages.
4. Open data
Opening the data is, without a doubt, the part that generates the most resistance and conflict, although it’s also the most rewarding one.
Open data means publicly offering the datasets we generate and hold so that anyone can download them, analyze them and use them in any way they want. For governments, this is increasingly an obligation. Exposing the details of government management increases its transparency before citizens: it enables them to verify whether what is said matches what is done and, thus, reduces the probability of corruption. But you cannot open data just any way. There are standards that data must meet to be considered truly open. For example, uploading information as a PDF, a format that is almost an image and is hard to copy and paste from, is not the same as uploading that same information sorted in a table and in an open file format (.csv or .xml) that is friendly to data processors. In the case of the Government Commitments of the City of Buenos Aires, each graphic is accompanied by its dataset, which can be downloaded as a .csv file.
For the rest of us mortals, individuals, companies or organizations, opening data is not an obligation, but it can be a very smart way to relate to the world and put the data to work.
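To make the open-format point concrete, here is a minimal sketch of exporting a small table as .csv with Python's standard `csv` module. The commitment names and percentages are invented for illustration and are not figures from the Government Commitments project; the `to_csv` helper is hypothetical as well.

```python
import csv
import io

# Made-up rows, for illustration only.
progress = [
    {"commitment": "Example commitment A", "year": 2016, "progress_pct": 40},
    {"commitment": "Example commitment B", "year": 2016, "progress_pct": 75},
]

def to_csv(rows):
    """Serialize a list of dicts as CSV text: one header row, then one line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(progress)
```

Saving `csv_text` to a file gives a plain table that spreadsheets and data tools can open directly, which is what makes it friendlier than the same numbers locked inside a PDF.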
By Ana Soffietto. Illustrated by Marcelo Morán.
Sociopúblico is a full-service strategy and communications agency for complex ideas. We work with teams all over the world from Buenos Aires, Montevideo and Washington DC.