So You Want to Be a Data Scientist?

Nov 01, 2015

Okay, I admit it: I read Secret. One post that literally made me laugh out loud recently said, "A data scientist is a data analyst who lives in Silicon Valley." (The others are unprintable; sorry.)

That comment gets to the heart of the confusion surrounding big data and data science. And it is a strong signal of a disruptive trend: it makes us uncomfortable, and it makes us question whether we're focused on the right things.

A similar dynamic led Jeremiah Owyang to write "The Career Path of the Corporate Social Strategist" back in November of 2010. The thesis of that report was that social media was changing business to the point where the strategist had two choices: either become a leader and overcome organizational inertia, or risk becoming a social media "help desk" for the company.

This same dynamic exists today in the field of data science. The proliferation of high-volume, high-velocity and heterogeneous data sets is creating unprecedented strain in organizations: to process the data, to interpret it, and, of course, to act on it. But the lack of standards, immature technologies, and disparate data types and sources make this much more challenging than anything that's come before. You either have to establish leadership, or risk living the rest of your life running gazillions of urgent but ultimately meaningless ad-hoc reports.

Social data is one of the great unsung heroes (or villains) of this drama: it's a mix of structured, unstructured and semi-structured data. It resists, and in some languages defies, sentiment analysis. It can take the form of social actions (likes, shares, +1s), comments, images, sound files or video. It can have the lifespan of a mosquito or it can live more or less forever. It's coming at you in real time and is outside your control.

All of this creates huge challenges for those in--or aspiring to be in--the field of data science. And it also creates huge challenges for organizations: if you have multiple versions of the truth, how do you know what to believe?

So, in the spirit of moving this conversation forward, here are some suggestions I've gleaned from my conversations with analysts, data scientists, marketers, strategists, engineers, executives and others working with big data today:

1. Know what you're solving for (but don't let it rule you). Understand your internal client's objective. It's obvious, but hard to do. Do they want to see revenue impact? Bottom-line impact? Reputation impact? Unknown patterns? How are they evaluated? Asking those questions makes it easier for you both to find hidden insights in the data, and to set expectations about what you can and can't find.

2. Think broadly and be curious. It's critical to have an expansive view of the business and a holistic view of your organization's data, even if you can't possibly integrate it all. Are you a web analyst? Ask what's happening with social data. A social data analyst? How does it compare to market research data? Triangulation helps uncover previously unseen patterns.

3. Have a bias for action. Counting in the absence of analysis is a sign you don't understand the problem well enough (see #1, above). Always, always strive to pass the "so what?" test. We increased views by five percent. So what? Does that drive revenue? Sharing behavior? Advertising dollars? Did it enable us to reach new audiences? If you can't pass the "so what?" test, you're counting, not analyzing.

4. Remember that you too are biased (oh yes you are). We learn from experiences and we apply them to our next experiences. Do everything you can to disprove your findings. What assumptions formed your hypothesis? Are they provable? Did sales go up because viewers loved your fabulous content? Or do viewers who are already loyal tend to buy more anyway? The NPR Code Switch blog is a fantastic resource to broaden your thinking. And for heaven's sake, be transparent. Disclose data sources, sample sizes, methodology and demographics to the extent you can.

5. Tell a story. Not everyone speaks data, and the more data-centric we become, the more important it is that the left- and right-brained learn to speak a similar language. Left-brainers, tell the story of how you came to your conclusions. Right-brainers, I highly recommend some education in statistics to help you become more fluent. Everyone, use visualization to help illustrate your data. And, in doing so, remember point #3.

Of course, I'm focusing on mindset issues here; there are plenty of tactical recommendations for what communities to join and tools to learn. But before you escape back into your comfort zone, consider this: to be successful in this brave new industry, you have to define it--before it defines you.
