So You Want to Be a Data Scientist?

DataScience
Nov 01, 2015
Okay, I admit it: I read Secret. One post that literally made me laugh out loud recently said, "A data scientist is a data analyst who lives in Silicon Valley." (The others are unprintable; sorry).

That comment gets to the heart of the confusion surrounding big data and data science. And it is a strong signal of a disruptive trend: it makes us uncomfortable, and it makes us question whether we're focused on the right things.

A similar dynamic led Jeremiah Owyang to write "The Career Path of the Corporate Social Strategist" back in November of 2010. The thesis of that report was that social media was changing business to the point where the strategist had two choices: either become a leader and overcome organizational inertia, or risk becoming a social media "help desk" for the company.

This same dynamic exists today in the field of data science. The proliferation of high-volume, high-velocity and heterogeneous data sets is creating unprecedented strain in organizations: to process the data, to interpret it, and, of course, to act on it. But the lack of standards, immature technologies, and disparate data types and sources make this much more challenging than anything that's come before. You either have to establish leadership, or risk living the rest of your life running gazillions of urgent but ultimately meaningless ad-hoc reports.

Social data is one of the great unsung heroes (or villains) of this drama: it's a mix of structured, unstructured and semi-structured data. It resists, and in some languages defies, sentiment analysis. It can take the form of social actions (likes, shares, +1s, comments) or media (images, sound files or video). It can have the lifespan of a mosquito or it can live more or less forever. It's coming at you in real time and is outside your control.

All of this creates huge challenges for those in--or aspiring to be in--the field of data science. And it also creates huge challenges for organizations: if you have multiple versions of the truth, how do you know what to believe?

So, in the spirit of moving this conversation forward, here are some suggestions I've gleaned from my conversations with analysts, data scientists, marketers, strategists, engineers, executives and others working with big data today:

1. Know what you're solving for (but don't let it rule you). Understand your internal client's objective. It's obvious, but hard to do. Do they want to see revenue impact? Bottom-line impact? Reputation impact? Unknown patterns? How are they evaluated? Asking those questions makes it easier for you both to find hidden insights in the data, and to set expectations about what you can find and what you can't.

2. Think broadly and be curious. It's critical to have an expansive view of the business and a holistic view of your organization's data, even if you can't possibly integrate it all. Are you a web analyst? Ask what's happening with social data. A social data analyst? How does it compare to market research data? Triangulation helps uncover previously unseen patterns.

3. Have a bias for action. Counting in the absence of analysis is a sign you don't understand the problem well enough (see #1, above). Always, always strive to pass the "so what?" test. We increased views by five percent. So what? Does that drive revenue? Sharing behavior? Advertising dollars? Did it enable us to reach new audiences? If you can't pass the "so what?" test, you're counting, not analyzing.

4. Remember that you too are biased (oh yes you are). We learn from experiences and we apply them to our next experiences. Do everything you can to disprove your findings. What assumptions formed your hypothesis? Are they provable? Did sales go up because viewers loved your fabulous content? Or do viewers who are already loyal tend to buy more anyway? The NPR Code Switch blog is a fantastic resource to broaden your thinking. And for heaven's sake, be transparent. Disclose data sources, sample sizes, methodology and demographics to the extent you can.

5. Tell a story. Not everyone speaks data, and the more data-centric we become, the more important it is that the left- and right-brained learn to speak a similar language. Left-brainers, tell the story of how you came to your conclusions. Right-brainers, I highly recommend some education in statistics to help you become more fluent. Everyone, use visualization to help illustrate your data. And, in doing so, remember point #3.

Of course, I'm focusing on mindset issues here; there are plenty of tactical recommendations for what communities to join and tools to learn. But before you escape back into your comfort zone, consider this: to be successful in this brave new industry, you have to define it--before it defines you.
