With all the investments made in analytics, it’s time to stop buying into partial solutions that overpromise and underdeliver. It’s time to invest in answers. Only Teradata leverages all of the data, all of the time, so that customers can analyze anything, deploy anywhere, and deliver the analytics that matter most to them. And we do it at scale, on-premises, in the cloud, or anywhere in between.
We call this Pervasive Data Intelligence. It’s the answer to the complexity, cost, and inadequacy of today’s analytics. And it’s how Teradata uses the power of data to transform how businesses work and people live around the world. Join us and help create the era of Pervasive Data Intelligence.
As a Data Scientist, you will perform analysis and be responsible for the implementation and support of large-scale data and analytics solutions for our clients. You will work in a team whose data science efforts range from exploration and investigation to the design and development of entire analytical systems.
You will demonstrate technical leadership in extracting meaning from large-scale, unstructured data and in working with engineering teams to integrate with underlying systems, whether provisioned by clients or fielded by Teradata personnel.
Additional responsibilities include providing big data solutions for our clients, including analytical consulting, statistical modelling, and other quantitative solutions. You will mentor sophisticated organizations on large-scale data and analytics and work closely with client teams to deliver meaningful results. You will help translate business cases into clear research projects, be they exploratory or confirmatory, to help our clients use data to drive their businesses. In this role you will also collaborate and communicate extensively across geographically distributed teams and with external clients.
- Academic coursework in mathematics, statistics, machine learning and data mining
- Proficiency in at least one of R, Python, MATLAB, or SAS (or a comparable mathematical/statistical software package)
- At minimum, complementary experience with Java and Python
- Adept at learning and applying new technologies
- Excellent verbal and written communication skills
- Strong team player capable of working in a demanding start-up environment
Preferred Knowledge, Skills and Abilities:
- Experience in at least two of the following fields: web analytics, social network analysis, advanced time series modelling, natural language processing, optimization, signal processing
- Development of deep learning solutions using frameworks such as TensorFlow and Keras
- Well versed in applying, interpreting, and communicating linear and non-linear models, as well as statistical hypothesis tests, non-parametric statistical methods, ensemble learners, and unsupervised machine learning algorithms for clustering and dimensionality reduction
- Modelling outcomes using kernels, nearest neighbours, LASSO, or equivalent techniques
- Bagging, boosting, and stacking models to generate meta-models
- Core programming, text file manipulation, and statistics with NumPy, pandas, and scikit-learn, or their R equivalents
- Leveraging Apache Spark for data preparation, data transformation, and the development of machine learning models (Spark ML or MLlib)
- Familiarity with the data science offerings of major cloud platform providers like AWS, GCP, and Azure
- Exporting, importing, aggregating, and filtering data in conjunction with relational databases using e.g. SQL, Hive, Pig
- Cleaning, manipulating, and formatting data stored in flat files or obtained by interacting with RESTful APIs
- Writing jobs to read, filter, manipulate, and aggregate data stored in Hadoop with one of the predominant APIs: Spark, Java MapReduce, or Hadoop/Spark Streaming
- Generating data profiles and visualizations including measures of central tendency, measures of deviation, and correlations in R, Python or other "non-big-data" technologies
- Generating data profiles and visualizations including measures of central tendency, measures of deviation, and correlations over Hadoop & Spark or other big-data technologies
- Designing, developing, and implementing dashboards and reports using R Shiny, IPython/Jupyter notebooks, or Zeppelin
- Experience writing industry-grade software applications, including the use of version control systems, agile development processes, test automation, and CI/CD pipelines
- Ability to estimate the time needed to complete assigned tasks and to deliver within that estimate
- Familiarity with containerization and orchestration solutions, e.g. Docker and Kubernetes
- Ability to work efficiently on the (Linux) command line, including the use of pipes, remote terminals, and common DevOps tooling
- Ability to write technical reports for projects and non-technical documents that communicate solutions and findings in an engaging and precise manner
- Skilled at delivering presentations during client meetings, conferences, or sales events, explaining data science solutions and strategies to technical and non-technical audiences
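For illustration only, the ensembling item above (bagging, boosting, and stacking models to generate meta-models) could be sketched with scikit-learn as follows; the dataset and estimator choices here are hypothetical, not a prescribed approach:

```python
# Hypothetical sketch: stacking heterogeneous base learners into a
# meta-model with scikit-learn. Dataset and models are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data stands in for a real client dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two base learners (itself a bagging ensemble, plus nearest neighbours)
# feed their predictions to a logistic regression meta-model.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
```

The meta-model learns how to weight the base learners' predictions, which typically performs at least as well as the best single base model.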