What Does a Data Scientist Do?
Today, the demand for data scientists is growing faster than ever before. Harvard Business Review has declared data scientist the “sexiest job of the 21st century”, and data scientists are in the Top 20 Fastest Growing Occupations in the US according to the Bureau of Labor Statistics, with 31% projected growth over the next 10 years. With Artificial Intelligence (AI) and Machine Learning constantly in the press, society’s need for data science seems to be growing exponentially. But what is data science, and what do data scientists do day to day? Below, we explore the skills and tasks of a data scientist to demystify these analytical artists.
What is a data scientist?
In the simplest terms, a data scientist practices the art of data science. They solve complex data problems with their expertise in mathematics, statistics, computer science, programming, and other scientific disciplines.
Working with both structured and unstructured data, data scientists organize, analyze, package, and present data in order to find patterns and extract meaning that may improve efficiency, inform decision making, and increase profitability.
Data scientists use Artificial Intelligence (AI) and Machine Learning to draw inferences from data. Machine Learning uses statistical methods to enable machines to learn without being explicitly programmed, while AI more broadly involves making machines imitate functions of the human brain. You may be familiar with personal assistants like Alexa and Siri, household staples like Netflix and Nest thermostats, and applications we now cannot live without such as airline route planning, facial recognition, genetics, and automatic spam filtering. All of these systems involve some type of AI and/or Machine Learning. Behind each brilliant system are teams of data scientists dedicated to continually improving the efficiency of their product.
Data scientists act as data detectives, working within time and budget constraints to uncover the What, How, Who, and Why of the data available to them. This means data scientists often rely not only on science and reasoning but also on intuition when classifying and evaluating large amounts of data. Add the creativity used in data storytelling, and you can see why many consider the field not only a science but an art as well.
What skills do data scientists use?
Data scientists work closely with teams and business stakeholders to understand how to best use data to achieve business goals and objectives. They design models, create algorithms, and analyze past and current data to predict future outcomes, inform decision making, and increase profitability.
Regardless of the industry, projects for data scientists typically follow the same general outline.
- Beginning with a Discovery Process, data scientists work with business stakeholders to determine project goals and objectives by asking specific, targeted questions and formulating initial hypotheses to test.
- After acquiring data, a Data Preparation phase helps data scientists pre-process, condition, and clean data prior to modeling.
- Next, Model Planning helps data scientists determine the potential models and algorithms to use and what techniques to apply, such as statistical modeling, Machine Learning, or Artificial Intelligence (AI), to draw relationships between variables and help make sense of the data set.
- Then, data scientists move on to Model Building, where they develop datasets for testing, and to Operationalizing, where they measure and improve results and provide a clear picture of performance. It is worth noting that although much thought is put into data preparation and model building, a majority of a data scientist’s time will be spent actually analyzing the data.
- Finally, data scientists will Communicate Results to stakeholders using critical data storytelling skills. They will then take feedback, make needed adjustments, and return to the Discovery Process as needed.
Taking a closer look, here are some of the most commonly used data science skills:
Data wrangling. Also known as data cleaning or “munging”, this process of gathering, selecting, and transforming data helps identify imperfections such as missing values or inconsistent formatting. This data mapping process will help prepare data for later analysis.
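To make data wrangling concrete, here is a minimal sketch in Python that uses only the standard library. The records, field names, and accepted date formats are all made up for illustration, and filling missing ages with the median is just one possible choice among many.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records with typical imperfections:
# inconsistent name formatting, mixed date formats, and a missing age.
raw_records = [
    {"name": "  alice SMITH ", "signup": "2023-01-15", "age": "34"},
    {"name": "Bob Jones", "signup": "15/02/2023", "age": ""},
    {"name": "carol  lee", "signup": "2023-03-01", "age": "29"},
]

# Assumed set of date formats that appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format; return an ISO date string, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(records):
    """Normalize names and dates, and fill missing ages with the median age."""
    known_ages = [int(r["age"]) for r in records if r["age"].strip()]
    fallback_age = median(known_ages) if known_ages else None
    cleaned = []
    for r in records:
        cleaned.append({
            # Collapse repeated whitespace and use consistent capitalization.
            "name": " ".join(r["name"].split()).title(),
            "signup": parse_date(r["signup"]),
            "age": int(r["age"]) if r["age"].strip() else fallback_age,
        })
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

In practice, many data scientists would reach for a dedicated library for this kind of work, but the underlying steps (spot the imperfections, pick a rule, apply it consistently) are the same.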
Statistics. Statistical analysis involves identifying patterns and anomalies within the data. Through tests, distributions, maximum likelihood estimators, and other methods, data scientists determine which approaches will work, often in conjunction with machine learning. Statistical models help inform stakeholder decision making.
Machine Learning. Data scientists use machine learning algorithms and statistical models that learn from “training data” in order to make predictions and decisions without being explicitly programmed to do so.
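As a small illustration of the statistics involved, the sketch below computes maximum likelihood estimates for a normal distribution and uses them to flag an anomaly. The sample values are invented, and the cutoff of 2 standard deviations is an assumption chosen for the example, not a universal rule.

```python
import math

# Hypothetical sample: daily support-ticket counts, with one unusual day.
observations = [52, 48, 50, 47, 53, 49, 51, 90]


def normal_mle(data):
    """Maximum likelihood estimates of a normal distribution's mean and std.

    Note that the MLE of the variance divides by n, not n - 1.
    """
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    return mean, math.sqrt(variance)


def find_anomalies(data, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean, std = normal_mle(data)
    return [x for x in data if abs(x - mean) / std > threshold]


mu, sigma = normal_mle(observations)
anomalies = find_anomalies(observations)
print(f"mean={mu:.2f}, std={sigma:.2f}, anomalies={anomalies}")
```

Here the one unusual day stands out from the rest, which is the kind of pattern-versus-anomaly question a data scientist would then investigate further.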
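To show what learning from training data can look like, here is a minimal sketch of a k-nearest-neighbors classifier, one simple machine learning technique among many. The features, labels, and the choice of k = 3 are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training data:
# (hours active per week, support tickets filed) -> customer outcome
training_data = [
    ((1.0, 5.0), "churned"),
    ((2.0, 4.0), "churned"),
    ((1.5, 6.0), "churned"),
    ((8.0, 1.0), "retained"),
    ((9.0, 0.0), "retained"),
    ((7.5, 2.0), "retained"),
]


def predict(features, examples, k=3):
    """Classify `features` by majority vote among the k nearest training examples."""
    by_distance = sorted(examples, key=lambda ex: math.dist(features, ex[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(predict((8.5, 1.0), training_data))  # prints "retained"
print(predict((1.2, 5.5), training_data))  # prints "churned"
```

Notice that no rule such as "more than 4 tickets means churn" is written anywhere in the code. The prediction comes entirely from the labeled examples.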
Multivariable calculus and linear algebra. Multivariable calculus is used in machine learning to formulate the functions that train algorithms toward their objective, while linear algebra helps explain how those algorithms work under the hood. Relevant concepts include derivatives and gradients, step functions, cost functions, plotting functions, minimum and maximum values, scalar, vector, matrix, and tensor functions, and more. Both multivariable calculus and linear algebra can be helpful, particularly if a data scientist is working at a company where the product is defined by the data and small improvements in performance or algorithm optimization can lead to significant financial outcomes.
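Gradients and cost functions come together in gradient descent, a common way to train a model. The sketch below fits a straight line to made-up data by repeatedly stepping against the gradient of a mean squared error cost. The data, learning rate, and step count are assumptions picked so that this small example converges.

```python
# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def cost(w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Partial derivatives of the cost with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def gradient_descent(learning_rate=0.05, steps=2000):
    """Start at w = b = 0 and repeatedly step against the gradient."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


w, b = gradient_descent()
print(f"w={w:.4f}, b={b:.4f}, cost={cost(w, b):.2e}")
```

The fitted values end up very close to the slope of 2 and intercept of 1 used to generate the data. A learning rate that is too large would make the steps overshoot and diverge, which is one reason the underlying math matters.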
Programming. Data scientists write computer programs and analyze datasets to find answers to complex problems. Programming requires writing code in languages like Java, R, or Python, along with a database querying language like SQL.
Computer science. Data scientists apply principles of artificial intelligence, database systems, human-computer interaction, numerical analysis, and software engineering to their daily work in order to draw inferences.
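Here is a minimal sketch of how Python and SQL can work together, using Python's built-in sqlite3 module and an in-memory database. The `orders` table and its contents are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical `orders` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "east", 120.0),
        ("bob", "west", 80.0),
        ("carol", "east", 200.0),
        ("dave", "west", 40.0),
    ],
)

# SQL does the grouping and aggregation; Python handles the results.
query = """
    SELECT region, COUNT(*) AS n_orders, AVG(amount) AS avg_amount
    FROM orders
    GROUP BY region
    ORDER BY avg_amount DESC
"""
results = conn.execute(query).fetchall()
conn.close()

for region, n_orders, avg_amount in results:
    print(f"{region}: {n_orders} orders, average {avg_amount:.2f}")
```

Letting the database do the aggregation and keeping Python for the follow-up analysis is a common division of labor, though the right split depends on the size of the data and the tools available.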
Database Management. This involves working with a database management system, the group of programs used to edit, index, and manipulate a database. Not only will data scientists manipulate data within these databases, but they will also write rules that govern the data and work to support multi-user environments.
Data storytelling. Communicating a story through data visualization is a compelling way for data scientists to articulate key findings to technical and non-technical audiences, including business stakeholders. This may include visual formats such as bar charts, histograms, scatter plots, pie charts, heat maps, 3-D plots, word clouds, and more.
What are common data scientist jobs?
You may be wondering, what types of jobs do data scientists have? For a glimpse into the life of a data scientist, consider that the field holds a broad set of overlapping skill sets and titles, including data scientists, data analysts, data engineers, business intelligence specialists, data architects, mathematicians, statisticians, computer scientists, and more. Although a majority of data scientists already in the field currently hold Master’s degrees, the numbers of coding school graduates are changing the demographics substantially and many with data science titles may work in non-traditional data science roles as well.
Bloom Institute of Technology (formerly known as Lambda School) alumnus and data science graduate John Morrison works as an Associate Data Scientist tackling data needs for the large health insurance company Florida Blue. Data scientists like Morrison typically design data processes to create algorithms and predictive models and perform analysis.
“With data science, you use problem solving, but you also involve statistics to understand what's going on with the data. And then as a data scientist, you're never just doing something on your own, you're always with a team,” he said.
Morrison works in Florida Blue’s customer service analytics department, where he supports three specific teams. For one, Morrison tackles the nitty-gritty of cleaning data, making sense of it, and getting the team the analysis it needs to inform current and future projects. For another team, he uses data to provide reports. And for the third, Morrison works in a more client-facing role, connecting directly with clients to better understand the goals and objectives within the various departments.
BloomTech alumna and data science graduate Vera Mendes works as a Business Intelligence Analyst for ZoomInfo. Typically, data analysts manipulate large data sets and use them to identify trends, reach meaningful conclusions, and advise on strategic business decisions. In her role, Mendes helps ZoomInfo make hard decisions about where to focus its energy, attention, and revenue. By analyzing data, she forecasts important trends, helping inform stakeholders within her company about whether they should continue with programs or put more effort into specific marketing initiatives within departments.
“They send the data and ask for insights of whatever is there. They might say, I don't know if this data is going to be helpful or if it’s going to have any good information for us, but tell me what you can take out of this data. What can we do with this? So it’s more about my perspective,” she said.
Many BloomTech alumni report using the same skills at their jobs that they learned at BloomTech. In fact, our full- and part-time data science curriculum provides an intensive online learning experience with the end goal of preparing you for real-world data analysis, machine learning engineering, business analysis, digital marketing, and more.
Through BloomTech’s live, online Data Science Program, industry experts will teach you:
- Data Visualization
- Machine Learning
- Linear Algebra
- Statistics and Modeling
- Natural Language Processing
- And more!
Through these languages, frameworks, and principles, you’ll learn to analyze data, build reproducible analyses and data-powered systems, understand how to communicate and build on your insights, and produce products with other BloomTech students to add to your portfolio.
Want to learn more about becoming a data scientist or ready to start your data science career at BloomTech? Learn about our live online courses, or if you are ready to apply, start your application now.