What are the four foundations of becoming a good data scientist?

Q:

What are the four foundations of becoming a good data scientist?

A:

As many experts point out, becoming a great data scientist requires a combination of skills and experience built through dedicated study and hands-on analysis in a complex field. Data scientists, as administrators and curators of valuable data assets, are very much in demand today. Let’s look at what these foundational skills involve.

The first of the four fundamental components of data science work is mathematics and statistics. Good data scientists should be conversant with the mathematical concepts behind supervised and unsupervised machine learning (ML), including algorithm types such as decision trees, random forests, logistic regression and clustering, as well as dimensionality reduction. In general, they should be comfortable working with mathematical equations and statistics using statistical analysis tools.
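To make one of these concepts concrete, here is a minimal sketch of logistic regression fitted by gradient descent in plain Python. The function names, the learning rate, the number of epochs and the tiny "hours studied versus pass/fail" data set are all assumptions made up for illustration. In practice, a data scientist would more likely reach for an established statistics or ML library.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit a one-feature logistic regression with batch gradient descent.

    lr and epochs are illustrative defaults, not tuned values.
    Returns the learned weight and bias.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += error * x
            grad_b += error
        # Step against the average gradient of the log loss.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Classify x as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical data: hours studied and whether the student passed (1) or not (0).
HOURS = [1, 2, 3, 4, 5, 6]
PASSED = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    weight, bias = fit_logistic(HOURS, PASSED)
    print("1 hour ->", predict(weight, bias, 1))
    print("6 hours ->", predict(weight, bias, 6))
```

The point of the sketch is the math rather than the tooling: the model is a weighted sum passed through a sigmoid, and training repeatedly nudges the weight and bias in the direction that reduces prediction error.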

The second major fundamental component of data science work involves programming and database management. Individuals should be strong in scripting languages like Python and statistical languages like R, and should have experience with databases, SQL semantics and operational techniques. Knowledge of software components such as Hadoop, MapReduce, Hive and Pig is also attractive to employers.
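As a small illustration of how scripting and SQL skills fit together, the sketch below uses Python's built-in sqlite3 module to load a few rows into an in-memory database and summarize them with a GROUP BY query. The table name, column names, function name and sample figures are hypothetical, chosen only to keep the example short.

```python
import sqlite3


def average_sale_by_region(rows):
    """Load (region, amount) rows into an in-memory SQLite table and
    return a {region: average amount} dictionary computed in SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        # Parameterized inserts keep data separate from the SQL text.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, AVG(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


# Hypothetical sample data for illustration only.
SAMPLE_SALES = [("north", 100.0), ("north", 300.0), ("south", 50.0)]

if __name__ == "__main__":
    print(average_sale_by_region(SAMPLE_SALES))
```

Here the aggregation is done by the database and Python only handles loading and presenting the result, a division of labor that carries over to larger database systems.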

The third fundamental component of becoming a good data scientist is a theoretical and philosophical understanding of data science and machine learning. These individuals should be self-starting problem solvers with curious minds; after all, they are combining raw quantitative analysis with a creative understanding of machine learning and data science processes. Rather than just being technical numbers people, they should have a deep grounding in what machine learning projects and data science initiatives are meant to achieve, in terms of their end goals and end results.

A fourth major pillar of learning to be a good data scientist involves working with people and being able to present data in ways that make sense to them.

Good data scientists can be storytellers — they can translate quantitative data into narratives and insights. As such, they should have good communication skills to be able to bring their work to the table and disseminate it to multiple stakeholders or a given audience effectively. These are some of the major types of skills that build a good data scientist who is ready to participate in today’s fast-paced and quickly advancing IT industry.

Written by Justin Stoltzfus
Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.