College Grads Need These Data Science Skills
The tech world moves at a rapid pace, and even new graduates may not have learned all the skills they need for a career. We review the top skills and discuss how to get them.
Depending on your major, when you graduate from college, you may need to learn additional skills to be more marketable. And, according to LinkedIn, the top three skills that new graduates are learning in the six months following graduation are data visualization, data modeling and Python.
“In 2020, the world will generate 50 times the amount of data it did in 2011,” according to Derek Steer, CEO of Mode, a data analysis platform. Data processing power is now cheap and accessible to practically any company, and Steer says the real bottleneck is finding people with the right skills.
However, companies are expanding the definition of who should have the skills to understand and manipulate data.
“Until recently, the role of predictive analysis fell mostly to experienced, elite data scientists, while natural language processing or creating sophisticated data models was reserved for data professionals with strong engineering backgrounds,” according to Harry Glaser, president of data business at Sisense, which provides tools to help data professionals build analytic apps. “However, market pressure has forced forward-looking analysis to be a regular part of business operations.”
And Glaser says this requires more advanced skill sets. “That means big changes and new demands, which means understanding more data manipulation languages that are often used for advanced analysis, like Python and R.”
Let’s examine these skills, why they are important, and how new grads — or anyone — can learn them.
Data visualization is the top skill listed by LinkedIn, but what is it? “Data visualization is converting data into graphical representations, such as graphs and other more visually appealing formats, in order to provide an effective way of interpreting and understanding a data set,” says Roberto Reif, executive director of data science at Metis, which provides data science training programs.
For example, converting numbers from a spreadsheet into a series of bar or pie charts makes them easier to digest. “The goal of data visualization is to turn information sets into effective visual storytelling and provide insights in a way that your audience can understand,” Reif says.
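To make the spreadsheet-to-chart idea concrete, here is a minimal sketch in Python that turns a few numbers into a text bar chart. The function name text_bar_chart and the monthly sales figures are invented for illustration. In practice you would more likely use a plotting library such as matplotlib or a tool like Tableau, but the principle is the same: bar length stands in for the number.

```python
def text_bar_chart(values, width=20):
    """Return one text line per label, with bar length proportional to the value."""
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        # Scale each value so the largest one fills the full width.
        bar_length = round(value * width / largest)
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_length} {value}")
    return lines


# Hypothetical spreadsheet column: units sold per month (made-up numbers).
monthly_sales = {"Jan": 120, "Feb": 90, "Mar": 150}

for line in text_bar_chart(monthly_sales):
    print(line)
```

Even in this crude form, it is easier to see at a glance that March outsold February than it is from the raw numbers alone.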
And it’s an in-demand skill for several reasons. “Data visualization is not widely taught in school, so new grads with these skills definitely stand out from the crowd,” according to Yi Zou, who is the senior director of engineering and manages the data science product engineering teams at ASML Silicon Valley. “More importantly, good visualization of data permits better insights, leading to better decisions, especially at the exploration phase.”
And there’s yet another reason why this skill is in high demand. “Employees who are able to tell compelling stories with high-quality charts and graphs are typically more effective at clearly communicating their findings,” Zou says. (To learn more, see The Joy of Data Viz: The Data You Weren’t Looking For.)
According to LinkedIn, data modeling is the second most popular skill that recent grads are investing in learning. “Data modeling is all about understanding and using data to find relationships between varying information sets,” Reif explains.
For example, if you plan to put your home on the market and you’re trying to predict the selling price, he says you need to look at a range of data, like square footage, the number of bedrooms and bathrooms, the home’s ZIP code, the area’s crime rate, and the quality of local schools.
“Essentially, data modeling is the art of evaluating data to be able to come up with informed insights and predictions — discovering patterns and relationships,” Reif says.
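As a rough illustration of the home price example, the sketch below fits a straight line relating square footage to selling price and uses it to make a prediction. The sales figures are invented and deliberately lie on a straight line so the result is easy to check by hand, and fit_line is a hypothetical helper name. A realistic model would also draw on the other factors mentioned above, such as bedrooms, ZIP code and school quality.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up past sales: square footage and selling price in dollars.
square_feet = [1000, 1500, 2000, 2500]
sale_prices = [150_000, 200_000, 250_000, 300_000]

slope, intercept = fit_line(square_feet, sale_prices)

# Predict the price of a hypothetical 1,800-square-foot home.
predicted_price = slope * 1800 + intercept
print(f"Estimated price: ${predicted_price:,.0f}")
```

The fitted slope is the pattern the model has discovered: in this toy data set, each additional square foot adds about $100 to the price.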
It’s an in-demand skill because it can help companies to forecast and predict a variety of scenarios to make more informed strategic decisions. “For example, data modeling is used to predict customer churn — whether a company is likely to keep or lose a customer,” Reif explains.
Since it’s more expensive to win new customers than to keep existing ones, data modeling can help companies identify customers they’re at risk of losing, so they can take action.
Reif says that data modeling also helps combat transaction fraud. “For example, many credit card companies track their customers’ shopping patterns and behavior so that purchases that are suddenly out of the norm can trigger alerts, enabling the companies to immediately contact their customers to confirm the purchase or flag the card.”
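One simple way to picture an "out of the norm" check is to compare a new purchase with the customer's past spending and flag it if it falls too far from the average. The sketch below does that with Python's standard statistics module. It is an assumption-laden toy, not how any particular card company works: the purchase history is invented, is_out_of_norm is a hypothetical name, and the cutoff of three standard deviations is an arbitrary choice.

```python
from statistics import mean, stdev


def is_out_of_norm(history, amount, threshold=3.0):
    """Return True if amount is more than `threshold` standard deviations from the mean of history."""
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past purchase was identical, so any different amount is unusual.
        return amount != average
    return abs(amount - average) / spread > threshold


# Made-up purchase history for one customer, in dollars.
past_purchases = [20, 25, 22, 30, 18, 24, 27]

for amount in (26, 500):
    status = "flag for review" if is_out_of_norm(past_purchases, amount) else "looks normal"
    print(f"${amount}: {status}")
```

With this history, a $26 purchase passes while a $500 purchase is flagged. Real fraud models weigh many more signals, such as location, merchant and timing.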
You may be wondering why Python would be the third most popular data science skill among recent graduates. “Python is a powerful, general purpose programming language that has emerged in recent times as a language of choice for data science,” explains Dr. Manjeet Rege, professor of data analytics at the University of St. Thomas in St. Paul, Minnesota.
In fact, he says it’s used extensively in data science because it’s more approachable than Java or C++. It’s also popular because it’s open source, which means it’s supported by a community and available for free.
“For anybody who wants to work with data that goes beyond an Excel sheet, knowing Python is virtually required,” Reif explains. “While there are other programming languages that are also important and helpful, this is one of the most widely used.”
If you’re apprehensive about learning Python, Rege says you shouldn’t be, since many programming languages are similar. “It’s like learning to drive a car: If you know how to drive a Toyota Camry, most of those skills will translate to driving a Honda Civic, and if you understand one programming language, you’ll pick up another language more quickly.”
Zou admits that he really doesn’t care what software platform is used as long as the data analytics are correctly performed. “However, Python is preferred by most data scientists today since it is the fastest-growing and most popular — and powerful — statistical programming language that performs advanced data analysis, machine learning and visualization on big data sets,” he says.
Rebecca Merrett, lead instructor at Data Science Dojo, which offers data science bootcamps, agrees. “I have observed Python becoming more popular over recent years and would say as a scripting language it does allow you to prototype quickly and also has an extensive list of libraries to help automate the more rudimentary data science tasks.” Like R, she says Python has a lot of support for data science tasks.
How and Where You Can Learn These Skills
You don’t have to be a new grad to learn these skills. Regardless of where you are on the career spectrum, there are plenty of places to learn data visualization, data modeling and Python. Some people like to learn on their own, while others prefer classroom or team settings. Our data science experts provide an extensive and varied list of options.
“New grad candidates I’ve interviewed mention a wealth of resources they’ve used to sharpen their skills, including educational books, YouTube videos, Coursera, and Kaggle competitions,” Zou says. (For more online learning resources, see 6 Key Data Science Concepts You Can Master Through Online Learning.)
Reif adds that in addition to MOOCs, you can also take college courses or even check out library books. “Metis also teaches all these skills at our data science bootcamp and bootcamp prep courses, which assign pre-work before the bootcamp starts,” Reif says.
One option for learning Python? “Simply go to python.org, download the Python interpreter on your computer and follow the tutorials there,” Rege says.
“Lots of people learn the technical skills through free online resources, like those listed by the Open Source Data Science Masters website (which includes Mode’s own free SQL School and Python tutorials),” says Steer. “There are also bootcamps like Insight or Galvanize, and online courses from Udacity, Springboard, Datacamp, and others,” he adds.
Stephen Bailey, data scientist and analytics tool expert at Immuta, a data governance platform, has two pieces of advice for people who want to learn these skills. “The first piece is to go do it; go build a visualization in Tableau; go write a simple Python script; go turn some aspect of your life into a spreadsheet.” While you can learn a tool or technique from watching a video, he says you can only learn the art from practice.
His second piece of advice is to go meet people. “The software and data communities are incredibly welcoming,” Bailey says. “Twitter is teeming with people willing to point you in the right direction; also, you can find someone in your LinkedIn community and invite them out for coffee.”
Bailey says you can learn more, and also have more fun, in 30 minutes of talking with someone than in spending a day searching on the internet.