Can You Learn Data Science Online for Free? Yes. Here's How.
Updated: Jul 26, 2019
A reader (Ian) recently asked me what I felt was the best way to learn data science in his spare time.
Great question, Ian, especially considering the field is red-hot right now and shows no signs of slowing down!
In 2018, demand for data scientists grew by 29%, contributing to a 344% increase since 2013, according to reports by Indeed and Dice. The supply of qualified candidates, however, drastically lags behind.
My first instinct was to refer Ian to the contemporary King of AI, Siraj Raval, and his free curriculum “Learn Data Science in 3 Months”. The video, however, was created six months ago (a lifetime in the ever-changing world of data science) and I saw some opportunities to update it.
Some courses were gone or had changed… so I updated/replaced them. Some people prefer a written guide… so I typed it out. Some need to know why a topic is important to understand it… so I added explanations. Finally, (based on personal experience) some can get stuck on any single course no matter how interesting it is. I always like to have an additional course available that may explain something a different way or fill in some knowledge gaps… so I added some alternatives/additions.
Hopefully, this complete curriculum to becoming a data scientist helps Ian and anyone else interested in the field!
1. Learn Python
Tools you’ll use?
Python
Why is it important?
Python is growing faster than any other language, it’s extremely well documented, and most of the tools and resources are free. Can you be a data scientist without knowing Python? Sure. Just like you can drive a car without eyesight. You just won’t get very far.
How to learn it?
Massachusetts Institute of Technology (MIT) | Introduction to Computer Science and Programming in Python
Kaggle | Python
Siraj Raval | Learn Python for Data Science
2. Learn Statistics and Probability
Tools you’ll use?
Math
Why is it important?
As a data scientist, you’ll have to extract useful information from extremely imperfect data. You can’t completely eliminate uncertainty but you can reduce it with a strong grasp of statistics and probability fundamentals.
How to learn it?
Khan Academy | Statistics and Probability
UC San Diego | Probability and Statistics in Data Science using Python
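To make "you can reduce uncertainty" a little more concrete, here is a minimal sketch using only Python's built-in random and statistics modules. The measurements are made up (a true value of 50 with random noise), and standard_error is a helper name chosen for this example. The idea it illustrates: the standard error of an average shrinks as the sample grows.

```python
import random
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5


random.seed(42)  # fixed seed so repeated runs give the same numbers

# Hypothetical noisy measurements: true value 50, noise std dev 10 (made-up figures).
small = [random.gauss(50, 10) for _ in range(25)]
large = [random.gauss(50, 10) for _ in range(2500)]

for name, sample in (("n=25", small), ("n=2500", large)):
    mean = statistics.mean(sample)
    se = standard_error(sample)
    print(f"{name}: mean={mean:.2f}, standard error={se:.2f}")
```

With 100 times more data, the standard error should come out roughly 10 times smaller. The uncertainty never reaches zero, but it becomes something you can measure and plan around.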
3. Learn Data Analysis
Tools you’ll use?
Pandas, R
Why is it important?
Data analysis enables you to summarize the characteristics of a data set. This deeper understanding of the data can direct you to the best way to extract useful, actionable conclusions. In short… learn how to understand and clean data. It’s what 90% of your time will be spent doing.
How to learn it?
4. Learn Algorithms and Machine Learning
Tools you’ll use?
Pandas, scikit-learn
Why is it important?
This is likely why you got into data science in the first place! Use Skynet to draw conclusions from the data we mere humans never could.
How to learn it?
5. Learn Deep Learning
Tools you’ll use?
TensorFlow, Keras
Why is it important?
Because everyone’s talking about deep learning, so you have to use deep learning always. Alright… not exactly. Good old-fashioned machine learning is still the best option for most data science endeavors. Deep learning, however, is making major breakthroughs in certain fields such as image recognition, automation and many more.
How to learn it?
6. Learn Relational Databases
Tools you’ll use?
SQL, DB-API, NoSQL
Why is it important?
As a data scientist chances are good you’ll need to access some data. Equally likely is the fact that that data will be stored in databases. Might be a good idea to learn your way around them.
How to learn it?
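As a small taste of what "learning your way around" a database looks like from Python, here is a minimal sketch using sqlite3, the DB-API module that ships with Python. The orders table and its rows are invented for illustration. A real project would connect to an existing database rather than create one in memory.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, made up for this example.
cur.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
# Parameterized inserts: the ? placeholders keep values separate from the SQL text.
cur.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 12.5)],
)
conn.commit()

# Plain SQL does the heavy lifting: total spend per customer, largest first.
cur.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
)
totals = cur.fetchall()
print(totals)

conn.close()
```

The same connect / cursor / execute / fetch pattern carries over to other DB-API drivers, so time spent on it here is not wasted when the database changes.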
7. Learn Distributed Computing for Big Data
Tools you’ll use?
Hadoop, MapReduce, Spark
Why is it important?
2.5 quintillion bytes of data are created every day. Let me repeat that… 2,500,000,000,000,000,000 bytes. That’s 2,500 with 15 extra 0’s. If every byte were a single penny, and we laid them all flat, they’d cover the entire Earth… five times. How does a data scientist actually process that kind of data? By filtering and sorting it (MapReduce) and distributing that work over clusters (Hadoop / Spark).
How to learn it?
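Before touching a cluster, it can help to see the MapReduce idea on a toy scale. The sketch below is plain single-machine Python, not Hadoop or Spark code, and the function names (map_phase, shuffle, reduce_phase) and the input lines are made up for illustration. It counts words in three steps: map each line to (word, 1) pairs, group the pairs by word, then reduce each group to a total.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse one key's values into a single count."""
    return key, sum(values)


lines = ["big data is big", "data is everywhere"]  # made-up input

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(sorted(counts.items()))
```

On a real cluster, the map and reduce steps run on many machines at once and the framework handles the grouping in between. The shape of the computation stays the same, which is why the pattern scales.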
8. Learn Data Presentation and Storytelling
Tools you’ll use?
Matplotlib, Seaborn, Folium, Excel, PowerPoint
Why is it important?
If a tree falls in the woods, but no one’s there to hear it, does it make a sound? What if useful insight is extracted from data, but no one understands it enough to take action, does it serve a purpose? Not really. Data science is useless if the results aren’t actionable. You have to be able to show not just what the data says but why it matters and what should be done about it. An average data scientist with outstanding presentation skills will almost always produce more useful results than the best data scientist who can’t explain them.
How to learn it?
About the Company:
Peterson Technology Partners (PTP) has been Chicagoland's premier I.T. staffing, consulting, and recruiting firm for over 20 years. Named after Chicago's historic Peterson Avenue, PTP has built its reputation by developing lasting relationships, leading digital transformation, and inspiring technical innovation throughout Chicagoland. Now based in Park Ridge, IL, PTP's 250+ employees have narrowed their focus to a single market (Chicago) and 4 core technical areas:
Application/mobile/web development and ecommerce
Data science/analytics/business intelligence/artificial intelligence
Information security/cybersecurity and
ERP SAP/Oracle and project management/BA/QA
PTP exists to ensure that all of our partners (clients and candidates alike) make the best hiring and career decisions.
About the Author:
Matthew Bardeleben is the Director of Marketing and Public Relations at Peterson Technology Partners. He has earned more than 35 certifications in topics ranging from artificial intelligence, blockchain, and Python programming to digital marketing, growth hacking, and UX/UI design from organizations such as IBM, Google, and HubSpot.