The Most Effective Way to Learn Data Science!

Prabakaran Chandran
7 min read · Dec 12, 2021


Let’s Make Learning Data Science Efficient and Impactful!

Introduction

Hello Everyone! Thanks for reading my article.

Every year, the number of students and professionals stepping into data science grows rapidly. Almost everyone follows the same curriculum, which concentrates on Python and machine learning algorithms, and nowadays the momentum leans more towards NLP and deep learning. Sometimes that is good, but the easier, more interesting, and important areas get left out.

Even though plenty of resources and structured curriculums are available, many beginners struggle to learn them properly and to build skill sets they can demonstrate. The main reason could be random and distracted learning. This keeps them in the fresher zone instead of moving ahead to the zone of practitioners and job seekers with a skill set.

In this article, we will discuss the most effective ways to learn data science and its related fields. They can be applied to other fields as well, but our examples will come from data science.

Having said that, let’s get into the topic!

Prepare a Learning and Practicing Environment

Before starting to learn any concepts in data science, make sure the following things are set up.

  1. Learning Curriculum and Plan
  2. Learning Support System
  3. Practicing Environment

Learning Curriculum :

Selecting the right reference curriculum gives the learning process a great head start. Rather than chasing random or incomplete YouTube videos, follow a standard curriculum designed by a university or an established learning platform such as Coursera, Udemy, LinkedIn Learning, or upGrad.

The platforms mentioned above are easily accessible, and the curriculum outline can usually be viewed at no cost. Make sure you have an account on the ones you use.

A few standard data science curriculums that I prefer:

  1. https://www.coursera.org/specializations/jhu-data-science
  2. https://www.coursera.org/specializations/applied-data-science
  3. https://www.coursera.org/specializations/data-science-python

If you can afford the courses, you can purchase them; otherwise, you can audit them. Either way, following a structured curriculum will help.

Learning Support System :

The learning support system is simply the data science community, which is spread across the globe and accessible via LinkedIn, Twitter, and Medium. Connecting with these communities gives us hope, guidance, the latest updates, and help.

Create a LinkedIn profile (even if you are at milestone zero) and a Medium account to follow hashtags (topics) like #datascience #dataanalytics #machinelearning #statistics #datastorytelling #dataengineering. This will surface interesting topics to learn and articles that explain the concepts in an easier manner.

Practicing Environment :

Only players can shine in this field (as in almost any domain); spectators cannot sustain here! Watching YouTube videos or skimming articles will not move you to the next milestone. As we always emphasize, learning by doing is one of the keys to success here. To do so, you need an environment to practice in: the necessary IDEs (PyCharm, Visual Studio Code) and notebook servers.

We practice in Jupyter notebooks most of the time, so install Jupyter Notebook through Anaconda Navigator, and set up Kaggle or Colab accounts to use GPU instances.

AWS has launched SageMaker Studio Lab for free, where we can follow existing courses along with notebook-based tutorials: https://studiolab.sagemaker.aws/

DataSpell is a dedicated IDE for data scientists launched by JetBrains (I prefer this).

Plan, Track, and Score!

Before starting your learning, split the above curriculum into multiple sprints (if you are working, plan around the free time you have for learning).

  1. For each sprint, define at least two objectives and their key results. For example, if your objective for January 2022 is learning Pandas and Matplotlib, your key results could be: a) complete the Kaggle data analysis and visualization course, b) create an EDA and data storytelling presentation for 2 case studies (customer engagement survey, grocery store sales), c) create 2 EDA notebooks on Kaggle using open datasets.
  2. As in the above example, key results should be quantifiable and always trackable.
  3. Conduct a personal review of your progress every week. If you find any delays, find out the reason and record it. Don’t overrate your performance.
  4. Track all of this in a structured template so that your learning can be revisited at any time. Notion ( https://www.notion.so/ ) works well for this.
  5. You can create a learning journal in Notion, where you record everything you learn and save important notes, links, and resources. This helps during interviews and quick revisions; for me, it helped during a project proposal design phase.
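
If you would rather keep the tracker in code than in a template, the objective and key-result idea above can be sketched in a few lines of Python. This is only a minimal illustration: the class names `KeyResult` and `Objective`, the averaging rule for the score, and the progress numbers are all assumptions made for the example, not a standard OKR tool.

```python
from dataclasses import dataclass, field


@dataclass
class KeyResult:
    """One quantifiable key result, e.g. 'create 2 EDA notebooks on Kaggle'."""
    description: str
    target: int      # how many units count as "done" (must be > 0)
    done: int = 0    # units completed so far

    def progress(self) -> float:
        # Cap at 1.0 so over-delivery on one item does not inflate the score.
        return min(self.done / self.target, 1.0)


@dataclass
class Objective:
    """A sprint objective together with its key results."""
    name: str
    key_results: list = field(default_factory=list)

    def score(self) -> float:
        # Assumed scoring rule: plain average of key-result progress.
        if not self.key_results:
            return 0.0
        return sum(kr.progress() for kr in self.key_results) / len(self.key_results)


# Hypothetical mid-sprint status for the January example.
jan = Objective(
    name="Learn Pandas and Matplotlib (Jan 2022)",
    key_results=[
        KeyResult("Complete the Kaggle data analysis and visualization course", target=1, done=1),
        KeyResult("EDA and data storytelling presentations for case studies", target=2, done=1),
        KeyResult("EDA notebooks on Kaggle using open datasets", target=2, done=0),
    ],
)

for kr in jan.key_results:
    print(f"{kr.progress():.0%}  {kr.description}")
print(f"Sprint score: {jan.score():.0%}")
```

Running this during the weekly review gives one honest number per sprint, which makes it harder to overrate your own performance.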

Collaborative Learning of Mathematical Concepts and Programming :

Most of the time, programming is learned independently, with no connection to mathematical concepts (most college curriculums are designed like this). Maths and stats lessons should be learned along with programming.

If you are learning a mathematical concept, try to convert it into Python functions or classes. This builds better coding practices, and writing the formulae and algorithms in bare Python code helps in understanding their roots.

For example, if you are learning probability distributions or statistical tests, practice them in Python then and there.

Explore real-world use cases of the topics you have learned and record them as case studies. This will help both in solution design tasks and in interview preparation.
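
As a small illustration of this habit, here is one way a probability distribution and a statistical test might be written in bare Python, using only the standard library. The choice of the binomial distribution and the one-sample t statistic is mine, and the function names are made up for this sketch.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1, so it needs n >= 2)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def one_sample_t(xs, mu0: float) -> float:
    """t statistic for the null hypothesis 'population mean equals mu0'."""
    n = len(xs)
    standard_error = math.sqrt(sample_variance(xs) / n)
    return (mean(xs) - mu0) / standard_error


# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))

# How far is the sample mean from 2, measured in standard errors?
print(one_sample_t([1, 2, 3, 4, 5], mu0=2))
```

Once the bare version works, comparing its output against a library such as SciPy is a good way to check your understanding.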

Recreate and Represent the Tougher Topics: Maths, Stats, ML Algos :

As you move through the milestones, the concepts become tougher and more overwhelming. Recreating and representing the topics makes the learning effective here.

When you learn tougher topics, algorithms, and mathematical intuitions, follow this method:

  1. Collect the blogs, articles, and lecture notes available for the topic
  2. Extract the core concepts, formulae, and pseudocode
  3. Use code implementations to understand how the topic works in practice
  4. Create an intuitive story out of it, then retell the story in simple layman's terms
  5. Explain each step of the formula or algorithm with an easier example (practice this again and again; it lays a strong foundation)
  6. Present it as a presentation or a document, and try to teach your peers
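
To make steps 2 to 5 concrete, here is a sketch of what "recreating" an algorithm on an easy example could look like. I have picked batch gradient descent for a straight-line fit as the topic; the function name `fit_line`, the learning rate, the number of epochs, and the tiny dataset are all assumptions chosen so the result is easy to check by hand.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y ≈ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Step 1: how wrong is the current line at each point?
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Step 2: gradients of MSE = (1/n) * sum(error^2) w.r.t. w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step 3: move a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Easy example: the points lie exactly on y = 2x + 1,
# so we know what the answer should be before running anything.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

The layman story here could be "walk downhill in small steps until the line stops getting better", and each numbered comment in the loop maps to one slide when you teach it to your peers.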

For example, refer to my deck on agent-based modeling: https://docs.google.com/presentation/d/1jL5pipbAR_2m74-m1rzNunJEKzz0HwUb8rewbGe808s/edit?usp=sharing

The Method of Data Science + X :

Building projects and a portfolio is the major goal in learning data science; they are the evidence of your learning and effort. The common mistake is templated projects and portfolios. If you build a project on simple generic datasets like Titanic or Boston house price prediction, it will not demand much data science skill from you. Rather than sticking to conventional project portfolios, try to create a unique one that pushes you to learn continuously and explore further.

  1. Select a domain X. Here X could be the domain of your college degree or your area of interest (agriculture, construction, city planning, disaster management)
  2. Explore the scope of analytics and data science in the selected domain, and collect case studies (so that you also gain domain knowledge)
  3. Figure out the existing problems in the selected domain. For example, in the field of construction, a few existing problems are 1. monthly budget planning, 2. quality assessment of materials and constructed parts of the building, 3. architectural planning, 4. site selection
  4. Take one particular problem and collect the existing solutions from published papers or case study blogs.
  5. Implement them as described in the papers and case studies. This will help you learn how to implement an established solution
  6. Then try to infuse your own flavor of feature engineering, modeling, and parameter adjustments
  7. Wrap your project as shareable Python scripts and an API (use FastAPI or Flask, or Streamlit to consume it)
  8. Prepare the documentation, present your project on GitHub, LinkedIn, and Medium, and ask for suggestions.
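
For step 7, one possible way to keep a project shareable is to separate the prediction logic from the web framework, so the same function can sit behind a FastAPI or Flask route or a Streamlit page. The sketch below uses only the standard library. Everything in it is hypothetical: `predict_budget`, its made-up coefficients, and the field names `area_sqft` and `floors` stand in for whatever your trained model and inputs actually are.

```python
import json


def predict_budget(area_sqft: float, floors: int) -> float:
    """Placeholder model: a made-up linear rule standing in for a trained model."""
    return 1200.0 * area_sqft + 50000.0 * floors


def handle_request(body: str) -> str:
    """Take a JSON request body and return a JSON response body.

    Framework-agnostic on purpose: a web route only needs to pass
    the raw request text in and send the returned text back.
    """
    try:
        payload = json.loads(body)
        estimate = predict_budget(float(payload["area_sqft"]), int(payload["floors"]))
    except (KeyError, TypeError, ValueError) as exc:
        # Missing fields, wrong types, and malformed JSON all end up here.
        return json.dumps({"error": f"bad request: {exc!r}"})
    return json.dumps({"estimated_budget": estimate})


if __name__ == "__main__":
    print(handle_request('{"area_sqft": 1000, "floors": 2}'))
    print(handle_request("not json"))
```

Because the handler is a plain function, it can be unit-tested without starting a server, and swapping the placeholder for a real model does not change the interface.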

Learn In Public + Build In Public :

Learning and building in public is a principle that makes you more committed to your goal, which makes the learning more impactful. It enables you to learn faster by taking feedback from others. We need to tweet about what we learn!

This helps to track progress and look at the learnings collectively. If you are learning niche concepts like discrete event simulation, agent-based modeling, or domain-specific ML applications, learning in public will connect you with experts and practitioners via Twitter and LinkedIn.

Sometimes, guidance or appreciation will come from the creators or authors themselves. When I posted in public while learning graph databases, I got guidance and appreciation from the Neo4j team and the ArangoDB team.

If you can do this consistently, you will start getting feedback, meet experts and practitioners, and build an audience online.

Convert your learnings into results:

Building projects on every concept can be tedious, but we can write blogs, posts, flashcards, and cheat sheets on them. Bring your own way of explaining here!

Engage with the Community :

Engaging with the data science community helps you evaluate where you stand and brings a different perspective: how are others solving problems? What tools and technologies are being used?

Clubhouse sessions are a great way to interact with industry experts and learn these things. They help bridge the knowledge and exposure gap and remove the communication and accessibility barrier as well. I have seen many people put too much emphasis on blindly learning NLP straight away.

Twitter is where tech companies, open-source contributors, and research foundations announce their updates.

Attend virtual meetups hosted by the TensorFlow User Group, Microsoft Reactor, Google Developer Groups, etc.

Ending Notes :

I hope this article helps you and provides you with valuable ideas.

So far, we have discussed many ways to learn data science in an effective and impactful manner that will bring high returns. If you cannot follow all of them, it is okay to follow a few.

A small adaptation will help you make a better transition and impact.

If you like this article, clap and share it with your friends and peers.

Feedback and comments are welcome!

Stay tuned for interesting articles on various topics!

#Learn with Karan #LearnTogether #GrowTogether

About me: check my LinkedIn profile

https://www.linkedin.com/in/prabakaranchandrantheds/
