Do planets really affect our lives?

Every planet plays a role in how you feel, think, grow and do all the things you do!

Because of the intensity of a “Saturn Return”, this astrological transit often receives a lot of attention. Here we will explore the impact of this transit using NLP (Natural Language Processing) techniques.

  1. Topic Background
  2. Dataset Overview
  3. Data Cleaning & Preprocessing
  4. Data Visualization
  5. Sentiment Analysis
  6. Topic Modeling
  7. Fetch Similar Documents Using Search Query
  8. Text Summarization
  9. Conclusion

1. Topic Background:

  • Saturn is the 2nd largest planet in the solar system; it takes around 29.5 years to orbit the Sun and stays approximately…

Predict Insurance Charges using different Linear Regression Models and compare results.


Here I will discuss how Linear Regression works and how we can implement it in different ways to achieve the best accuracy.

Data set overview:

I use a health insurance data set for the analysis. It contains 1338 samples and 7 features.

Here we want to predict insurance charges using the given features: age, sex, bmi, children, smoker and region.

You can download the data from here.
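
As a rough illustration of the setup described above, here is a minimal sketch of a linear regression on the listed features. The rows are made up for the example and are not taken from the real data set, the target column name `charges` is an assumption, and one-hot encoding the text columns is just one reasonable preprocessing choice, not necessarily the one used in the full article.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up rows for illustration only; the real data set has 1338 samples.
# The target column name "charges" is an assumption.
data = pd.DataFrame({
    "age": [20, 25, 30, 35, 40, 45, 50, 55],
    "sex": ["female", "male", "male", "female", "male", "female", "female", "male"],
    "bmi": [22.0, 31.5, 27.0, 24.5, 35.0, 29.0, 26.5, 33.0],
    "children": [0, 1, 2, 0, 3, 1, 2, 0],
    "smoker": ["no", "yes", "no", "no", "yes", "no", "no", "yes"],
    "region": ["north", "south", "east", "west", "north", "south", "east", "west"],
    "charges": [2000.0, 18000.0, 4500.0, 5000.0, 40000.0, 8000.0, 9500.0, 45000.0],
})

X = data.drop(columns="charges")
y = data["charges"]

# Text columns are one-hot encoded; numeric columns pass through unchanged.
categorical = ["sex", "smoker", "region"]
preprocess = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)

model = Pipeline([
    ("preprocess", preprocess),
    ("regress", LinearRegression()),
])
model.fit(X, y)

# Predictions on the training rows, just to show the call pattern.
predictions = model.predict(X)
print(predictions.round(1))
```

With the full data set you would hold out a test split before judging accuracy; eight rows are far too few for that, so this sketch only shows how the pieces fit together.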

Web scraping is the process of importing information from a website into local files saved on your computer. Later on, you can use that information for analysis.

Here, we will see how we can scrape YouTube comments and generate a CSV file from the scraped data.

1. Install required packages:
! pip install pandas
! pip install datakund
! pip install youtube_comment_scraper

2. Import required packages:

This will open your web browser.

from youtube_comment_scraper import *
import pandas as pd

3. Open the YouTube video link and go to the comment section:

Move to the next step when you see below…
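
The scraping calls themselves are not shown in this excerpt, so the sketch below only covers the last part of the goal stated earlier: turning scraped comments into a CSV file with pandas. The record fields (`author`, `comment`, `likes`) and the helper name `comments_to_csv` are hypothetical; the scraper may return a different structure.

```python
import pandas as pd


def comments_to_csv(comments, path_or_buffer):
    """Write a list of comment records to CSV and return the DataFrame."""
    df = pd.DataFrame(comments)
    # index=False keeps the pandas row index out of the file.
    df.to_csv(path_or_buffer, index=False)
    return df


# Hypothetical records standing in for whatever the scraper returns.
sample_comments = [
    {"author": "user_a", "comment": "Great video!", "likes": 3},
    {"author": "user_b", "comment": "Thanks, very helpful.", "likes": 0},
]
```

Calling `comments_to_csv(sample_comments, "youtube_comments.csv")` would write the file next to your notebook; the file name is only an example.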

How will you deal with unknown categories which were not part of your training set?

Answer: set handle_unknown='ignore' in OneHotEncoder.
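
To make the answer concrete, here is a minimal sketch with made-up colour values (they are not the article's data). The encoder is fitted on three categories and then asked to transform a category it has never seen.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Made-up training categories; the encoder learns only these three.
train = np.array([["red"], ["green"], ["blue"]])

# "yellow" was never seen during fit.
new_data = np.array([["green"], ["yellow"]])

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# Categories are stored in sorted order: blue, green, red.
encoded = encoder.transform(new_data).toarray()
print(encoder.categories_[0])
print(encoded)
# "green"  -> [0. 1. 0.]
# "yellow" -> [0. 0. 0.]  (unknown category becomes an all-zero row)
```

With the default `handle_unknown='error'`, the same `transform` call would raise a `ValueError` instead of producing the all-zero row.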


Let’s consider the data below as the training data set:

How will you evaluate your regression model?

What is Model Evaluation in machine learning?

  • Model evaluation is the process of checking how accurate or inaccurate our model is.
  • The scikit-learn library provides different evaluation metrics that we can use to check model performance.
  • We can also implement our own evaluation metrics based on our data set and domain requirements.

Common Regression Evaluation Metrics:

  1. Mean Absolute Error (MAE)
  2. Mean Squared Error (MSE)
  3. Root Mean Squared Error (RMSE)
  4. R-squared
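
As a quick illustration, the four metrics listed above can be computed with scikit-learn as follows. The actual and predicted values are made up for the example.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up actual and predicted values, for illustration only.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)   # 0.75
mse = mean_squared_error(y_true, y_pred)    # 0.875
rmse = np.sqrt(mse)                         # square root of MSE
r2 = r2_score(y_true, y_pred)

print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

RMSE is taken as the square root of MSE here so the sketch does not depend on which scikit-learn version is installed.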

1. Mean Absolute Error (MAE):

  • MAE measures how far off our model's predictions are on average.
  • MAE = Sum of absolute errors / number of samples
  • Absolute Error = Absolute difference between the actual and predicted values.
  • Below it shows predicted values from 2…
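
The MAE formula in the bullets above can also be checked by hand. The numbers in this small sketch are made up and are not the article's own example; the function name is hypothetical.

```python
def mean_absolute_error_manual(actual, predicted):
    """MAE = sum of absolute errors / number of samples."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    absolute_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Made-up values: absolute errors are 10, 10, 20 and 0.
actual = [100, 150, 200, 250]
predicted = [110, 140, 220, 250]

print(mean_absolute_error_manual(actual, predicted))  # 10.0
```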

Classification Model

Predict Traffic Accident Severity

Road safety should be a priority for governments, local authorities and private companies investing in technologies that can help reduce accidents and improve overall driver safety.

Here we will analyze historical collision data and build a classification model to predict the severity of future accidents.


Download data set from:

Shape of the data set: 194673 samples, 38 features

Priyanka Dave

Data Science Enthusiast

