Welcome to my data science portfolio page where showcases a collection of my projects on Python and R programming, machine learning algorithms, data analysis, and visualization. In this portfolio, I have worked on various datasets ranging from predicting forest fires and heart disease, to building a spam filter and analyzing New York City school data. The goal is to share my passion for data science and how it can be used to solve real-world problems. Each project demonstrates a different aspect of data science and I hope they inspire you to explore the endless possibilities of this field.
Deep neural networks have been used to reach state-of-the-art performance on image classification tasks in the last decade. We will explore why it is a hard task and observe the limitations of traditional machine learning models for image classification.
Applying a standard linear regression model to predict the extent of fire damage to a forest. Our data comes from the Forest Fires dataset from the UCI Machine Learning Repository.
Satisfying the huge global demand for garment products is mostly dependent on the production and delivery performance of the employees in the garment manufacturing companies.
Applying logistic regression model at a real-life dataset: the Heart Disease Data Set from the UCI Machine Learning Repository to predict heart disease, showing how machine learning can help solve problems that have a real impact on people’s lives.
Creating a model that can predict how many people will be at the gym at a particular day and time. That way, I will be able to enjoy my excersise routine without waiting times.
This dataset contains information on individual medical insurance bills, which is associated with some demographic and personal characteristics of the person who received it.
Given a dataset containing information about the company’s clients and asked to help segment them into different groups in order to apply different business strategies for each type of customer.
Build a K-Nearest Neighbors classifier to accurately predict the likelihood of a patient having a heart disease in the future. It is imperative to identify these risk factors early on to prevent the onset of CVDs and reduce premature deaths.
Jeopardy is a popular TV show in the US where participants answer questions to win money. I am going to work with a dataset of Jeopardy questions to figure out some patterns in the questions that could help to win.
Build a spam filter for SMS messages using the multinomial Naive Bayes algorithm. Our goal is to write a program that classifies new messages with an accuracy greater than 80%.
By leveraging relevant data sources and analytical tools, we aim to provide actionable insights to inform the company’s advertising decision-making process.
In this project, we will focus on explanatory data visualization and practice how to use information design principles (familiarity and maximizing the data-ink ratio) to create better graphs for an audience.