Guided Project: Predict Heart Disease
- Last updated on April 11, 2025 at 7:23 PM
About this Webinar
In this hands-on Project Lab, Dataquest’s Senior Content Developer, Anna Strahl, walks you through how to build a K-Nearest Neighbors classifier to predict the likelihood of heart disease based on patient data.
Throughout the session, Anna breaks down each step of the machine learning workflow, shares practical tips, and answers live questions from learners.
This project is ideal for learners familiar with Python, Pandas, NumPy, Matplotlib, Seaborn, and basic ML concepts.
What you'll learn:
- How to clean and prepare real-world healthcare data for analysis
- Techniques for conducting exploratory data analysis to identify key patterns
- Methods to select relevant features that impact heart disease prediction
- Ways to implement and optimize a K-Nearest Neighbors classifier
- Professional approaches to evaluating model performance and accuracy
- Real-world Python techniques you can apply to your own healthcare data projects
Key skills covered in this project:
- Working with pandas to load, explore, and manipulate healthcare datasets
- Using visualization libraries to identify patterns in medical data
- Implementing machine learning workflows for classification problems
- Applying K-Nearest Neighbors algorithms to real patient data
- Tuning hyperparameters to optimize model performance
- Evaluating the effectiveness of predictive healthcare models
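To make the classification workflow above concrete, here is a minimal sketch of the idea behind K-Nearest Neighbors in plain Python: scale the features, predict by majority vote among the k closest training points, and pick k by validation accuracy. It is not the code from the webinar. The data is synthetic, the two features (age, max heart rate) and their distributions are made up for illustration, and the helper names `knn_predict`, `accuracy`, and `min_max_scale` are hypothetical. In a real project you would more likely use a library implementation such as scikit-learn's `KNeighborsClassifier`.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    distances = sorted(
        (math.dist(row, point), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def min_max_scale(rows, lows, highs):
    """Rescale each feature using the given per-feature bounds."""
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


# Synthetic stand-in data (NOT the project dataset): two made-up features.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    age = rng.gauss(60 if label else 45, 8)
    max_hr = rng.gauss(125 if label else 155, 15)
    X.append((age, max_hr))
    y.append(label)

# Hold out the last 25% of rows as a validation set.
split = int(len(X) * 0.75)
train_X, val_X = X[:split], X[split:]
train_y, val_y = y[:split], y[split:]

# Compute scaling bounds from the training rows only, then apply to both sets.
lows = [min(col) for col in zip(*train_X)]
highs = [max(col) for col in zip(*train_X)]
train_X = min_max_scale(train_X, lows, highs)
val_X = min_max_scale(val_X, lows, highs)

# Hyperparameter tuning: try several odd values of k and keep the best one.
scores = {k: accuracy(train_X, train_y, val_X, val_y, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(f"best k = {best_k}, validation accuracy = {scores[best_k]:.2f}")
```

Scaling matters here because KNN relies on distances: without it, the feature with the larger numeric range would dominate the vote. Odd values of k avoid tied votes when there are only two classes.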
New to Python? Begin with our Python Basics for Data Analysis course to build the foundational skills needed for this project.
New to machine learning? Begin with our Machine Learning in Python course to build the foundational skills needed for this project.
Before You Start: Pre-Instruction
To make the most of this project walkthrough, follow these preparatory steps:
1. Review the Project
Access the project and familiarize yourself with the goals and structure:
- Start the project here
2. Access the Solution Notebook
You can view and download it here to see what we’ll be covering.
Helpful Tips
New to Markdown? We recommend learning the basics to format headers and add context to your Jupyter notebook: Markdown Guide.
For file sharing and project uploads, create a GitHub account ahead of the webinar: Sign Up on GitHub.
Want to work offline?
1. Set Up Your Workspace
We'll work with a .ipynb file, which you can open in either of the following tools:
- Jupyter Notebook (local installation required)
- Google Colab (browser-based, no installation needed)
2. Download the Resource Files
To follow along with the webinar, you'll need the heart_disease_prediction.csv dataset, which contains anonymized patient data from multiple hospitals. It comes from the Heart Failure Prediction Dataset on Kaggle.
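Once the file is downloaded, a quick first look might resemble the sketch below. It assumes pandas is installed and that heart_disease_prediction.csv sits in your working directory. The helper name `summarize_heart_data` is hypothetical, and the sketch makes no assumptions about the column names in the file.

```python
from pathlib import Path

import pandas as pd


def summarize_heart_data(csv_path):
    """Load a CSV and return the DataFrame plus missing-value counts per column."""
    df = pd.read_csv(csv_path)
    missing = df.isna().sum()
    return df, missing


# Assumed location: the same folder as your notebook.
csv_path = Path("heart_disease_prediction.csv")

if csv_path.exists():
    df, missing = summarize_heart_data(csv_path)
    print(df.shape)              # (number of rows, number of columns)
    print(df.dtypes)             # which columns are numeric vs. text
    print(missing[missing > 0])  # only the columns that have missing values
else:
    print(f"{csv_path} not found; download it before running this cell.")
```

Checking the shape, column types, and missing values first gives you a baseline before any cleaning or exploratory analysis.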