Guided Project: Analyzing NYC High School Data (Python)
- Last updated on January 31, 2025 at 6:11 AM
About this Webinar
In this Dataquest Project Lab webinar, we walk you through a data analysis project using Python to investigate New York City high school data, focusing on SAT scores and their correlations with various socioeconomic factors. This session is ideal for learners with basic Python skills who are ready to apply their knowledge to a real-world dataset.
Key Takeaways:
- Understand the relationship between SAT scores and factors like income, race, and school safety.
- Learn how to clean and merge multiple datasets into a single structured dataframe.
- Explore different data visualization techniques to identify trends and correlations.
- Gain insight into the limitations of correlation analysis and how to frame data-driven conclusions.
- See real-time debugging and problem-solving strategies in a Jupyter Notebook.
This session is perfect for learners who want to improve their exploratory data analysis (EDA) skills while working on a structured project that mimics real-world data challenges.
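To make the merge and correlation takeaways concrete, here is a minimal sketch using pandas. The column names (`school_id`, `sat_score`, `low_income_pct`) and all values are made up for illustration and are not taken from the project's data files.

```python
import pandas as pd

# Hypothetical toy data: IDs, column names, and values are invented for illustration.
sat = pd.DataFrame({
    "school_id": ["S1", "S2", "S3", "S4"],
    "sat_score": [1100, 1250, 1400, 1550],
})
demographics = pd.DataFrame({
    "school_id": ["S1", "S2", "S3", "S5"],
    "low_income_pct": [90.0, 70.0, 45.0, 30.0],
})

# Inner merge keeps only schools present in both tables (S1, S2, S3 here).
combined = sat.merge(demographics, on="school_id", how="inner")

# Pearson correlation between the two numeric columns.
r = combined["sat_score"].corr(combined["low_income_pct"])

print(combined)
print(f"Pearson r: {r:.2f}")
```

A strongly negative `r` in a toy table like this only says the two columns move in opposite directions. It does not show that one causes the other, which is the limitation of correlation analysis mentioned in the takeaways.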
New to Python? Begin with our Python Basics for Data Analysis course to build the foundational skills needed for this project.
Before You Start: Pre-Instruction
To make the most of this project walkthrough, follow these preparatory steps:
1. Review the Project
This is a premium project that we're offering for free exclusively for this webinar. It will remain open for a whole week. Access the project and familiarize yourself with its goals and structure:
- Start the project here
2. Access the Solution Notebook
You can view and download it here to see what we’ll be covering.
Helpful Tips
New to Markdown? We recommend learning the basics to format headers and add context to your Jupyter notebook: Markdown Guide.
For file sharing and project uploads, it is important that you create a GitHub account ahead of the webinar: Sign Up on GitHub.
Want to work offline?
1. Set Up Your Workspace
We’ll work with a .ipynb file, which can be opened in either of the following tools:
- Jupyter Notebook (local installation required)
- Google Colab (browser-based, no installation needed)
2. Download the Resource Files
We’ll be using eight data files in the webinar. Please download them from the Jupyter Lab tab on our platform to follow along on your local machine.
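Once the files are downloaded, one way to load them in a local notebook is sketched below. It assumes the files are CSVs saved together in a single folder. The helper name `load_csv_folder` and the folder name in the usage comment are hypothetical, and any file in another format would need its own read call.

```python
from pathlib import Path

import pandas as pd


def load_csv_folder(folder):
    """Read every .csv file in `folder` into a dict keyed by file name without extension."""
    data = {}
    for path in sorted(Path(folder).glob("*.csv")):
        data[path.stem] = pd.read_csv(path)
    return data


# Example usage (hypothetical folder name):
# data = load_csv_folder("schools")
# print(list(data))  # names of the loaded dataframes
```

Keeping the dataframes in a dictionary makes it easy to loop over all of them when applying the same cleaning step to each.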
Next Steps
- Complete the Project: Go here to start this project in-browser.
- Share Your Work: Upload your completed project to GitHub or GitHub Gist, then share it in the Dataquest Community to receive feedback and connect with fellow learners.
- Join the next webinar on February 13th, 12:30-1:30 PM EST: Analyze a dataset on helicopter prison escapes, identifying key trends and developing data storytelling skills. Save your spot today!