Guided Project: Kaggle Data Science Survey
- Last updated on March 14, 2025 at 2:50 PM
About this Webinar
In this Dataquest Project Lab, Dataquest Sr. Content Developer Anna Strahl walks you through a complete data analysis project using real-world data from Kaggle's Data Science Survey. You'll experience firsthand how to analyze survey data to uncover which skills and experience factors truly impact data science career progression and compensation.
What you'll learn:
- How to clean and prepare survey data for meaningful analysis
- Techniques for aggregating information to uncover patterns in data science careers
- Methods to categorize data for better insights (like grouping years of experience)
- Ways to analyze the relationship between experience and compensation
- Professional approaches to summarizing your findings and determining next steps
- Real-world Python techniques you can apply to your own projects immediately
Key skills covered in this project:
- Working with variables and data types in Python
- Creating and manipulating lists for data organization
- Using for loops to automate repetitive analysis tasks
- Implementing if/else/elif statements for data categorization
- Writing and executing Python code in Jupyter notebooks
- Data visualization techniques to communicate findings effectively
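Several of these skills fit together in a few lines. The sketch below is a minimal illustration, not the project's solution notebook: the sample values, the category labels, the thresholds, and the function name `categorize_experience` are all assumptions made up for demonstration.

```python
# Hypothetical sample data: years of coding experience for five respondents.
# These values are invented for illustration; they are not from the survey.
experience_years = [0.5, 2, 4, 8, 15]


def categorize_experience(years):
    """Return a label for a number of years (thresholds are assumptions)."""
    if years < 1:
        return "beginner"
    elif years < 5:
        return "intermediate"
    elif years < 10:
        return "advanced"
    else:
        return "expert"


# Build a list of labels with a for loop, one label per respondent.
experience_labels = []
for years in experience_years:
    experience_labels.append(categorize_experience(years))

print(experience_labels)
```

The same pattern (a list of raw values, a loop, and an if/elif/else chain) applies to any survey column you want to group into categories.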
New to Python? Begin with our Python Basics for Data Analysis course to build the foundational skills needed for this project.
Before You Start: Pre-Instruction
To make the most of this project walkthrough, follow these preparatory steps:
1. Review the Project
Access the project and familiarize yourself with the goals and structure:
- Start the project here
2. Access the Solution Notebook
You can view and download it here to see what we’ll be covering.
Helpful Tips
New to Markdown? We recommend learning the basics to format headers and add context to your Jupyter notebook: Markdown Guide.
For file sharing and project uploads, create a GitHub account ahead of the webinar: Sign Up on GitHub.
Want to work offline?
1. Set Up Your Workspace
We’ll work with a .ipynb file, which can be rendered in the following tools:
- Jupyter Notebook (local installation required)
- Google Colab (browser-based, no installation needed)
2. Download the Resource Files
To follow along with the webinar, you'll need two files: the Basics.ipynb Jupyter notebook, which contains all the code and analysis steps we'll explore together, and the kaggle2021-short.csv dataset, which contains the Kaggle survey responses we'll be analyzing.
Having these files downloaded before the session will enable you to code alongside Anna and gain hands-on experience with the techniques demonstrated during the walkthrough.
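Once both files are downloaded, a quick check that the dataset loads can save time before the session. The sketch below uses Python's built-in csv module. It assumes the CSV sits in the same folder as the notebook and that its first row is a header; neither detail is confirmed here, and `load_survey` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_survey(path):
    """Read a CSV file and return (header, rows) as lists of strings.

    Assumes the first row of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Assumed location: the same folder as the notebook.
data_file = Path("kaggle2021-short.csv")
if data_file.exists():
    header, rows = load_survey(data_file)
    print("Columns:", len(header))
    print("Responses:", len(rows))
else:
    print("File not found; check the download location.")
```

If the counts print without an error, the file is in place and readable.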
Next Steps
- Complete the Project: Go here to start this project in-browser.
- Share Your Work: Upload your completed project to GitHub or GitHub Gist, then share it in the Dataquest Community to receive feedback and connect with fellow learners.
- Join the next webinar on March 17 from 12:30 to 1:30 PM ET: Learn to use essential Python skills to build a fully functional word-guessing game, showcasing your ability to combine different programming techniques to create an engaging user experience. Save your spot today!