Introduction to Data Science with Python
Welcome
Welcome to the course website for Introduction to Data Science with Python, offered for GRADE Brain and other GRADE Centers at Goethe University in July 2023. This website serves as the central repository for all course materials. Here, you will find all slides, lecture materials, and links to your online development environment.
Course Objective
Data analysis plays a critical role in many academic disciplines, and the Python programming language has become one of the standard tools within the Data Science community. This course introduces programming with Python and shows how to use it for data analysis. After successfully completing this course, you will understand the fundamentals of the Python programming language and be able to carry out basic data analysis: data wrangling, data visualization, and implementing simple statistical models in Python.
Our goal is to show you the scope of possibilities within Python and to leave you confident that you can implement your own empirical projects in Python.
Course Description
This course is aimed at Python beginners. Hence, we will cover the fundamentals of programming and Python, such as variables, loops, and logic statements, before we dive into the topic of Data Science. This course will not cover deeper statistical or theoretical concepts, as we focus on applied coding.
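To give a first impression of these fundamentals, here is a minimal sketch that combines a variable, a loop, and a logic statement. The scores and the passing threshold of 70 are invented for illustration and are not taken from the course materials.

```python
# Hypothetical exam scores, invented purely for illustration.
scores = [72, 88, 95, 61]

# Collect every score that reaches the (assumed) passing threshold of 70.
passed = []
for score in scores:      # loop: visit each score in turn
    if score >= 70:       # logic statement: keep only passing scores
        passed.append(score)

# Compute the mean of all scores with built-in functions.
average = sum(scores) / len(scores)

print(passed)
print(average)
```

You can paste this into a notebook cell and change the numbers to see how the printed results change.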
This course introduces:
- Syntax and basics of Python
- How to use Notebooks as a development environment
- Data analysis, data wrangling, and data visualization using numpy, pandas, and matplotlib
- Introduction to implementing simple statistical models in Python with scikit-learn
Course Organization and Schedule
We will meet on July 28, 2023.
This workshop alternates between lecture-style presentations and application exercises. We aim to adhere to the following schedule:
Part | Content | Date | Time
---|---|---|---
1 | Course Introduction and Case Study | 28.07.2023 | 09:00 - 10:00
2 | Fundamentals of Python | 28.07.2023 | 10:00 - 12:00
3 | Data wrangling and visualizations using pandas and matplotlib | 28.07.2023 | 13:00 - 15:00
4 | Advanced visualizations with seaborn, modeling with scikit-learn | 28.07.2023 | 15:00 - 17:00
There will be a lunch break of approximately 60 minutes after the second part, and additional short breaks after each part.
We want you to get your hands dirty, which means we want you to code! Just following along with fancy slides won’t magically transfer the skill of coding to you. Actively engaging with the course content in your development environment is far more likely to do that.
That’s why we need you to prepare accordingly: please ensure that you have access to Google Colab before the course. We will use Google Colab for the coding parts so that we can use Python without (sometimes time-consuming) pre-configuration or installation on your machine. To use Google Colab, you need a Google account (the same account that is used for Gmail, YouTube, etc.).
If you have any questions, please reach out to one of us through the e-mail addresses at the bottom of this page.
Trainers
Feel free to reach out to us by e-mail if you have any questions before, during, or after the course:
Thilo Kraft, Ph.D. Student in Quantitative Marketing, send me an email
Jan Bischoff, Course Designer and Teacher at TechAcademy e.V., Business and Economics Student, send me an email
Acknowledgements
We gratefully acknowledge funding through the project DigiTeLL at Goethe University Frankfurt (partnership Coding Intro).
Thanks to Felix Schneider (GitHub), who initially developed this course and granted permission to adapt and use the course materials.
The case study used as motivation is built upon material from datasciencebox.org by Mine Çetinkaya-Rundel.