Course Information


Course Title Probability and Statistics for Data Science
Code YZM209
Semester 3rd Semester
L+U Hour 3 + 0
Credits 3.0
ECTS 5.0

Prerequisites None

Language of Instruction Turkish
Course Level Bachelor's Degree
Course Type Compulsory
Mode of delivery
Course Coordinator
Instructors Ramazan YAŞAR
Assistants
Goals The course "Probability and Statistics for Data Science" aims to teach the statistical methods and probability theory necessary for supporting decision-making processes in data science. This course provides students with in-depth knowledge of data collection, data analysis, and interpretation of results. Additionally, it aims to equip students with practical skills in understanding, organizing, and analyzing data using the Python programming language.
Course Content This course covers fundamental topics such as data literacy, data manipulation, exploratory data analysis, data visualization, statistical methods, and data preprocessing. Throughout the course, students will learn to analyze datasets using Python, apply data visualization techniques, and develop statistical models. The course will also involve extensive use of data manipulation libraries like NumPy and Pandas to enhance students' skills in data processing and analysis. Furthermore, students will have the opportunity to apply their theoretical knowledge through hands-on practice with various datasets throughout the course.
Learning Outcomes 1) Understand data literacy, population vs. sample, and observational units.
2) Comprehend variables, types of variables, and measures of central tendency.
3) Learn measures of dispersion and the significance of central tendency.
4) Grasp statistical thinking models; organize and visualize data.
5) Apply basic data manipulation techniques using NumPy.
6) Perform data manipulation with Pandas, including creating DataFrames and element operations.
7) Carry out advanced Pandas operations: grouping, merging, and data reading methods.
8) Conduct exploratory data analysis, visualize data with Python, and perform descriptive analysis.
9) Create and cross-analyze visualizations, including histograms and box plots.
10) Analyze correlation and linear relationships using scatter plot matrices and heat maps.

Weekly Topics (Content)
Week Topics Teaching and Learning Methods and Techniques Study Materials
1. Week Introduction: What is Data Literacy? Population and Sample, Observation Unit Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
2. Week Data and Variables: Variables and Types of Variables, Types of Scales, Measures of Central Tendency: Mean, Median, Mode Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
3. Week Measures of Dispersion: Quartiles, Understanding the Importance of Central Tendency, Range, Standard Deviation, Variance, Skewness, Kurtosis Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
4. Week Statistical Thinking and Data Description: Models of Statistical Thinking, Describing Data, Organizing and Reducing Data, Displaying Data Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
5. Week Data Manipulation with NumPy: Introduction to NumPy, Creating and Understanding NumPy Arrays, Reshaping Arrays, Concatenation and Splitting Arrays, Sorting Arrays Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
6. Week Data Manipulation with Pandas: Introduction to Pandas, Creating and Manipulating Pandas Series, Creating and Manipulating Pandas DataFrames, Selection of Observations and Variables, Conditional Element Operations Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
7. Week Advanced Pandas Operations: Join Operations and Advanced Merging, Aggregation and Grouping, Pivot Tables, Reading External Data, Document Reading Culture Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
8. Week Exploratory Data Analysis and Visualization: Seeing the Big Picture and Representing Data, Data Visualization in Python, Initial Look at Data and Descriptive Analysis, Examining Missing Values Lecture; Discussion

Problem Based Learning
Presentation (Including Preparation Time)
9. Week Data Visualization and Cross-Tabulation: Creating and Cross-Tabulating Bar Charts, Creating Histograms and Density Plots, Creating and Cross-Tabulating Box Plots, Creating and Cross-Tabulating Violin Plots Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
10. Week Correlation and Linear Relationship: Creating and Cross-Tabulating Correlation Plots, Demonstrating Linear Relationships, Scatter Plot Matrix and Heat Map Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
11. Week Basic Statistical Concepts: Sampling Theory and Applications, Descriptive Statistics and Applications, Confidence Intervals, Probability Distributions: Bernoulli, Binomial, Poisson Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
12. Week Hypothesis Testing: What is Hypothesis Testing? Types of Hypotheses, Types of Errors and p-value, Steps in Hypothesis Testing Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
13. Week T Tests and ANOVA: One-Sample T Test and Assumption Check, Independent Two-Sample T Test and Assumption Check, Paired Sample T Test and Assumption Check, ANOVA and Applications Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
14. Week Correlation Analysis and Conclusion: Correlation Analysis and Assumption Check, Correlation Coefficient Hypothesis Testing, Nonparametric Hypothesis and Correlation Tests, General Evaluation Lecture; Discussion
Opinion Pool
Problem Based Learning
Presentation (Including Preparation Time)
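
As a small illustration of the measures named in Weeks 2 and 3 (mean, median, mode, range, variance, standard deviation, quartiles), the following is a minimal sketch and is not taken from the course materials. The `scores` list is made-up data, and the sketch uses Python's standard-library `statistics` module so that it runs without extra dependencies; NumPy and Pandas, which the course uses, offer equivalent functions.

```python
import statistics

# Hypothetical sample of exam scores (illustrative data, not from the course).
scores = [55, 60, 60, 70, 75, 80, 95]

# Measures of central tendency (Week 2).
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (Week 3).
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)      # sample standard deviation
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} std={std_dev:.2f}")
print(f"Q1={q1} Q2={q2} Q3={q3}")
```

Note that `statistics.variance` and `statistics.stdev` compute sample statistics; the population versions are `statistics.pvariance` and `statistics.pstdev`.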

Sources Used in This Course
Recommended Sources
Alpar, R. (2011). Uygulamalı İstatistik ve Geçerlik-Güvenirlik (3rd ed.). Detay Yayıncılık.
Baykul, Y., & Güzeller, C. O. (2014). İstatistik (2nd ed.). Pegem Akademi.
Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications Ltd.
McClave, J. T., Benson, P. G., & Sincich, T. (2018). Statistics for Business and Economics (13th ed.). Pearson.
Montgomery, D. C., & Runger, G. C. (2018). Applied Statistics and Probability for Engineers (7th ed.). Wiley.
Sheskin, D. J. (2011). Handbook of Parametric and Nonparametric Statistical Procedures (5th ed.). CRC Press.
Sipahi, B., Yurtkoru, E. S., & Çinko, M. (2006). Sosyal Bilimlerde SPSS ile Veri Analizi. Beta Yayınları.
Şen, Z. (2012). İstatistiksel Hipotez Testleri (2nd ed.). Çağlayan Kitabevi.
Tabachnick, B. G., & Fidell, L. S. (2019). Using Multivariate Statistics (7th ed.). Pearson.
Yazıcıoğlu, Y., & Erdoğan, S. (2004). SPSS Uygulamalı Bilimsel Araştırma Yöntemleri. Detay Yayıncılık.

Relations with Education Attainment Program Course Competencies
Program Requirements Contribution Level DK1 DK2 DK3 DK4 DK5 DK6 DK7 DK8 DK9 DK10
PY1 5 3 3 3 3 3 3 3 3 3 3
PY2 5 3 3 3 3 3 3 3 3 3 3
PY3 5 3 3 3 3 3 3 3 3 3 3

*DK = Course's Contribution.
Level of contribution: 0 = None, 1 = Very Low, 2 = Low, 3 = Fair, 4 = High, 5 = Very High

ECTS credits and course workload
Event Quantity Duration (Hour) Total Workload (Hour)
Course Duration (Total weeks*Hours per week) 14 3 42
Work Hour outside Classroom (Preparation, strengthening) 14 2 28
Midterm Exam 1 2 2
Time to prepare for Midterm Exam 1 35 35
Final Exam 1 2 2
Time to prepare for Final Exam 1 45 45
Total Workload 154
Total Workload / 30 (h) 5.13
ECTS Credit of the Course 5
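
The workload figures follow from multiplying each event's quantity by its duration, summing the results, and dividing by 30 hours per credit, as the "Total Workload / 30" row indicates. The sketch below reproduces that arithmetic from the quantities and durations in the table. Rounding to the nearest whole credit is an assumption, chosen because it agrees with the 5.0 ECTS listed for the course.

```python
# (event, quantity, duration in hours), copied from the workload table.
events = [
    ("Course Duration", 14, 3),
    ("Work Hour outside Classroom", 14, 2),
    ("Midterm Exam", 1, 2),
    ("Time to prepare for Midterm Exam", 1, 35),
    ("Final Exam", 1, 2),
    ("Time to prepare for Final Exam", 1, 45),
]

HOURS_PER_ECTS = 30  # divisor named in the "Total Workload / 30" row

# Total workload is the sum of quantity * duration over all events.
total_hours = sum(quantity * duration for _, quantity, duration in events)

ects_raw = total_hours / HOURS_PER_ECTS
ects_credit = round(ects_raw)  # assumption: round to the nearest whole credit

for name, quantity, duration in events:
    print(f"{name}: {quantity} x {duration} = {quantity * duration} h")
print(f"Total workload: {total_hours} h")
print(f"Total workload / {HOURS_PER_ECTS}: {ects_raw:.2f}")
print(f"ECTS credit: {ects_credit}")
```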