Spring 2020 - CSE 351/519

Introduction to Data Science

Syllabus

Basic Information

  • Term: Spring 2020
  • Instructor: Pravin Pawar (pravin.pawar@sunykorea.ac.kr, Office B424, +82-32-626-1227, +82-10-8692-4908)
  • Lectures: Mon & Wed 10:30 AM - 11:50 AM
  • Office Hours: Tue & Thu 10:30 AM - 12:30 PM in B424, or by appointment. Zoom meeting invitation: https://stonybrook.zoom.us/j/4312768560
  • Course Homepage: http://ppawar.github.io/Spring2020/CSE351-S20/index.html

Course Description

This multidisciplinary course introduces both theoretical concepts and practical approaches for extracting knowledge from data. Topics include linear algebra, probability, statistics, machine learning, and programming. Using large data sets collected from real-world problems in science, technology, and medicine, we introduce how to preprocess data, identify the model that best describes the data, make predictions, evaluate the results, and finally report the results using proper visualization methods. This course also teaches state-of-the-art tools for data analysis, such as Python and its scientific libraries.

Prerequisites

CSE 214 or CSE 260; AMS 310; CSE Major

Required Texts

  • (Textbook 1) Steven Skiena, The Data Science Design Manual (Texts in Computer Science) 1st ed., Springer, 2017.
  • (Textbook 2) Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
  • (Textbook 3) Jake VanderPlas. Python Data Science Handbook: Essential Tools for Working with Data. Shroff/O'Reilly Media, Inc., 2016.

Reference Texts

  • (Reference book 1) Navin Kumar Manaswi, Deep Learning with Applications Using Python: Chatbots and Face, Object, and Speech Recognition With TensorFlow and Keras, Apress, 2018.
  • (Reference book 2) Peter Morgan, Data Analysis From Scratch With Python: Step By Step Guide for Beginners, AI Sciences, 2019.