CS 2803 Data Manipulation for Computer Scientists



Course Description

This course will provide background and experience in reading, manipulating, and exporting data for engineering, business and scientific applications. Specific topics include file I/O, string processing, web scraping, writing HTML and basic interfacing with SQL databases (reading / writing data in pre-existing tables). Students will learn to build programs controlled by basic graphical user interfaces. Assignments will be modeled after business, engineering, and scientific problems.

Learning Outcomes

Students in the class will achieve the following learning objectives:

(Competency) Students will be able to:

  1. Write programs using various data types, and using basic techniques such as assignment, method calls, while loops, for loops, and conditionals.
  2. Use and manipulate several language-provided data structures, such as lists, dictionaries, and strings.
  3. Read and write data to and from text files, both as plain text and in structured formats (such as CSV).
  4. Read a textual representation of numerical data and convert it to the appropriate (integer/floating point) data type.
  5. Load HTML pages with a program, and extract specific pieces of information from the HTML.
  6. Write a program that can generate a report in text or HTML format which includes elements under program control.
  7. Connect to existing SQL databases and insert and retrieve data from the database.
  8. Program interactive graphical user interfaces consisting of a graphically organized set of widgets, including a minimum of one from each of the following classes (Label, Button, Text Field).
  9. Implement simple business or mathematical algorithms (calculating interest payments, averaging a row of data, calculating standard deviation) in a program.
  10. Use compound data structures provided by the programming language such as lists, arrays, and dictionaries to hold sequences or sets of data, including two-dimensional (tabular) data.
  11. Use objects and associated methods provided by the programming language.
  12. Write programs that are easy to understand so that others may modify and improve them.
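As an illustration of outcomes 3, 4, and 9, the sketch below reads CSV-formatted text, converts the numeric fields to floating point, and computes each row's average and standard deviation. The syllabus does not name a programming language, so Python is an assumption here; the sample data and the function name `summarize_rows` are hypothetical and not part of any course assignment.

```python
import csv
import io
import statistics

# Hypothetical sample data; a real assignment would likely read from a file on disk.
SAMPLE_CSV = """name,q1,q2,q3
alice,90,85.5,77
bob,60,70,80
"""


def summarize_rows(csv_text):
    """Return {name: (mean, sample standard deviation)} for each data row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    summary = {}
    for row in reader:
        name = row[0]
        # Convert the textual fields to floating-point numbers.
        scores = [float(field) for field in row[1:]]
        summary[name] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


result = summarize_rows(SAMPLE_CSV)
for name, (mean, stdev) in result.items():
    print(f"{name}: mean={mean:.2f} stdev={stdev:.2f}")
```

Note that `statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would give the population version instead.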

(Movement) Students will increase their:

  1. Familiarity with compound data structures (lists, arrays, dictionaries), including nested data structures (such as multi-dimensional arrays) and indexing into multi-dimensional data structures.
  2. Speed and accuracy in converting problem statements into programs.
  3. Understanding of and ability to quickly use basic program structures such as iteration, conditionals, and function calls due to repeated practice of these concepts.
  4. Understanding of the event-driven programming model, specifically as applied to graphical user interfaces.
  5. Ability to break a medium-sized problem down into smaller parts and solve each sub-problem individually.
  6. Ability to test and debug programs.

(Experience) Students will:

  1. Practice the process of constructing moderately sized (100-300 line) programs from written requirements.
  2. Deal with data that may include missing elements or malformed representations.
  3. Work in pairs to solve programming problems.
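Experience item 2 (data with missing elements or malformed representations) might look like the following in practice. This is a minimal sketch, assuming Python and a policy of mapping unusable fields to `None`; the helper name `parse_number` and the sample fields are hypothetical.

```python
def parse_number(text):
    """Convert text to an int or float; return None if it is missing or malformed."""
    text = text.strip()
    if not text:
        return None  # missing element
    try:
        return int(text)
    except ValueError:
        pass  # not an integer; try floating point next
    try:
        return float(text)
    except ValueError:
        return None  # malformed representation


# Hypothetical raw fields, as they might appear in a messy data file.
raw_fields = ["12", " 3.5 ", "", "N/A", "7e2", "1,000"]
parsed = [parse_number(field) for field in raw_fields]
valid = [value for value in parsed if value is not None]
```

Returning `None` rather than raising an exception lets the caller decide whether to skip, count, or report the bad fields.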



Grade Cutoffs: A: 90, B: 80, C: 70, D: 60. No rounding.


Coursework consists of two or three in-class written midterm exams, a final exam, short in-class quizzes, and 7-12 homework assignments. Your last homework assignment may be due the week preceding final exams. Assignments must be turned in before the date and time indicated as the assignment’s due date.

Class Participation

In-class exercises cannot be made up if you do not attend the class. It’s a violation of the Academic Honor Code to submit work or sign in for other students.

Academic Integrity and Collaboration

We expect academic honor and integrity from students. Please study and follow the academic honor code of Georgia Tech: http://www.honor.gatech.edu/content/2/the-honor-code. You may collaborate on homework assignments, but your submissions must be your own. You may not collaborate on in-class programming quizzes or exams.

Due Dates, Late Work, and Missed Work


To contest any grade you must submit an official regrade form to the Head TA within one week of the assignment’s original return date. The original return date is the date the exam was first made available for students to pick up or the grade was posted online in the case of homework assignments and programming quizzes. Note that a regrade means just that – we will regrade your assignment from scratch, which means you may end up with a lower score after the regrade.

Course Outline

This outline applies to the Fall and Spring semesters. The Summer schedule is compressed into 11 instructional weeks.


At least one of:

Course Materials

Note: O’Reilly books listed below are available through Georgia Tech’s Safari Online subscription. See http://www.library.gatech.edu/search/ebooks.php


The Institute does not discriminate against individuals on the basis of race, color, religion, sex, national origin, age, disability, sexual orientation, gender identity, or veteran status in the administration of admissions policies, educational policies, employment policies, or any other Institute governed programs and activities. The Institute’s equal opportunity and non-discrimination policy applies to every member of the Institute community.

For more details see http://www.policylibrary.gatech.edu/policy-nondiscrimination-and-affirmative-action