Doctorbase

Introduction

A common task in data management systems is receiving data from an external system in XML, JSON, or CSV files and storing it in a relational database. Python includes libraries for easily reading these files and interacting with databases. We store data in databases for record keeping and so that we can answer questions about the data.
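As a minimal sketch of the reading step, the three formats can be parsed with Python's standard library alone. The sample data and the field names id and name below are assumptions made for illustration; the actual assignment files may use different layouts.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real files' element and column names may differ.
XML_TEXT = "<doctors><doctor><id>1</id><name>Ada Park</name></doctor></doctors>"
JSON_TEXT = '[{"id": 2, "name": "Ben Ortiz"}]'
CSV_TEXT = "id,name\n3,Cy Wong\n"


def parse_xml(text):
    """Return one dict per <doctor> element."""
    root = ET.fromstring(text)
    return [
        {"id": int(d.findtext("id")), "name": d.findtext("name")}
        for d in root.iter("doctor")
    ]


def parse_json(text):
    """Return one dict per object in a top-level JSON array."""
    return [{"id": int(r["id"]), "name": r["name"]} for r in json.loads(text)]


def parse_csv(text):
    """Return one dict per data row, using the header row for keys."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"id": int(r["id"]), "name": r["name"]} for r in reader]


# Normalize all three sources into the same list-of-dicts shape.
records = parse_xml(XML_TEXT) + parse_json(JSON_TEXT) + parse_csv(CSV_TEXT)
```

Converting every source into the same list-of-dicts shape means the database code only has to handle one record format, whichever file the data came from.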

In this assignment you will

Problem Description

You’re starting a new job as a junior data scientist at the Centers for Disease Control and Prevention (CDC). In this role you manage and analyze data about doctors and patient care. You receive doctor data from outside sources as XML, CSV, and JSON files, insert it into your database, and then query the database to answer questions about the data.

Solution Description

Write a Python script that imports data from files and inserts it into the database. Create your database with the schema script doctors-schema.sql, and read that script first so that you understand the database before you write your import code.

Import data into your database

Write a Python script named import_doctors.py that takes five command-line arguments:

Your script should insert information from the files above into the appropriate tables in the database with appropriate key and foreign key values.
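One way to keep key and foreign key values consistent is to look up (or create) the parent row first and reuse its key when inserting the child row. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The specialty and doctor tables, their columns, and the helper names are hypothetical stand-ins, not the contents of doctors-schema.sql, and your course database may be a different system.

```python
import sqlite3

# Hypothetical two-table schema for illustration only; use doctors-schema.sql
# for the real assignment.
SCHEMA = """
CREATE TABLE specialty (
    specialty_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE doctor (
    doctor_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    specialty_id INTEGER NOT NULL REFERENCES specialty(specialty_id)
);
"""


def get_or_create_specialty(conn, name):
    """Return the key of the specialty row, inserting it if it is missing."""
    row = conn.execute(
        "SELECT specialty_id FROM specialty WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute("INSERT INTO specialty (name) VALUES (?)", (name,))
    return cur.lastrowid


def insert_doctors(conn, records):
    """Insert each doctor with a foreign key pointing at its specialty."""
    for rec in records:
        specialty_id = get_or_create_specialty(conn, rec["specialty"])
        conn.execute(
            "INSERT INTO doctor (doctor_id, name, specialty_id) VALUES (?, ?, ?)",
            (rec["id"], rec["name"], specialty_id),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)

# Hypothetical records, shaped like the output of a file-parsing step.
sample_records = [
    {"id": 1, "name": "Ada Park", "specialty": "Cardiology"},
    {"id": 2, "name": "Ben Ortiz", "specialty": "Oncology"},
    {"id": 3, "name": "Cy Wong", "specialty": "Cardiology"},
]
insert_doctors(conn, sample_records)
```

The parameterized queries (the ? placeholders) let the database driver handle quoting, which avoids both broken statements and SQL injection when names contain apostrophes.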

Query your database

Write a SQL script, doctors-queries.sql, that includes queries to answer the following questions:

Your doctors-queries.sql should contain only the SELECT queries requested above.
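Before submitting, it can help to try each SELECT query against a small database whose correct answer you already know. The sketch below does this from Python with sqlite3. The doctor table, its columns, and the example question ("how many doctors are in each specialty?") are invented for illustration and are not taken from the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, small enough to check the answer by hand.
conn.executescript("""
CREATE TABLE doctor (doctor_id INTEGER PRIMARY KEY, name TEXT, specialty TEXT);
INSERT INTO doctor VALUES
    (1, 'Ada Park', 'Cardiology'),
    (2, 'Ben Ortiz', 'Oncology'),
    (3, 'Cy Wong', 'Cardiology');
""")

# An example SELECT of the kind a queries script might contain.
QUERY = """
SELECT specialty, COUNT(*) AS doctor_count
FROM doctor
GROUP BY specialty
ORDER BY specialty;
"""

rows = conn.execute(QUERY).fetchall()
for specialty, doctor_count in rows:
    print(specialty, doctor_count)
```

Only the SELECT statement itself would belong in a queries file; the table creation and sample rows here exist just to make the check self-contained.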

Tips and Considerations

Turn-in Procedure

Submit your import_doctors.py and doctors-queries.sql files to the assignment on Canvas as attachments.

Verify the Success of Your Submission to Canvas

Practice safe submission! Verify that your HW files were submitted correctly, that the upload was successful, and that your program runs with no syntax or runtime errors. It is solely your responsibility to turn in your homework and to verify your submission.