scrapeta Module
In this assignment you’ll practice
This is an individual assignment.
Collaboration at a reasonable level will not result in substantially similar code. Students may only collaborate with fellow students currently taking this course, the TAs, and the lecturer. Collaboration means talking through problems, assisting with debugging, explaining a concept, etc. You should not exchange code or write code for others.
Notes:
You’re a CS 2316 and CS 4400 student and you need to get attraction information and put it into a database.
Here’s a skeleton scrapeta.py to get you started: scrapeta.py
Fill in all the parts with YOUR CODE HERE comments. Read all the comments, which provide a great deal of help. The framework of the script is already written. You only have to write code to:
You’ll need a database. Here’s a database schema script: attraction-schema.sql
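As a rough illustration of the "put it into a database" step, here is a minimal sketch of a parameterized insert. It makes several assumptions that are not part of the assignment: it uses Python's built-in sqlite3 with an in-memory database instead of the course database, and the attraction table, its columns, the save_attraction helper, and the sample row are all hypothetical stand-ins. The real table definitions are in attraction-schema.sql.

```python
import sqlite3

# ASSUMPTION: an in-memory SQLite database stands in for the course database.
conn = sqlite3.connect(":memory:")

# ASSUMPTION: hypothetical table and columns; use attraction-schema.sql for the real ones.
conn.execute(
    "CREATE TABLE attraction ("
    "name TEXT PRIMARY KEY, address TEXT, nearest_transit TEXT)"
)


def save_attraction(conn, name, address, nearest_transit):
    """Insert one row, using placeholders so the driver handles quoting."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO attraction (name, address, nearest_transit) "
            "VALUES (?, ?, ?)",
            (name, address, nearest_transit),
        )


# Illustrative sample row, not data supplied by the assignment.
save_attraction(
    conn,
    "Eiffel Tower",
    "5 Avenue Anatole France, 75007 Paris, France",
    "Tour Eiffel",
)
rows = conn.execute("SELECT name, nearest_transit FROM attraction").fetchall()
print(rows)
```

Passing values as a tuple alongside placeholders, rather than building the SQL string by hand, avoids quoting problems with names that contain apostrophes or accented characters. Other database drivers follow the same pattern, though the placeholder symbol may differ.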
Install requests and beautifulsoup4 with conda.
Use the Geocoding API to get lat/long coordinates to use in the Places API (more below).
In [1]: import requests
In [2]: geocode_api = "https://maps.googleapis.com/maps/api/geocode/json"
In [3]: resp = requests.get(geocode_api,
   ...:                     params={"address": "5 Avenue Anatole France, 75007 Paris, France",
   ...:                             "key": "YOUR_API_KEY"})
   ...:
In [4]: resp.json()
Out[4]:
{'results': [{'address_components': [{'long_name': '5',
'short_name': '5',
'types': ['street_number']},
{'long_name': 'Avenue Anatole France',
'short_name': 'Avenue Anatole France',
'types': ['route']},
{'long_name': 'Paris',
'short_name': 'Paris',
'types': ['locality', 'political']},
{'long_name': 'Paris',
'short_name': 'Paris',
'types': ['administrative_area_level_2', 'political']},
{'long_name': 'Île-de-France',
'short_name': 'Île-de-France',
'types': ['administrative_area_level_1', 'political']},
{'long_name': 'France',
'short_name': 'FR',
'types': ['country', 'political']},
{'long_name': '75007', 'short_name': '75007', 'types': ['postal_code']}],
'formatted_address': '5 Avenue Anatole France, 75007 Paris, France',
'geometry': {'location': {'lat': 48.8582681, 'lng': 2.2945145},
'location_type': 'ROOFTOP',
'viewport': {'northeast': {'lat': 48.85961708029149,
'lng': 2.295863480291502},
'southwest': {'lat': 48.85691911970849, 'lng': 2.293165519708498}}},
'place_id': 'ChIJuX7JjuFv5kcRbLER0b_rtC4',
'types': ['street_address']}],
'status': 'OK'}
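The coordinates you need sit a few levels deep in that response, under results, geometry, location. A minimal sketch of pulling them out follows. The helper name extract_lat_lng is hypothetical, and the sample dictionary is a trimmed copy of the output above so the snippet runs without an API key. In your script you would pass resp.json() instead.

```python
def extract_lat_lng(geocode_json):
    """Return (lat, lng) from the first geocoding result, or None if there is none."""
    if geocode_json.get("status") != "OK" or not geocode_json.get("results"):
        return None
    location = geocode_json["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Trimmed copy of the Geocoding response shown above.
sample = {
    "results": [
        {"geometry": {"location": {"lat": 48.8582681, "lng": 2.2945145}}}
    ],
    "status": "OK",
}

coords = extract_lat_lng(sample)

# The Places API "location" parameter is a "lat,lng" string.
location_param = "{},{}".format(*coords)
print(location_param)
```

Checking status before indexing into results means an address that cannot be geocoded returns None instead of raising an IndexError.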
Use the Places API with the lat/long you got from the Geocoding API to get the nearest transit station.
In [41]: places_api = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
In [42]: resp = requests.get(places_api,
    ...:                     params={"key": "YOUR_API_KEY",
    ...:                             "location": "48.8582681, 2.2945145",
    ...:                             "type": "transit_station",
    ...:                             "rankby": "distance"})
    ...:
In [43]: resp.json()
Out[43]:
{'html_attributions': [],
'next_page_token': 'CrQCLAEAAA6e9MN3c5daZEWeP9hKUiwX9KNiBlc_OmDJkMGk56CikUIkwnv0Q76P-PR98bhxYqxpRJpYrvxjR0WaFsx-zcp1hjwGplzkf6o-eJDZEfh5rC3QfzS5GVyAJG1VVZrnxfZzhpTAyazc1DgGVIApyPK_Bi4huK7bAOkz23Xeut1uWO6giQZFiY8fWD2V2zmsFClHyjTpzjgnGZaiSDXKXmktkKh-0NWexMPpCvwyxM7uQAvmXBykdLxmGxfYG_RjPdLYXJQGzbRqzfH-jTzPqd7cY3Ptgz3gX9O7wQTQASgLTzUAicVEhckcLo6BrrlujfRYETdKz8VzgY8Ap2qnsEJSYmPP9xoWqhM8JeNcMjhM0zC7WTaA_Ebf9DWJKLfxCH_szBMLwdecNrKklXNJ7pESECMdnqOEKaG19-9CGnOSkBMaFL0XtFECVIve_AD1UbK95Z3iuUrv',
'results': [{'geometry': {'location': {'lat': 48.8582627, 'lng': 2.292555},
'viewport': {'northeast': {'lat': 48.85963648029149,
'lng': 2.293859880291501},
'southwest': {'lat': 48.85693851970849, 'lng': 2.291161919708498}}},
'icon': 'https://maps.gstatic.com/mapfiles/place_api/icons/bus-71.png',
'id': '9734ec111dac8c7a6dc3f2679f9071709520bbba',
'name': 'Tour Eiffel',
'photos': [{'height': 3264,
'html_attributions': ['<a href="https://maps.google.com/maps/contrib/102179844627858398764/photos">Mélody P</a>'],
'photo_reference': 'CmRaAAAAFn4Ufykt5FMDQiKgZiacdpfbG995mV32R_jNik9G0lWrUTxJ808bxK1nLGN7NEAA2oGlXYV9z-8-PNIZL2rU3QC7OfFvL1sw4ykLgygiDhJb9x4rm5IIYGO54PYRPIs4EhAOiY8h-CWHP5Pr8EhEgI13GhRZ6GOPiOVOLSA_V2cloiMKv1NoVQ',
'width': 4928}],
'place_id': 'ChIJ64R9a-Jv5kcR-BW0JxItLhI',
'plus_code': {'compound_code': 'V75V+82 Paris, France',
'global_code': '8FW4V75V+82'},
'rating': 4.5,
'reference': 'CmRRAAAATcmuQjK6hTsm6HFqML8iZxrs2LNCbNOdMiBdqITfEOOlnzcuho0Xj1V95Afd8k8nYopPiEbT7Ii4FeFbr7rppClLY5g30QlUAGEBjFzeH1qCN70GoBgH-V6mK3-qqBk4EhCW7T6Csfj4Wlr3bLOD6rYmGhQhWIqIBtsUbZIuFWOl5ARFdfmqAg',
'scope': 'GOOGLE',
'types': ['bus_station',
'transit_station',
'point_of_interest',
'establishment'],
'vicinity': 'France'},
... many, many more
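Because the request uses rankby=distance, the results come back ordered from nearest to farthest, so the first entry is the nearest transit station. A minimal sketch of reading its name follows. The helper name nearest_transit_name is hypothetical, and the sample dictionary is a trimmed copy of the first result above. In your script you would pass resp.json() instead.

```python
def nearest_transit_name(places_json):
    """Return the name of the first (nearest) result, or None if there are no results."""
    results = places_json.get("results", [])
    if not results:
        return None
    return results[0]["name"]


# Trimmed copy of the first Places result shown above.
sample_places = {
    "html_attributions": [],
    "results": [
        {
            "name": "Tour Eiffel",
            "types": ["bus_station", "transit_station",
                      "point_of_interest", "establishment"],
            "vicinity": "France",
        }
    ],
}

print(nearest_transit_name(sample_places))
```

Returning None for an empty result list lets the calling code decide what to store when no transit station is found near an attraction.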
Submit your scrapeta.py file on Canvas as an attachment. When you’re ready, double-check that you have submitted and not just saved a draft.