{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Week 2: Python Review" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data types revisited\n", "\n", "### Let's start with some data\n", "\n", "Let's look at some of the variables related to [metro stations in Los Angeles](https://developer.metro.net/). For each station, several pieces of information are given, including the name of the station, the line number (`linenum`), the stop number (`stopnum`), its latitude (`lat`), and its longitude (`long`). We can store this information for a given station in Python as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station = 'Westwood / Rancho Park'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "linenum = 806" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "stopnum = 80134" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lat = 34.0368" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "long = -118.425" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have 5 values assigned to variables related to a single metro station. Each variable has a unique name, and the variables can store different types of data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reminder: Data types and their compatibility\n", "\n", "We can explore the different types of data stored in variables using the `type()` function.\n", "Let's use the cells below to check the data types of the variables `station`, `linenum`, and `lat`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(linenum)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(lat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, we see that `station` is a character string, `linenum` is an integer, and `lat` is a floating point number." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Note**\n", "\n", "Remember, the data types are important because some are not compatible with one another. What happens when you try to add the variables `station` and `linenum` in the cell below?\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we get a `TypeError` because Python does not know to combine a string of characters (`station`) with an integer value (`linenum`). Do you remember how to fix it? Use the cell below to print `station` and `linenum` in one line of code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Converting data from one type to another\n", "\n", "It is not the case that things like the `station` and `linenum` cannot be combined at all, but in order to combine a character string with a number we need to perform a *data type conversion* to make them compatible. Let's convert `linenum` to a character string using the `str()` function. We can store the converted variable as `linenum_str`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "linenum_str = str(linenum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can confirm the type has changed by checking the type of `linenum_str`, or by checking the output when you type the name of the variable into a cell and running it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(linenum_str)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "linenum_str" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, `str()` converts a numerical value into a character string with the same numbers as before." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**Note**\n", "\n", "Similar to using `str()` to convert numbers to character strings, `int()` can be used to convert strings or floating point numbers to integers and `float()` can be used to convert strings or integers to floating point numbers.\n", "\n", "
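As a minimal sketch of these conversions (the variable names and values below are hypothetical, chosen only for illustration), each function can be tried on a small value:

```python
# Hypothetical values used only to illustrate the conversion functions.
lat_text = '34.0368'           # a number stored as a character string
lat_value = float(lat_text)    # string -> floating point number
lat_whole = int(lat_value)     # float -> integer (the fractional part is dropped)
linenum_float = float(806)     # integer -> floating point number
stop_value = int('80134')      # string -> integer

print(lat_value, lat_whole, linenum_float, stop_value)
```

Note that `int()` drops the fractional part of a floating point number rather than rounding it.
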
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Combining text and numbers\n", "\n", "Although most mathematical operations operate on numerical values, a common way to combine character strings is using the addition operator `+`. Let's create a text string in the variable `station_name_and_id` that is the combination of the `station` and `linenum` variables. Once we define `station_name_and_id`, we can print it to the screen to see the result." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_name_and_id = station + \": \" + str(linenum)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_name_and_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that here we are converting `linenum` to a character string using the `str()` function within the assignment to the variable `station_name_and_id`. Alternatively, we could have simply added `station` and `linenum_str`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lists and indices\n", "\n", "Above we have seen a bit of data related to one of the stations on the LA Metro Expo line. Rather than having individual variables for each of those stations, we can store many related values in a *collection*. The simplest type of collection in Python is a **list**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating a list\n", "\n", "Let’s first create a list of selected `station_name` values and print it to the screen." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names = ['Pico', 'Culver City', 'Westwood / Rancho Park', 'Downtown Santa Monica']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also check the type of the `station_names` list using the `type()` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have 4 station names in a list called `station_names`. As you can see, the `type()` function recognizes this as a list. Lists can be created using the square brackets `[` and `]`, with commas separating the values in the list." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Index values\n", "\n", "To access an individual value in the list we need to use an **index value**. An index value is a number that refers to a given position in the list. Let’s check out the first value in our list as an example by printing out `station_names[1]`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wait, what? This is the second value in the list we’ve created, so what is wrong? As it turns out, Python (like many other programming languages) starts counting the values stored in collections at index `0`. Thus, to get the value for the first item in the list, we must use index `0`. Let's print out the value at index `0` of `station_names` below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, that makes sense, but it may take some getting used to..." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Number of items in a list\n", "\n", "We can find the length of a list using the `len()` function. Use it below to check the length of the `station_names` list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Index value tips\n", "\n", "If we know the length of the list, we can now use it to find the value of the last item in the list, right? What happens if you print the value from the `station_names` list at index `4`, the value of the length of the list?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "print(station_names[4])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An `IndexError`? That’s right, since our list starts with index `0` and has 4 values, the index of the last item in the list is `len(station_names) - 1`. That isn’t ideal, but fortunately there’s a nice trick in Python to find the last item in a list. Let's first print the `station_names` list to remind us of the values that are in it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To find the value at the end of the list, we can print the value at index `-1`. To go further up the list in reverse, we can simply use larger negative numbers, such as index `-4`. Let's print out the values at these indices below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names[-1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names[-4])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yes, in Python you can go backwards through lists by using negative index values. 
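
The relationship between negative and positive index values can be sketched as follows. This is only an illustration, and `demo_stops` is a hypothetical list that is not part of the lesson data:

```python
# Hypothetical list used only for this sketch.
demo_stops = ['A', 'B', 'C', 'D']
n = len(demo_stops)

# For every valid negative index -k, demo_stops[-k] is the same item
# as demo_stops[n - k], so -1 maps to n - 1 and -n maps to 0.
for k in range(1, n + 1):
    assert demo_stops[-k] == demo_stops[n - k]

last_item = demo_stops[-1]    # same as demo_stops[n - 1]
first_item = demo_stops[-n]   # same as demo_stops[0]
print(last_item, first_item)
```
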
Index `-1` gives the last value in the list and index `-len(station_names)` would give the first. Of course, you still need to keep the index values within their ranges. What happens if you check the value at index `-5`?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "print(station_names[-5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Modifying list values\n", "\n", "Another nice feature of lists is that they are *mutable*, meaning that the values in a list that has been defined can be modified." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's change the value for `station_names[1]` to be `'Palms'` and print out the `station_names` list again." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names[1] = 'Palms'\n", "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data types in lists\n", "\n", "Lists can also store more than one type of data. Suppose that, in addition to having separate lists of station names, line numbers, latitudes, etc., we would like to have a list of all of the values for the station ‘Westwood / Rancho Park’." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_westwood_rancho = [station, linenum, lat, long, stopnum]\n", "print(station_westwood_rancho)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have one list with 3 different types of data in it. We can confirm this using the `type()` function. Let's check the type of `station_westwood_rancho`, then the types of the values at indices `0` to `2` in the cells below." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station_westwood_rancho)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station_westwood_rancho[0]) # The station name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station_westwood_rancho[1]) # The line number" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(station_westwood_rancho[2]) # The station latitude" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding and removing values from lists\n", "\n", "Finally, we can add and remove values from lists to change their lengths. Let’s consider that we no longer want to include the first value in the `station_names` list. Since we haven't seen that list in a bit, let's first print it to the screen." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`del` allows values in lists to be removed. It can also be used to delete variables from memory in Python. To remove the first value from the `station_names` list, we can simply type `del station_names[0]`. If you then print out the `station_names` list, you should see the first value has been removed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "del station_names[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we would instead like to add a few values to the `station_names` list, we can type `station_names.append('List item to add')`, where `'List item to add'` would be the text that would be added to the list in this example. 
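
As a small sketch of how `del` and `append()` change a list, consider the following. The list `demo_stops` and its values are hypothetical and used only for illustration:

```python
# Hypothetical list used only for this sketch.
demo_stops = ['A', 'B', 'C']

del demo_stops[0]                  # removes the first value, 'A'
result = demo_stops.append('D')    # adds 'D' to the end of the list

print(demo_stops)
print(result)   # append() changes the list in place and returns None
```

Because `append()` changes the list in place, its return value should not be assigned back to the list variable.
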
Let's add two values to our list in the cells below: `'Pico'` and `'Farmdale'`. After doing this, let's check the list contents by printing to the screen." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names.append('Pico')\n", "station_names.append('Farmdale')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, we add values one at a time using `station_names.append()`. `list.append()` is called a method in Python, which is a function that works for a given data type (a list in this case). We’ll see some other examples of useful list methods below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Some other useful list methods\n", "\n", "With lists we can do a number of useful things, such as count the number of times a value occurs in a list or find where it occurs. The `list.count()` method can be used to find the number of instances of an item in a list. For instance, we can check to see how many times `'Palms'` occurs in our list `station_names` by typing `station_names.count('Palms')`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names.count('Palms') # The count method counts the number of occurrences of a value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we can use the `list.index()` method to find the index value of a given item in a list. Let's use the cell below to find the index of `'Palms'` in the `station_names` list." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names.index('Palms') # The index method gives the index value of an item in a list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The good news here is that our selected station name is only in the list once. Should we need to modify it for some reason, we also now know where it is in the list (index `0`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting a list\n", "\n", "The `list.sort()` method sorts the items of a list in place. Let's sort our `station_names` list and print its contents below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_names.sort() # Notice no output here..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(station_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the list has been sorted alphabetically using the `list.sort()` method, but there is no screen output when this occurs. This is because `list.sort()` modifies the list in place and returns `None`. If you were to assign that output to `station_names`, the list would get sorted, but the variable would then be reassigned to `None`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Looping\n", "Loops in Python let us iterate over a list. They come in handy when we are dealing with lots of data and we want to do something to each row in the data.\n", "\n", "Pay close attention to the syntax in a `for` loop. Notice the colon at the end of the `for` statement, and the indentation that follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# loop through a list\n", "for station in station_names:\n", "    print(station)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's do something to each element of our list. 
For example, we can add some text to each element:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for station in station_names:\n", "    print('Expo line station: ' + station)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# enumerate lets you loop through a list and count along\n", "for count, station in enumerate(station_names):\n", "    print(count, station)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }