{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "ew7je-M7tXgG" }, "source": [ "# 2. Accessing Data\n", "\n", "*Damian Trilling and Penny Sheets*\n", "\n", "This notebook is meant to show you different ways of accessing data. Data can be available as (a) local files (on your computer), (b) remote files (somewhere else), or (c) APIs (application programming interfaces). We will show you ways of dealing with all of these." ] }, { "cell_type": "markdown", "metadata": { "id": "wM-SYGSDtXgL" }, "source": [ "But before we do that, we need to import some modules into Jupyter that will help us find and read data. You already know our basic module, pandas. Let's import it again just in case your computer cleared it during the break (or in case you're doing this notebook again separately, after class)." ] }, { "cell_type": "markdown", "metadata": { "id": "i39kXB4HtXgP" }, "source": [ "### Importing Modules\n", "It is good practice to import all modules that you need at the beginning of your notebook. We'll explain in the lesson (or in subsequent weeks) what these modules do." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "Yccoxp9ftXgR" }, "outputs": [], "source": [ "import pandas as pd\n", "from pprint import pprint\n", "import json\n", "import matplotlib.pyplot as plt\n", "from collections import Counter\n", "import requests\n", "import seaborn as sns\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": { "id": "0U8D2UaItXgk" }, "source": [ "## CSV files" ] }, { "cell_type": "markdown", "metadata": { "id": "y5J3nHyOtXgm" }, "source": [ "Remember what we did in the first part of class today, working with that Iris dataset? We used pandas to read a CSV file directly from the web and displayed its descriptive statistics." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "gNWce38CtXgp" }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th><th>species</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"<tr><th>1</th><td>4.9</td><td>3.0</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"<tr><th>2</th><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>\n",
"<tr><th>3</th><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
"<tr><th>4</th><td>5.0</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>145</th><td>6.7</td><td>3.0</td><td>5.2</td><td>2.3</td><td>virginica</td></tr>\n",
"<tr><th>146</th><td>6.3</td><td>2.5</td><td>5.0</td><td>1.9</td><td>virginica</td></tr>\n",
"<tr><th>147</th><td>6.5</td><td>3.0</td><td>5.2</td><td>2.0</td><td>virginica</td></tr>\n",
"<tr><th>148</th><td>6.2</td><td>3.4</td><td>5.4</td><td>2.3</td><td>virginica</td></tr>\n",
"<tr><th>149</th><td>5.9</td><td>3.0</td><td>5.1</td><td>1.8</td><td>virginica</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>150 rows × 5 columns</p>
" ], "text/plain": [ " sepal_length sepal_width petal_length petal_width species\n", "0 5.1 3.5 1.4 0.2 setosa\n", "1 4.9 3.0 1.4 0.2 setosa\n", "2 4.7 3.2 1.3 0.2 setosa\n", "3 4.6 3.1 1.5 0.2 setosa\n", "4 5.0 3.6 1.4 0.2 setosa\n", ".. ... ... ... ... ...\n", "145 6.7 3.0 5.2 2.3 virginica\n", "146 6.3 2.5 5.0 1.9 virginica\n", "147 6.5 3.0 5.2 2.0 virginica\n", "148 6.2 3.4 5.4 2.3 virginica\n", "149 5.9 3.0 5.1 1.8 virginica\n", "\n", "[150 rows x 5 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')\n", "iris" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "MEi6H2MEtXgz" }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>sepal_length</th><th>sepal_width</th><th>petal_length</th><th>petal_width</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>150.000000</td><td>150.000000</td><td>150.000000</td><td>150.000000</td></tr>\n",
"<tr><th>mean</th><td>5.843333</td><td>3.057333</td><td>3.758000</td><td>1.199333</td></tr>\n",
"<tr><th>std</th><td>0.828066</td><td>0.435866</td><td>1.765298</td><td>0.762238</td></tr>\n",
"<tr><th>min</th><td>4.300000</td><td>2.000000</td><td>1.000000</td><td>0.100000</td></tr>\n",
"<tr><th>25%</th><td>5.100000</td><td>2.800000</td><td>1.600000</td><td>0.300000</td></tr>\n",
"<tr><th>50%</th><td>5.800000</td><td>3.000000</td><td>4.350000</td><td>1.300000</td></tr>\n",
"<tr><th>75%</th><td>6.400000</td><td>3.300000</td><td>5.100000</td><td>1.800000</td></tr>\n",
"<tr><th>max</th><td>7.900000</td><td>4.400000</td><td>6.900000</td><td>2.500000</td></tr>\n",
"</tbody>\n",
"</table>
" ], "text/plain": [ " sepal_length sepal_width petal_length petal_width\n", "count 150.000000 150.000000 150.000000 150.000000\n", "mean 5.843333 3.057333 3.758000 1.199333\n", "std 0.828066 0.435866 1.765298 0.762238\n", "min 4.300000 2.000000 1.000000 0.100000\n", "25% 5.100000 2.800000 1.600000 0.300000\n", "50% 5.800000 3.000000 4.350000 1.300000\n", "75% 6.400000 3.300000 5.100000 1.800000\n", "max 7.900000 4.400000 6.900000 2.500000" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iris.describe()" ] }, { "cell_type": "markdown", "metadata": { "id": "35eH99AetXg8" }, "source": [ "If we want to, we could also plot a histogram:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "_Ufz7J4ktXg-" }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXIAAAD4CAYAAADxeG0DAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAPw0lEQVR4nO3cf4xld1nH8fdjF+Oygy1kcVwW4mBCGrETkJ3UYhMyY4VUaigkmNAodvmRRQMEZROz8oeSEJL+QcFojFooUiN0gtBK0xakqR0aEiXOlupsXQkIa+m27FIpW6Y2wYHHP+ZMMszO3Hvm/n6W9yuZzL3nxz2f+70znzn3zLknMhNJUl0/Me4AkqT+WOSSVJxFLknFWeSSVJxFLknF7Rnlxvbv358zMzOj3OSPeOqpp9i3b9/Ytt9WlZxQJ6s5B6tKTqiTtVPO48ePP56Zz91x5cwc2dehQ4dynO67776xbr+tKjkz62Q152BVyZlZJ2unnMByduhWD61IUnEWuSQVZ5FLUnEWuSQVZ5FLUnEWuSQVZ5FLUnEWuSQVZ5FLUnEj/Yi+apg5dlfrZY/OrnF4F8t3c+qGawb2WNKPC/fIJak4i1ySirPIJak4i1ySirPIJak4i1ySivP0Q4ndnXI5aJ5yqX65Ry5JxVnkklScRS5JxVnkklRc1yKPiBdExH0RcTIiHoqIdzXT3xsRpyPiwebr1cOPK0naqs1ZK2vA0cx8ICKeBRyPiHuaeR/KzA8ML54kqZuuRZ6ZjwGPNbe/FxEngYPDDiZJaicys/3CETPA/cBlwLuBw8CTwDLre+1PbLPOEeAIwPT09KHFxcW+Q/dqdXWVqampsW2/rXHnXDl9rvWy03vhzNOD2/bswYsH92CbdBvT3TznQdv8nMf92rdVJSfUydop58LCwvHMnNtp3dZFHhFTwBeA92fmbRExDTwOJPA+4EBmvrnTY8zNzeXy8nKr7Q3D0tIS8/PzY9t+W+POudvrkd+4MrjPlQ3rwzHdxnRSPhA07te+rSo5oU7WTjkjomORtzprJSKeAXwa+Hhm3gaQmWcy8weZ+UPgw8Dluw
0uSepfm7NWArgZOJmZH9w0/cCmxV4HnBh8PElSN23eE18JvBFYiYgHm2nvAa6LiJeyfmjlFPC2IeSTJHXR5qyVLwKxzay7Bx9HkrRbfrJTkoqzyCWpOItckoqzyCWpOItckoqzyCWpOItckoqzyCWpOItckoob3GXrLmCjvjLe0dk1Dh+7a2hXApR0YXGPXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKK61rkEfGCiLgvIk5GxEMR8a5m+nMi4p6I+Grz/dnDjytJ2qrNHvkacDQzfwG4Anh7RLwYOAbcm5kvAu5t7kuSRqxrkWfmY5n5QHP7e8BJ4CBwLXBLs9gtwGuHlFGS1EFkZvuFI2aA+4HLgIcz85JN857IzPMOr0TEEeAIwPT09KHFxcU+I/dudXWVqampXa+3cvrcENLsbHovnHkaZg9ePNLtbtjN893IOijDes7dXvtRv8abbX7Ovf6MjlqVnFAna6ecCwsLxzNzbqd1Wxd5REwBXwDen5m3RcR32xT5ZnNzc7m8vNxqe8OwtLTE/Pz8rtebOXbX4MN0cHR2jRtX9nDqhmtGut0Nu3m+G1kHZVjPudtrP+rXeLPNz7nXn9FRq5IT6mTtlDMiOhZ5q7NWIuIZwKeBj2fmbc3kMxFxoJl/ADi7m9CSpMFoc9ZKADcDJzPzg5tm3QFc39y+HvjM4ONJkrpp8574SuCNwEpEPNhMew9wA/DJiHgL8DDwm0NJKEnqqGuRZ+YXgdhh9lWDjSNJ2i0/2SlJxVnkklScRS5JxVnkklScRS5JxVnkklScRS5JxVnkklScRS5JxVnkklTc4K4/Kqknmy+he3R2jcMjuqTuuC6TrMFzj1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4L2OriTIzpEu4jvLysNKouUcuScVZ5JJUnEUuScVZ5JJUXNcij4iPRsTZiDixadp7I+J0RDzYfL16uDElSTtps0f+MeDqbaZ/KDNf2nzdPdhYkqS2uhZ5Zt4PfGcEWSRJPYjM7L5QxAxwZ2Ze1tx/L3AYeBJYBo5m5hM7rHsEOAIwPT19aHFxcRC5e7K6usrU1NSu11s5fW4IaXY2vRfOPA2zBy8e6XY37Ob5bmSddOY8Xz8/X73+Lo1Dlaydci4sLBzPzLmd1u21yKeBx4EE3gccyMw3d3ucubm5XF5e7rq9YVlaWmJ+fn7X6w3rQyo7OTq7xo0rezh1wzUj3e6G3TzfjayTzpzn6+fnq9ffpXGokrVTzojoWOQ9nbWSmWcy8weZ+UPgw8DlvTyOJKl/PRV5RBzYdPd1wImdlpUkDVfX93ARcSswD+yPiEeAPwHmI+KlrB9aOQW8bXgRJUmddC3yzLxum8k3DyGLJKkHfrJTkoqb/H/j/xgb9dkykmpyj1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySirPIJak4i1ySiuta5BHx0Yg4GxEnNk17TkTcExFfbb4/e7gxJUk7abNH/jHg6i3TjgH3ZuaLgHub+5KkMeha5Jl5P/CdLZOvBW5pbt8CvHawsSRJbUVmdl8oYga4MzMva+5/NzMv2TT/iczc9vBKRBwBjgBMT08fWlxcHEDs3qyurjI1NbXr9VZOnxtCmp1N74UzT490kz2rktWc55s9eHHP6/b6uzQOVbJ2yrmwsHA8M+d2WnfP0FI1MvMm4CaAubm5nJ+fH/Ymd7S0tEQv2z987K7Bh+ng6OwaN64M/aUZiCpZzXm+U7813/O6vf4ujU
OVrP3k7PWslTMRcQCg+X62x8eRJPWp1yK/A7i+uX098JnBxJEk7Vab0w9vBf4ZuDQiHomItwA3AK+MiK8Cr2zuS5LGoOvBuMy8bodZVw04iySpB36yU5KKs8glqbjJPx+rMTOAUwCPzq6N/FRCSRo298glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKK2zPuAJLGY+bYXT2ve3R2jcN9rH/qhmt6Xrcf/Tznfg3zObtHLknFWeSSVJxFLknF9XWMPCJOAd8DfgCsZebcIEJJktobxD87FzLz8QE8jiSpBx5akaTiIjN7XzniG8ATQAJ/nZk3bbPMEeAIwPT09KHFxcWetrVy+lzPOTdM74UzT/f9MENXJSfUyWrOwaqSEyYn6+zBizvOX11dZWpqatt5CwsLxzsduu63yJ+XmY9GxM8A9wDvzMz7d1p+bm4ul5eXe9rWIM7/PDq7xo0rk3/qfJWcUCerOQerSk6YnKzdziNfWlpifn5+23kR0bHI+zq0kpmPNt/PArcDl/fzeJKk3eu5yCNiX0Q8a+M28CrgxKCCSZLa6ef9xjRwe0RsPM4nMvNzA0klSWqt5yLPzK8DLxlgFklSDzz9UJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqTiLXJKKs8glqbi+ijwiro6Ir0TE1yLi2KBCSZLa67nII+Ii4C+AXwdeDFwXES8eVDBJUjv97JFfDnwtM7+emd8HFoFrBxNLktRWZGZvK0a8Hrg6M9/a3H8j8MuZ+Y4tyx0BjjR3LwW+0nvcvu0HHh/j9tuqkhPqZDXnYFXJCXWydsr5c5n53J1W3NPHRmObaef9VcjMm4Cb+tjOwETEcmbOjTtHN1VyQp2s5hysKjmhTtZ+cvZzaOUR4AWb7j8feLSPx5Mk9aCfIv9X4EUR8cKI+EngDcAdg4klSWqr50MrmbkWEe8A/hG4CPhoZj40sGTDMRGHeFqokhPqZDXnYFXJCXWy9pyz5392SpImg5/slKTiLHJJKu6CLfKIuCgivhwRd24zbz4izkXEg83XH48p46mIWGkyLG8zPyLiz5pLIPx7RLxsQnNOxHg2WS6JiE9FxH9GxMmIePmW+ZMypt1yjn1MI+LSTdt/MCKejIjf37LMpIxnm6xjH9Mmxx9ExEMRcSIibo2In9oyf/djmpkX5BfwbuATwJ3bzJvfbvoYMp4C9neY/2rgs6yfs38F8KUJzTkR49lkuQV4a3P7J4FLJnRMu+WcmDFt8lwEfIv1D6ZM3Hi2zDr2MQUOAt8A9jb3Pwkc7ndML8g98oh4PnAN8JFxZ+nTtcDf5rp/AS6JiAPjDjWpIuKngVcANwNk5vcz87tbFhv7mLbMOWmuAv4rM/97y/Sxj+c2dso6KfYAeyNiD/BMzv/8za7H9IIscuBPgT8EfthhmZdHxL9FxGcj4hdHE+s8CXw+Io43lzLY6iDwzU33H2mmjVq3nDAZ4/nzwLeBv2kOq30kIvZtWWYSxrRNTpiMMd3wBuDWbaZPwnhutVNWGPOYZuZp4APAw8BjwLnM/PyWxXY9phdckUfEbwBnM/N4h8UeYP1t10uAPwf+YRTZtnFlZr6M9StIvj0iXrFlfqvLIIxAt5yTMp57gJcBf5mZvwQ8BWy9vPIkjGmbnJMypsT6B/5eA/z9drO3mTa2c5q7ZB37mEbEs1nf434h8DxgX0T89tbFtlm145hecEUOXAm8JiJOsX5Fxl+NiL/bvEBmPpmZq83tu4FnRMT+UQfNzEeb72eB21m/ou
RmE3EZhG45J2U8WR+vRzLzS839T7FemFuXGfeYds05QWMK63/AH8jMM9vMm4Tx3GzHrBMypr8GfCMzv52Z/wfcBvzKlmV2PaYXXJFn5h9l5vMzc4b1t1j/lJk/8hcvIn42IqK5fTnr4/A/o8wZEfsi4lkbt4FXASe2LHYH8DvNf7GvYP1t2GOTlnMSxhMgM78FfDMiLm0mXQX8x5bFxj6mbXJOypg2rmPnQxVjH88tdsw6IWP6MHBFRDyzyXIVcHLLMrse036uflhKRPwuQGb+FfB64PciYg14GnhDNv8uHqFp4Pbm52oP8InM/NyWnHez/h/srwH/C7xpxBnb5pyE8dzwTuDjzVvsrwNvmsAxbZNzIsY0Ip4JvBJ426ZpkziebbKOfUwz80sR8SnWD/OsAV8Gbup3TP2IviQVd8EdWpGkHzcWuSQVZ5FLUnEWuSQVZ5FLUnEWuSQVZ5FLUnH/DyqIn4V4CU/LAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "iris.sepal_length.hist()" ] }, { "cell_type": "markdown", "metadata": { "id": "E0xJWqottXhG" }, "source": [ "Let's say you want to configure that histogram differently, or get axis lables, etc. Use the help menu to see how to do that:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "glv7bjcAtXhI" }, "outputs": [ { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m\n", "\u001b[0miris\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msepal_length\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mhist\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mby\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0max\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mgrid\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'bool'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mxlabelsize\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'int | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mxrot\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'float | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mylabelsize\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'int | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0myrot\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'float | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m 
\u001b[0mfigsize\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'tuple[int, int] | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mbins\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'int | Sequence[int]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mbackend\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mlegend\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'bool'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Draw histogram of the input series using matplotlib.\n", "\n", "Parameters\n", "----------\n", "by : object, optional\n", " If passed, then used to form histograms for separate groups.\n", "ax : matplotlib axis object\n", " If not passed, uses gca().\n", "grid : bool, default True\n", " Whether to show axis grid lines.\n", "xlabelsize : int, default None\n", " If specified changes the x-axis label size.\n", "xrot : float, default None\n", " Rotation of x axis labels.\n", "ylabelsize : int, default None\n", " If specified changes the y-axis label size.\n", "yrot : float, default None\n", " Rotation of y axis labels.\n", "figsize : tuple, default None\n", " Figure size in inches by default.\n", "bins : int or sequence, default 10\n", " Number of histogram bins to be used. If an integer is given, bins + 1\n", " bin edges are calculated and returned. 
If bins is a sequence, gives\n", " bin edges, including left edge of first bin and right edge of last\n", " bin. In this case, bins is returned unmodified.\n", "backend : str, default None\n", " Backend to use instead of the backend specified in the option\n", " ``plotting.backend``. For instance, 'matplotlib'. Alternatively, to\n", " specify the ``plotting.backend`` for the whole session, set\n", " ``pd.options.plotting.backend``.\n", "\n", " .. versionadded:: 1.0.0\n", "\n", "legend : bool, default False\n", " Whether to show the legend.\n", "\n", " .. versionadded:: 1.1.0\n", "\n", "**kwargs\n", " To be passed to the actual plotting function.\n", "\n", "Returns\n", "-------\n", "matplotlib.AxesSubplot\n", " A histogram plot.\n", "\n", "See Also\n", "--------\n", "matplotlib.axes.Axes.hist : Plot a histogram using matplotlib.\n", "File: ~/opt/anaconda3/envs/dj21/lib/python3.7/site-packages/pandas/plotting/_core.py\n", "Type: method\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "iris.sepal_length.hist?\n" ] }, { "cell_type": "markdown", "metadata": { "id": "l-mU_ShXtXhU" }, "source": [ "## Downloading data\n", "\n", "If you really want to analyze a dataset, you will probably want to store it locally (that is, on your computer). Let's download a file with some stock price data: https://raw.githubusercontent.com/damian0604/bdaca/master/ipynb/stock.csv\n", "\n", "Download it (via File > Save As, or by right-clicking) as \"all file types\" (or .csv); be sure that the extension is correct. Be sure to save it IN THE SAME FOLDER as this Jupyter notebook. (Otherwise Jupyter won't find it.)" ] }, { "cell_type": "markdown", "metadata": { "id": "ybUlt435tXhU" }, "source": [ "## Note!!! Not all CSV files are the same...\n", "\n", "CSV stands for Comma-Separated Values, which indicates that the file consists of values (columns) separated by commas. 
Just open a CSV file in an editor like Notepad or TextEdit instead of in Excel to understand what we mean.\n", "\n", "Unfortunately, there are many different dialects. For instance, sometimes a semicolon or a tab is used instead of a comma, and sometimes the first line of a CSV file contains column headers and sometimes it does not. You can specify these kinds of details yourself (for example, with the `sep` and `header` arguments of `pd.read_csv`) if pandas doesn't guess correctly.\n", "\n", "Pay special attention when opening a CSV file with Excel: Excel changes the formatting! For instance, it can happen that you open a file that uses commas as separators in Excel, and when you save it, it suddenly uses semicolons instead.\n", "\n", "Another reason not to open your files in Excel first: Excel often creates a strange 'encoding' of the characters that causes problems here. This is why we work just with the raw .csv file if possible. If you are getting an encoding error, the first step is to re-download the data and NOT open it in Excel (even by mistake, by double-clicking on it).\n", "\n", "\n", "We can then open it in the same way as we did before by providing its filename:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "zLFfRsU8tXhW" }, "outputs": [], "source": [ "# stockdata = pd.read_csv('stock.csv') # if you downloaded and saved it locally\n", "stockdata = pd.read_csv('https://raw.githubusercontent.com/damian0604/bdaca/master/ipynb/stock.csv') # when reading directly from source (online)" ] }, { "cell_type": "markdown", "metadata": { "id": "YGIPns1ftXhe" }, "source": [ "Let's have a look..." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "OLIdGg-rtXhg" }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>Date</th><th>Open</th><th>High</th><th>Low</th><th>Close</th><th>Volume</th><th>Adj Close</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>2016-01-01</td><td>21.095</td><td>21.095</td><td>21.095</td><td>21.095</td><td>0</td><td>20.6742</td></tr>\n",
"<tr><th>1</th><td>2015-12-31</td><td>21.135</td><td>21.140</td><td>20.915</td><td>21.095</td><td>1680700</td><td>20.6742</td></tr>\n",
"<tr><th>2</th><td>2015-12-30</td><td>21.330</td><td>21.375</td><td>21.110</td><td>21.185</td><td>5476800</td><td>20.7624</td></tr>\n",
"<tr><th>3</th><td>2015-12-29</td><td>21.000</td><td>21.410</td><td>20.995</td><td>21.325</td><td>6261600</td><td>20.8996</td></tr>\n",
"<tr><th>4</th><td>2015-12-28</td><td>21.440</td><td>21.450</td><td>20.705</td><td>20.880</td><td>5168400</td><td>20.4635</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>518</th><td>2014-01-07</td><td>25.810</td><td>26.065</td><td>25.720</td><td>26.030</td><td>5330200</td><td>22.9539</td></tr>\n",
"<tr><th>519</th><td>2014-01-06</td><td>26.015</td><td>26.030</td><td>25.790</td><td>25.810</td><td>4775300</td><td>22.7599</td></tr>\n",
"<tr><th>520</th><td>2014-01-03</td><td>26.000</td><td>26.195</td><td>25.875</td><td>26.065</td><td>4298100</td><td>22.9848</td></tr>\n",
"<tr><th>521</th><td>2014-01-02</td><td>25.985</td><td>26.130</td><td>25.855</td><td>25.920</td><td>5113500</td><td>22.8569</td></tr>\n",
"<tr><th>522</th><td>2014-01-01</td><td>25.905</td><td>25.905</td><td>25.905</td><td>25.905</td><td>0</td><td>22.8437</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>523 rows × 7 columns</p>
" ], "text/plain": [ " Date Open High Low Close Volume Adj Close\n", "0 2016-01-01 21.095 21.095 21.095 21.095 0 20.6742\n", "1 2015-12-31 21.135 21.140 20.915 21.095 1680700 20.6742\n", "2 2015-12-30 21.330 21.375 21.110 21.185 5476800 20.7624\n", "3 2015-12-29 21.000 21.410 20.995 21.325 6261600 20.8996\n", "4 2015-12-28 21.440 21.450 20.705 20.880 5168400 20.4635\n", ".. ... ... ... ... ... ... ...\n", "518 2014-01-07 25.810 26.065 25.720 26.030 5330200 22.9539\n", "519 2014-01-06 26.015 26.030 25.790 25.810 4775300 22.7599\n", "520 2014-01-03 26.000 26.195 25.875 26.065 4298100 22.9848\n", "521 2014-01-02 25.985 26.130 25.855 25.920 5113500 22.8569\n", "522 2014-01-01 25.905 25.905 25.905 25.905 0 22.8437\n", "\n", "[523 rows x 7 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stockdata" ] }, { "cell_type": "markdown", "metadata": { "id": "fbEIwmyGtXhm" }, "source": [ "The lefthand column here--called the index--gives you numbers in this case; these are simply the case numbers for each 'row' in the dataset; they may or may not have any meaning on their own, depending on the dataset. You can also - later in this notebook and later in subsequent weeks - learn how to change these numbers or assign a different column to be the index." ] }, { "cell_type": "markdown", "metadata": { "id": "Uy7hCeWWtXho" }, "source": [ "Because this data seems to be ordered by date in some way, it might be interesting to explore it by making a plot. In this case the plot is different than the histogram; it's not about frequencies of specific values, but rather a plot of all the cases at their value of 'low'.\n", "\n", "We are using a method here called 'plot', provided by pandas." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "9PMOtl_jtXhp" }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAABEVklEQVR4nO2deZicZZmv77f23vctSWffEwKBAAFkVxFQcddREccF98HR8RzUMx4dz6jjwoxeM+PIyMyIwKCO4IKgIkIEgYSQHZKQfe1O72vtVe/541vqq+7q7uru6qWqn/u6cqX6q6+q3q9T+dVTv/dZlNYaQRAEIf9wzfQCBEEQhIkhAi4IgpCniIALgiDkKSLggiAIeYoIuCAIQp7imc4Xq62t1YsXL57OlxQEQch7XnzxxQ6tdd3Q49Mq4IsXL2b79u3T+ZKCIAh5j1LqRKbjYqEIgiDkKSLggiAIeYoIuCAIQp4iAi4IgpCniIALgiDkKSLggiAIeYoIuCAIQp4iAi4IBcIzhzo43NY/08sQphERcEEoAJ490sF779nKq+/6E//w2wPjfvzxjsEpWJUw1YiAC0IecaR9gJu++zStveG04ztOdNu3v//UET794E6yHdbymz0tXPPtp3jyYFtO1ypMPSLggpBHfPWRl3m5pY+nD7WnHe8cjKb9/ItdZ3m5pS+r59x1yhD/g6397DzZzU3ffZquwSgHW/v521/sI5GUqV2zFRFwQcgjdp3qAWBocN01GGVeRYBvvnUD275wPS4Fv3vp3JjP98DWkxwz7ROPS/GZn+7m5ZY+njzQxofufYEfP3+C093BXF+GkCOmtZmVIAiTYyAcB4ZH3F2DURoqArzj4mYAVjWWs+d0z6jP1d4f4QsP7007dqYnBMAje87SOWC8RsdAhEU1JblYvpBjJAIXhDxhIBInbtoZ3cF0Ae8YiFJT4rN/XtNYxv4xLJQDren3P7D1JNF4kuoSH08ebCcYTQBwri+Si+ULU4AIuCDkCc6NSys6tugajFDtFPCmcs71RegaEqk7GSrwR00r5Z//YmPa8XN96Rum4ViCYx2DaZuknQMRIvFEllci5AoRcEGYxSSTmpfO9gLpAv70oXbCMUMwtdZ0DUapLvHb9y9vKAWMrJVMaK35w/7MWSebFlen/Tw0Av+r/97Jtd9+inufM1pUn+wMctH/+wPf/O3B8VyakANEwAVhFvPEgTZu/t4zHDrXz3vv2Wofb+uP8NPtpwBo6Q0TS2gWVBXZ9y8xPetjI+R3H+0YZNuxrmHH77ltEz5PuiycNX1xgD8f7uD3Lxubo9uOG49/YNtJAH7/cuu4r0+YHGMKuFIqoJTappTarZR6SSn1FfN4tVLqcaXUIfPvqqlfriDMLawMkD8f7gDA7VJ87c3nAUbkC3C4zYiyl9WV2o9bUFWEx6U40ZlZwNvMqPqK5TVpx9fNq0j7ecOCirTN0G///iDzK4u4dlUdL52xvhkYAn+qK8Qvdp4Z/0UKEyabCDwCXKe1Ph+4AHidUmozcCfwhNZ6BfCE+bMgCDnE8rqfPdIJwL++50LefelCmquLaB+I8NyRTr7+mFF5ubw+JeAet4sFVUUc7whyuK2fj9//Ij9/8bR9f2/IeN63X9TMhQsr7eO1pSkfHeDm85o43hmkrT9s5on38IFXLeGiRVUc7wxypH0gzWL59E925fT6hdEZU8C1gWWkec0/GrgF+JF5/EfAm6ZigYIwl+kcNMTxOVPAG8sDADSVF9HSG+Yz
P93F/pY+akv9w8R3ZUMZu0718I4fPM+je1v52Yun7Pt6QzEALllSzUMfv8I+7nEbkvDIp17Fd991AVetNObo/mz7abt46A0bmnjrRQvwe1xc/50tbD3WyZUragFYWF2c89+BMDJZeeBKKbdSahfQBjyutd4KNGitWwDMv+tHeOztSqntSqnt7e3tmU4RBGEEOswIvD9i5H83VpgCXhngYGs/Lb1hblzfyE8+shmlVNpjL1lSzZmekJ2J4sxI6QkaAl5R5AXgokVVad73+vkV3HLBfNY0lXPVyjoe2HqSPad7mVcRoL48QFNFEf/w1g0AJLVh39y6eRH94dhU/BqEEchKwLXWCa31BcAC4BKl1PpsX0BrfbfWepPWelNdXd0ElykIc5POgZQ94VJQW2pkmjRWBOwo+r2bF6X53xaXLkn525sWVaVlsfSEYnjdimKfG4D/+ehlHPi712Vcw2vW1HOmJ8RTB9tYPz/lkd90XpN9u6E8QFWJj55QTErvp5FxZaForXuAp4DXAeeUUk0A5t/SCUcQcoyz4rLI68btMqLsJtNKAVjRMFy8AdbPL7dvX768lr5wnJBZnNMTjFFR5LOjdqUULpfK+DyXLTM+CPrCcTYsSAm4M2Jf3VhGdbEXrVP2jDD1ZJOFUqeUqjRvFwGvBg4AvwJuM0+7DfjlFK1REOYsPcEY8yuN9MCljii7sSKVMljryP92opTiW2/bwNsvWsAi05tu7QvTORChYyBCZbE3qzUsqyulrsx4DWcEDrDR3AC9amUdVWYh0WjFQ0JuyaYXShPwI6WUG0Pwf6q1fkQp9RzwU6XUB4GTwNuncJ2CMCcJRuO88fx5HOsY5MtvXGsfb6pIReAjRc4Ab9/UzNs3NfPsESMN8VRXkPf9xzbA8L2zQSnF5ctq+OWus5w3RMB//MFLiSeSuF3KrgQ90NrHoppivG4pM5lqxhRwrfUeYGOG453A9VOxKEEQIJZIEkto6sv8fPVN6dtOTZWBER6VmTWNhp2y82SPfeyWC+Zl/fjbr1rK2qZyakrTo/1Sf0pCrCj9kw/s5MNXLuGLN69FmFrkI1IQZilWM6kic6PRyUi2yUhUlfhYUFXEH82hDR+5einvu2xx1o9fN6+Cj1y9bNRzVtaX2bcf3StVmdOBCLggzFJCowi4y6X46NXL+I/3b8r6+c5vrmS32U9889Ka0U+eAC5HlehgNJ71RCBh4oiAC8IsJRg1cr+LMwg4wJ03rua61Q1ZP98N6xrt2/Vl44vgs+Xdly7kK29cR08wZuewC1OHCLggzFJsC8Wbm7krr16TqrVrKB+fhz4eLC+8Y0D6iE81IuCCMEsJme1iR4rAx0uxz8ON640ovLrYN8bZE8cqNhras1zIPTJSTRBmKVYEnisBB/jnd1/IQDg+aurhZKkxe7JIBD71SAQuCLOUkOmBZ9rEnChul6IiywKeiWJF4CLgU48IuJBTugajdvaEMDlSEXh+fVEuD3jwuV2yiTkNiIALOeXCrz7OG/75mZleRkFgtZDNpYUyHSilqCn10dYfHvtkYVKIgAs5I2l2oTvcNmDfFibGy2f7+Jk5gCGXFsp0saqxjH3mxJ5M9IdjfPTHL3KgtW/Ec4SxEQEXckZXMPWV+eC5/hlcSf5jzbsEKPbmn4BfvLiaV84N8OjeFqLx5LD7/+G3B/jtS608vENGsE0GEXAhZ5zrS31lvvG7T/PKFIl4fzjGZ366q6C73hlTdnz8/ZvX21Ny8omrV9bhdSs+fv8O/ufF0/x691l+u88orz/Q2scDW41ByEmp1pwU+ffOEGYl/eEYN38v3fv+b3Naea55cNspHtpxhn/bcsQ+tvtUD08dLJyW9C29YS5fVst7Ll0000uZEOvnV7DzS6+lqtjL7lM9fO+JQ3zt0f0A/HZfKxqjEVZbfypT5UBrH7tP9TBoTh8Sxia/treFWcuD204NO5bLar++cAyPS1Hs85AwozZr8supriC3/MufATj+
jZtz9pozRTKpaekNpU28yUdK/R7WzavgpZZeTneHCMUSnOgc5E+vtLNhQSUel6LNHIgcjiV43T89DcA7Ni3gm287fyaXnjdIBC7khK3HugBYUV/Knz53LUDOIqk9p3vY8OXf86Efbc94/5OOyLsQGii1D0SIJTTzxtkydjaybl45+8702VWlj+xpYdepHq5eUUt9md/OVGl3ROLbT3TPyFrzERFwYdL0hWMcaR/g2lV1PP6Zq1lYU0x5wEN/ODcC/qyZTvfskU6i8aSdZ275p1sOpoZlRxwbZk8faudYx2BO1jCdvOVfnwWgyTF1J19ZVFOS9vO3fneQpIarV9UZAt4XQWtNu1n0s2FBBcc7BqWWIEtEwIVJc9FXH+dYxyBNlSnBKQt4cybg+1tSqWYHW/vpNrNd7n/+JMc6Bnn6UAceszTcKn5p6Q1x6z3buM2cPpNPnOkJAXD5sty3fJ1umqtT7wnreuZXFnH+gkrWziunPxLnv7ed4tnDxsSgq1bUkdRwqE2ymLJBPHBh0sQSRiTsHLRbFvDQH87NcNv9LX2sn1/O/pZ+vvvEIUIx44Mhmkhy7befAuBdFzfz4AunCEbjVJf4+M2eFgBOdgVzsobpIp5IohR86roVlPjz/7/nQnMWJ8Dd79vED7Yc4abzmvC4XVy7yuiO+IWH99rnXLioEoDT3SE2LKiczqXmJRKBCzkjHE997S31exjIkQfe1h/hwoVVfP7G1fxh/zn+fLgz7X6Xgk2Lq4HUEARn9J9PFYFdwShaQ13p1HULnE7mOb6Vlfo9fPa1q1jTZIx3qy8P8OU3rOWDr1pin2MNTT5rfgsRRkcEXBgXP9t+Ki3f21mk8a6LF9q3SwO5EXCtNYOROKV+Dx+6cimPfOpVw85ZWF1MldmgybJQYonUuqyS9Hygo9+wh2pLp2bgwnTjdbv4369bzc8+elnG+99/xRL+9vWp2Zl1pX6KvG5aesO09Yf59u8OEk8MLwQSDPL/O5owbZzrC/O5/9nDRYuq+PnHLgegJ2QIzldvWUez4+tyWcDLic6x7YvDbQO090e4bAS/NxI3BvtadsL6+RXs/NvXEEsm6Q/Hee0//okVDWV2ublTwANeF8U+D08dbOeWC+ZP/MKz4JE9Z7lyed2kO/1ZHfxqp2hizkzwsWtGn6UJ8MCHLuVI+wBKKZoqA7T0hvj6owd4eOcZLlpUxbWr68d8jrmICLiQNVaqV18o5W33BI3blUMGBJT6s8tCefVdW4CR87etVMSyQOqtWlVivFZ9GXz5jetY1VCGz2N8mbQEMJbQ+D1uNjZXpm2CTgVne0J88oGdXLG8hvs/tHnY+sfjZdsCXiAReLZcvryWy5fXAsYm557TvSyvLwXgeGf+ZRJNF2KhCFnT2mtYJxVFXn6w5Qgf+K8XuO/5EwBUDRHwqmIvPcFompUxlIjDMx9pw9OyYUpGaKl66+ZFXLKk2u7Y96n/3knHQIRoIonX7aKhIpCWYzwVWFkjW492pR1/7kgn6/7v7+wMi2ywPhCrprhn92zmw1cu5UxPiKfM9NC9ozTFmuuIgAtZ09KXEvCvP3aAPx5o497nDAGvHCI4KxpKiSc1J0aJnpzd6kayW6wovjQwehRb5Gj41D0YJRZP4nMr6sv8dA6O/kEyWc50GwIeT+q0QqLnjnam/Z0Ycn8mUoOM5+6X46tW1vG+zakWAs49FyEdEXAha1rMSDNTe9MFVelFJyvqywA42Dow4vMdbkvdd3SEghvLQikdw4Zw9swOxRLEEkm8Hhf1ZUZq41RG4ae7Ux8+VsWhE2t42bXffsq2jEZiMJrA61a2JTRX+ewNq1hSaxQBZepmKBjM7XeJkDXBaJxf7joLkFYl94lrl/HHz149zANfXl+KSzFqR8Ij7SnRPpUhX/uLD+/lnXc/D2Qj4Kn7ByJxYgltWCjlhpfcNoUC7vz20Bdy+P5WtK0MCT/ZFUy75kwEI/E5HX1blAe8/PGzV3Ptqrq0
6lohHXmnCFmx53Sv7fWeMiPOr96yjlsvW5zx/IDXzaKaktEFvG2A1Y1lnOsL09I7PO/3/q2pboZjbQQGvKlYZDCSsD1wKwKfqq/hnQMRfrO3BbdLkUhqekMxGiuM17QyYkLReNal4cFogpI8HOAwFSil8HvcRGIi4CMxZgSulGpWSj2plNqvlHpJKXWHefwCpdTzSqldSqntSqlLpn65wkxhba7VlPjs/iJjdRtc2VA6ooBrrdl7ppeVDWU0VhTR0pMusEMbYZWN4YErpeyCkMFInFjC8MDnm9bO6e6pKQw52NpPMJrgw1cuBYy+MBbWgIvuYCzrTIpgNEFxAVRg5gq/15W22S2kk42FEgc+q7VeA2wGPqGUWgt8E/iK1voC4Evmz0KB0mOK0bzKIrt0vm6MXOWVDWUc7wwSiSf4w8vn7OcAONI+QFt/hMuX1dBUEaClN13Ah2YeZJOKd/tVhogOmALudbuoKvZSFvCMupnqpC8cG1cLAMvztj4oes0Pup5glIfMaTM9wWiazTLahupgNC4RuAO/xyUWyiiMKeBa6xat9Q7zdj+wH5gPaKDcPK0CODtVixRmnh4z97upwtnvZPRUt+X1pSSSmr2ne/nQvdu5/d4X7fteNFuGXrrUEvD0CHn3qR4Avv+eC/nEtcuyEjVL5AcjcWJxwwNXSrG4poTjWRQV9QSjbP7aE2z+2hN0DmTnmYfNr/cN5oeZFYE/4Bhm0R2MpX2jcObRDyUYSeTlDMypwudxySbmKIxrE1MptRjYCGwFPg18Syl1Cvg28PkRHnO7abFsb29vz3SKkAd0B6P43C5qHAUmY20sWpWZ1nxMZ1Td2msI5PzKIpoqAnQHY4QdGRy7T/fQXF3Ejec18bkbVqOUYiys2ZGDkbjhgZuZHItrSzieRVvZLa+0E4wmGIwm7PTIsbDWXG/aSb2mODuzBbuD0bQ+MT2jCLgRgYuFYuH3uCUCH4WsBVwpVQr8HPi01roP+Bjw11rrZuCvgXsyPU5rfbfWepPWelNdXV0u1izMAL3BGJXF3rRIuNg/eqTYXGUIuFUJ6Uyxa+sPU13iw+dI9etwRL1H2gZZ1VDOeHC5FCU+NwORhO2BG+so4mxPiGRy9BzsZw93UlnsZUltSdbtTC1htuwkS8BD0QQuBe/dvJCeYCxtE7N3FAEPiQeehmGhiAc+ElkJuFLKiyHe92utHzIP3wZYt38GyCZmAdMdjFJV7EvLtx4rUqwt9VHkdXOgZbgYtvVHqDdFr7bMSEF05mr3hmITqkYs8XvsTUyPy3h7N1UEiCc1HYOj2yKtfWEWVRfTUO7POm/cslBKfR5K/R5bnK1IuqrYR08wmibgbaNkxIgHno7f4yaW0Pb4PCGdbLJQFEZ0vV9rfZfjrrPA1ebt64BDuV+eMFMMRuL825YjdvTTHYxRUexNiw7drtFtDaUUC6qKONiaEvDBSJzBSJy2/ogdtdaVDi+26Q3FqCgav4CX+j0MRM08cNNCaTQn27T2jp5K2BOMUlnso64s+/J7y0IJ+Fw0Vxez40Q3feEYwUiCYr+bymIfSY09cSbgdfGt3x0csSJTPPB0/GZ6qPjgmckmAr8CuBW4zkwZ3KWUugn4MPAdpdRu4GvA7VO4TmGaue/5E3zjsQP82PSCz3SHaKoIjDsqri7x0e/YwLvjwV2s+7+/Y/epHrxu4+1nReAdA0aWSjSeJBRLUD4BAbci8Gg8ide0UKyN16GZLkPpNm2iutLxROAJlAKf28VFiyrZfbqXd//78wRjCYp9Hvv3dbYnTLHPzf+5eS1H2gfZ8kr7sHa7WmvxwIfgNz+ExUbJTDZZKM9orZXWeoPW+gLzz6Pm8Yu01udrrS/VWr841nMJ+YMlYM8f7aIvHONMT4hVjWXUlIyvS95QEf7D/nP27YsWVQHYz2m9ppXGN5EIvMTvduSBWxG4IeAjReC7TvXwF3c/T0tviKpiH/XlfgajiayGModjCQIeN0ope4r8
vjN9ZkWl227y1dIbIuB1c/N5Tbhdivf/5wu86+7n0p6rJxgjqVPdFgXslgKykZkZ+agXMnLAtD32t/TZFsjqxrJhJfNjUZ4h1fA7bz+f69fU22mIPo+Rr90xEOFYxyCP7jXGoZUXjf/tWer3cKYnbOeBA1QX+/C61YgR+FcfedlOa7QicDB8+iVjbCiGY0m7CvTyZbW859KF/Gr3WTuStpp8tfaGKfK6qSrxmdcaZd+Z9Da3Vrl/fQH1Ap8sfo9hJ0k1ZmakF4qQEavR1NnekN01cGVDGbXjjsBTAvjYHVdy4/pGXr22gcpiX5qHXmvaFu/+9+f51u8OAhONwK1NTG0LuMulqC31p2W5OHG+TlWxz+5D/fLZsfuIh2IJAo5OiE0VAfrDcXqCMYr9qQi8czBq+7kfv2Y5AGVDPhys0W8i4CksCyWaEAslEyLgwjDiiSRt/WEWVBWhNXzl1y/j87iYV1FE9ThnNToj8FUNZXz/vRdlFOa6MkNgnUMgMkXvY2F74IkkXk/6B8RIAu5M66ss9rJ2XjlFXjcvHO/KeL6TcCyR1srWGsRwsitIic+T9vuyzvvLKxbzlgvn4xqyCdzWZ0bgY7QomEtYAh6WCDwjIuDCMLa80k5SwyVLqtOOW3nW48HpgQ8VLCd1ZX7aByJpU8wnnIUSiRN3eOBgpDSOtDHpLHOvKPLidbtYP7+cl85mHiTQG4px43ef5kBrH+FYEn8GAQ9GjWySMr/H/p1ZkbpSikXVJfSGYmnZFWKhDMf63YoHnhkRcCGNU11BPvij7QBct7qe5mojBc8SGqsicoVpM4xF+RhNqCwsC8US7RKfm4aK8UeiJT4PkXiSpMa2UCAV4Q8lmdR0DET41HXLeeDDl3LVijp7Pd3BzAU3W15pZ39LH9974hCReCKtE6JzlmWJz23OeDR+h85IvcaMzLsd/WHa+sOU+NzjGsFW6EgWyuiIgAtp/GLnGfv24poSnvzsNQDcdF6jffyFL76aX3ziiqyeL9tUwLoyP8Fogrb+MFevrGPnl147QQslJZLetAjcT+dAdFg1phXZFfs8XL6s1v6WUFHkHbFniZ377XHbWSgW8xwfOkVmOqCVxpgm9KXpmTcAxzoGWVRTku2lzgnsLBSxUDIiAi6kYY3/Aqgv9+Nxu9j2xeu56x0X2MfryvxZR4luM2K/YnnmqfMWlqAd7wxSFvBMeCKNsz+LlQduPX88qYf1IbHK+4u86a9XUeQdseQ9Yj7G73Wbm5jpkb6F1aEwJeApoV9eX4pScM8zx+xjh84NsKIhu282c4V5ZhGW1U9Hay3RuAP5ribYaK3Zd6aXzUuruXxZrZ1OZ/UqmQibl9Xwlo3zufPG1aOeZ03OSST1mL2/R8P5weKM4J2WRbUjz9oW8CHefnmRl0g8aUTY3vT7UqLvJhxLpj3W2XRr3Tyjl4vVE8Y5dWh5fSk3rW9ix0kjfXEgEudMT4i/qG8e7yUXNI0VAdbPL+cbjx3g+08dIRRLEI0n+dEHLuHqldJbSSJwweZ0d4i+cJw3nj+fv7p+RVYdAMei1O/hrndeMGZmxfzK1EzNsbocjvV6Fs6CGMtb7xnia1s9SoaKtGX9ZLJRLG/cpRhmoThZ02gI+HvMAb2vXtOQdn9ZwGPbMdZgZLFQhvOp61awaVFV2qbvPplUD0gELjg4aUaI1jDZ6WRemoCP3/u2H+uI3p1l/1YBUm8omnZ+2BFNO7EEvzcUS/vwOdkZ5PtPHQEgGEsMy0IBePD2zew82WNH5tUlPg7//Y143OnxUsDrtj9ArKh+Mh9ehcoN6xq5YV0jdz3+Ct97wmi5lGkE31xE3i2CjbWhVl8+/WlsAa8bj0sRn6SF0ugQW2fVaOVIEfgIFopTwJ38/uXW1GOjCdNiSRfmzUtr2Lw03fMfKt5gXLOV3xyMxu1jQmY+fs0yGsr9/OjZ42M2JpsriIUi2FgCPtaotKnCyn8unYyAO7JAnF63VRFp2R9PHmhj
8Z2/sed7Do3ArfTHoQLutJWC0XhGjzxbirxuookkiaROfROQToQjEvC6ec+li1hQVczZHhFwEAEXHLQPRPB7XMNKvKeLz71uFe/c1Mx1q+sn/BzO1EFnIVBZwINS0GvmXX/1Ny8DqXL5oSJsPbZvyHzM3mAUpeD8BRX0h+PEk3qY+GdLkc+qMkwQihqR+ESfay7RVBHgdHcwbYLTXEUEXLBpN3t052LzciK8eeMC/uFtG+yUwsni7LXicikqirx2GuHRdiPytqyLoZGv5UUPRNJFosfsU17i99A1aHwYDLVQssUS61AskZbZIozOa9Y20BeO84MtR2d6KTOOCLhg0+4YspDPLKopzni8ssg7rLqyc8AS4XThdA5IdtITjFFZ5KXY53YI+MRE13pcKJoS8IBP/kuOxTWr6rlwYSV/OiQzduXdIti09IZomETO92zhsTuuZNeXXjPseGnAQzAST6vG7DRFeGjkW+xzo1QGAQ/FqCj2Uezz2GXwI6URjoUl4OFYgnBUIvDxcMmSGvac7rGzeBJJzZMH20acdFSoSBaKABg9QU53hyblP88Win0eMrUttyacD0RTotxpzskcKpxKKUp8nmFTc3rNsWvFPmNWI6TGfo2XIlvAk6kIXAQ8K9bNKyeW0Lzc0sd9z5/gaPsAu0/38p/vv5hrJ/geTiY1X3h4L0vrSrj9qmU5XvHUIBH4LOW5I50s/fxvRmyBmmvaByJE4sm0boCFhjXhfMDRstayUPwZSvdLzda0TnpCxtg1p2c+8U3MdA/c61Zpm7DCyFj7JF/59Us8vPMMu08bhT1nJ5Ef/tDOMzz4wim+8diBnKxxOpB3yyzlP/58jKSGbcfG7kmdC6wy7wUFLuDReDKt53gwauRxZ2p1W+J3D4vAnR64xaQ98FiCUHTi6YhzkVqzNcKe071pVbzO1sDjIZHUPLDVmP/anEf/B0TAZylWKl9/OHNDpWzpCUb5f4+8TPdgdNTz7t96EoAlBVzKbVkoQ3+nxSMMETZ6i6eyUJJJTV845YFbTFzAjf9+VkGQ+N/ZU+PIVFrTVM4fP3s1taU+O7tovPzw6aPsONkDQEeWA61nAyLgs5Risy3qZAsWtrzSzg+fOcaH7t0+4jlaa361+yyvWdvA4hkoo58ufB6XKeDpUfVIlZ8lQyyU/nAcrY0c8fQIfHIeeCRuWChSxJM9lUVerC9NjRV+ltaVcunSGv6w/xzXfOtJHtx2cszn2Hemly88vJdkUvPC8W4WVBXxuRtWMRhN2Jujsx0R8FnI/VtPcN/zxhvQ2cFuIlge7yvmYOJMBKMJEknNJnNKfKHi97iIxBL0RyYm4D1mH5VcWSi2B24KhkTg2WNMh7L6rRsWysXm+/d4Z5D7TDtkNO54cCcPbD3J0Y4BjrYPcN78CjuNNtu9p3ufO87133lqxrJfRMBnIV98eJ99+0jHxL4SWpwzB+WG44lhwwwsrGrDbIcv5Ct+ryvNQrEmxpeN0DzLGs9mYfVRMTYxU6I/UeGtLPIR8LrYebJn2HBkYWwSpmg2mP1vLnaMANx3po9zfaN/e7VssP969jhHOwZZXl9qt1Buy9JG+dIvX+JI+2DW5+caEfBZiLOU/ZXWfhIjCG82WINyYwlNx2DmN1lfyBCpiUzAySdSHrhxvVbO+8gRuHtIBJ4S8JIcReBvumA+v9h1hr5QTCLwcXLXOy7gwoWVXLLYEO518yr44fs2cf+HLgVg96meUR9vtUuwvu1uWlxtjxDc39KX1Ro8po8zU+1tRcBnIZUlKSENxRKc6Jx4FO6MQkbq4JaKwAu7LMBKIzzRGaSiyGt3XRypeZY1sMGixyzcqSjypfnVlcUT/+C7ckUdkXiS3ad7xQMfJ69b38hDH7+ChY7K21evbeDChVW41Nii2uNoLXzTeY1cvbKOZXWlLKsr4ZE9Z7Nag/Vvv1cEXLAYGokdahuY8HOd6wvbUcVIG6LW0IK5EIHHEpr9LX2saiyzW8+O
dN1GxJ6w/c1eRwTuzEKZTO72efMr7NsSgeeGIp+b5fWl7DubOYp+aMdpfruvxf52CikfXSnFZctqODDKnpFFMqntb69PH+qwjwejcWKJ6ZnhOeY7TynVrJR6Uim1Xyn1klLqDsd9n1JKHTSPf3Nqlzp36AnGOH9BBb//66sAaBvBy2vvj3BwjDdabyjOhvmVABxuS537w6ePsvjO3xCOJeaUBw7wcksfqxvL7CESI1VSBrwukhripoXVPWj8niqK0i2UydBcXWRbZuKB54718ytGjIo/89PdfPS+HXYbBUjvI19fFqAnGBtz9mbnYJRoIklVsZcdJ7t59nAHv93XwoVffZz3/HBrbi5kDLIJHeLAZ7XWa4DNwCeUUmuVUtcCtwAbtNbrgG9P4TrnDFpruoNRLltWy7K6UlwqfXK5xb88eZiL//4P3PBPfxo15WkwEqepIsDSuhJ2OTzBfzM7uXUMRBweeOFbKADReJJVjWX2xJ6BIWmFqfNTvUoAugYjlAc8eN2unNkdSinbMiuSRlY547z5FbT3R4ZtZDo38hNJbWcTWWm7kOqHb2VwjYQ1Fegdm5rRGt79w6189L4dhGNJth3rGrP2IheM+Y7RWrdorXeYt/uB/cB84GPAN7TWEfO+tqlc6FxhIBInltBUl3hxuxTVJX6+98fD/MuTh9NSlb71u4P27cf2tWR8rkRSE4olKPF72Nhcxa5TqYjE2nzpHIjaFkrZHLBQLFY3llFRbPX8HkHAzcg8Ys5h7ArG7CERIxX/TAQrHU4slNyx3rSmXh6yGTl0I/8vr1gMpPeOtwaLZAqcnFiW5GvXNWZsxbDt+NRXUY/rI18ptRjYCGwFVgJXKqW2KqW2KKUuHuExtyultiultre3S/vHsbBS1awJMlYWxLd+d5AOR0RQVexlUU0xzdVF/OSFUxmfy+p1Xer3sKS2mI6BiB1NetymgA9GaB+IUOr34MvwJiwknP/JVjaU2WPP3rFpwajn2wI+GHEIeO7E1nouEfDcYdljQzfureHRAN982wY+/eqV3PWO87lpfZN93IrAx0oNtCLwRTXFrGwoG3b/3tNTv7GZ9f9YpVQp8HPg01rrPoxOhlUYtsrngJ+qDJMAtNZ3a603aa031dXV5WjZhYvVY9oSipBj6oj1dTCZ1PSGYrxhwzzesnEBW491ZSy5HzTLwEv8Hrv5j+X7WcMOfrHzLPc+d4K1TeVTdEWzB+sDan5lEWUBL00VRRz/xs1cuSLz+9LypCPmv0HnQJTqEuP3aIn7Wy6cP+l1WfMyA5KFkjPsfO6+dBE+02OI7mN3XMk7NjXjdbt4y4UL0nrhWAL+uf/ZTXyUzcjW3jA+j4uaEh8l/vR/u2V1JbxgRuC9wRgfu+9FjrRPPBlhJLIScKWUF0O879daP2QePg08pA22AUmgNucrnGN0malqVaaAX+IoTmgzi3J6QzGS2hD5+VVGpDF0WC9gF6GU+N22gFt9HiwB/9VuI12qoSL/+4CPhSW6qxqHR0ujnW+lEnYNRqkx/12UUuz98mv55ls3THpdbjPukQg8d/g8LqpLfHYhm8XB1n5cChaP0vOnrtTP6sYyeoIxu0eQE601yaTmbG+YpooASin73+7KFbUc/dpNvOVCI7D6+qP7Of/vfs9j+1pH3GuZDNlkoSjgHmC/1voux12/AK4zz1kJ+ICOYU8gjAtr46PatFD+6y8v5teffBUArb2G+FpRdE2pz562PnT4LqTsl1K/h9ohJcKRWHpk8baLMtsIhYTf/E+WvYCnepVYm8tVjkHJZQFvxmnz48VlPoUIeG6pL/PT1hehNxTj64/tJxxL8NLZPpbXl466Ce1xu3jsjivZsKCCh3aeGXb/vzx5mKVfeJQDLX12+2Xr21rA68blUtx2+WLcLsUP/mQkC3zp9Ws5v7ky59eYzbvvCuBW4Dql1C7zz03AfwBLlVL7gAeB2/RcG4cxBVgWSpVjs2x1UxlKpSwUp81SaQp9
pgh80I7APXb7TUvAnYL/kauWcvXKwre3rIh6dbYC7tjE7AsZm8s1JRkmRUwSlxWBi4WSU+rLjeHH//j4K/xgy1F+vuM0L53tZd28ijEfq5Ti+tUN7DndQ+eQvijfe+IwYNRnWM9V5BBwMIImSw6/cNNqPvCqJTm7LidjbqVrrZ8BRppy+97cLkfoDkZxu1RaSp/X7aKmxG9bKF3mTnpVsc8uIskUgQ84I3DTQnl0byvL60vTenw0zQH7BOCC5ko+evUyrl/TkNX5zjTCIx2GfzkV3RotO8s/wdFsQmYW1xTzp1fa7aKcM90hzvVFWNFQmtXjL11ajf6Dkcni3CeJOnzxdfOMvSNr/yLg2Ch/z6WL+PHzJ3jXJQsnfS0jUdiJv3lI12CMqmLfsMnw5UUeu4dHl1lQUlPqs6M3Z1mwxaCZhVLscxPwuqkp8bHllXa2vJKeDVTinxtvg4DXzZ03rs76fGcWitXNcVWGbIPJYnngGWZKCJPgCzetYUV9KX/7y5cA+NenjgCj+99OrPOOdwa5ckXmcywBHxqBA3zpDWv5m9eumtIK58LOG8tDugYjdoGJE5/blZbOBoaFYuWvZt7ENLInSk2Bri9Pj7Q/do0x92/NHMhAmQgBh4Vy8Fw/RV43C6qKxnjU+LE+rJPiQOaUgNedMfpdVJPdxJ36Mj8Br4sTjo6gYUdWWInPbYu8JeDOVFWv22XXGkwVIuCzjLb+iN0e04nf6yZqCnjnYJRSvwe/x4is/R6XXYzjpMvMG7dK5P/ulnVp999+5VJe+soNdtGDkI69iRlLcKY7RHN1UcbRa5PF2gedpvYZcwqv28Ujn3oVv/jEFfaxRVlG4C6XYmF1MSccPfmdlZ1r55Xb7wfrw34q3h+jrnFaX00Yk9beMI0ZPGmrkx4Ym5jVjs20ymIvW15pt9vOtvWHec1dW/jHP7zCwupi+2vdxYur+cd3nm8/rqrEN2fsk4lgbWKG40ljlNoU9Yr52DXLKQ942Ly0euyThXGzfn4FFzRXcvetF/G2ixbY30izYVldKQdaU9WcVnFPfZmfN54/zz5uZSNNdx6HCPgsIp5I0tYfybip6Pc4LZT0dLa3XriAA639bD3aCcBv97XaHQyHpqYtrTU2cG7e0IQwOs4IvD8cnzIv84LmSvZ8+Ya0OY9C7nntuka+/fbzxz7RwUWLqjjVFbIjb+vvH3/wUm69bLF9nhV3T7cLJgI+i+gYiJJI6pEj8AwFJQDv3bwIMDZbAP54oM3udbKsPv3r4oYFFTx4+2a++84LpuISCgrnJmZfOFbw3RqF4Wwyh0VsP94NwDmzsrOhPP3DdngN+vQg359nEWfN3gqZI3C3nb7UNRhN23hsLA/g87g40TmI1podJ7p5+6YFvH7DPNYPyXlVStk9QITRSRPwULzguzUKw1k3r5yA18ULx7u4eUMTbX1G+fxQO02ZMfh0b0PLO3IWYY1xWlI7PE/V8sC11nQOicBdLsWi6mKOdw5yojNIXzjOhgWVXLFcOhtMBqUUfo+LcCxBv0TgcxKv28XG5iq2nzD6mrT2hWko9w9L87V+nO5MIrFQZhHPHemkodzP4gxpTn6vYaEMRhNE48k0DxyMnfUTnUH2nTU6oJ0nmSU5we9x0TkQJakLf2KRkJnLl9Xw0tk+jnUMcqY7xPzK4amklqCLBz5HCUUTPH2ogyuW1Q77dIdUHnj3kG6FFotrjAi8xexR3FyVXa6rMDp+r5t2s5S60GeGCpl558XNeF0ubr93O4faBliQ4f/W6zc00VQR4NbLFk3r2kTAZwmP7WuhNxQbsezWygO3G1kNi8CLCceS7G/pw+tWIjY5IuB12Y39JQKfm9SXB/jKLes41DZAbyiWMQJvKA/w3OevZ1lddmX6uUIE3MQ5vHYmOG02mt+4sDLj/ZYH7qzCdGIVJ2w/0U1NyXCPTpgYfo+bdrMHjXjgcxdn3/epqMadKCLgQOdAhFX/57fc88yxGVtDfzhG
kdc94oRzn9sYsGulMdWUpKcxWSW9J7uC1JblvmPeXMXvcdmTkCQCn7v4PW4+cvVSFtUUc9Giqplejo0IONBijl36+Y7hvX+ni/5wnLJR0tSsqkBrrVUl6WIyrzJg537XSkFIznA2JxJbam7z+RvXsOVz17J0mm2S0RABnyWMVShiVQWe6w3jc7uGlQN73C6azebyIuC5w9mcSCJwYbYhAg7EzR4iM+mBjxmBm0LSPhChvMgzYqYKYE8JESaPU8BH+/cRhJlABBwjhW+m6QvHKRslwrMG8rb3R0aMBC1b5c0bJz9oVzCwvvmU+Nw5GZ8mCLlEQgrSe/zOFP3hGM2j7G5bQtIxEKG+LLNF8k/v3MiB1j7bShEmj9UmVDJQhNmICDgQmgUC3hcaPQK3vsp3DERYXp95E6WxIpCxEZYwcawPTvG/hdmIfCdkdlgo/eHYqM2SLP81ltDixU4jfjsCl9+5MPsQAWfmI/BwLEEknhz1a/oChy0i0eD0YX3zkd+5MBsRASezB/7gtpMsvvM3DETi/HZfCx++d/uUvb5Vql03grcNRstYr9vIPJEIfPqw8sCbKsWaEmYfogRktlC++8QhAE52BvnofTsAiMaTdjZILrGKcxozzMK0cLsU9WUBzvSEJBqcRoLmeyNTAyNBmGkkAidloUQdU2Wt3PAzPSH7WH84xqmuINuOdeX09VvNMU1jbUDWlhol8qUSgU8bZ3tGHrIhCDONCDgpAQ87InFrQPDp7tRE6v5wnOu/s4V3/OC5nL7+OTMCzzSN3snfv/k8XrW8lsuXyaCG6cKaMi6pmcJsREI5Uh64JeTJpGYgHAfgaPugfV5/OG5H6b2h3E0pb+0LU+R1jzmya/38Cu770KU5eU0hO77yxnVsXlLNxubKmV6KIAxDInBSHnjIIeSWUDvtkv5wzL7tjMwnS3cwSnWJT1rAzkJqS/3cetli+bcRZiVjCrhSqlkp9aRSar9S6iWl1B1D7v8bpZRWSuXt9/quoCHM4ViSeCLJYDRu33fwXL99+0hHKhq3+nfngnAsQbHPPfaJgiAIDrKJwOPAZ7XWa4DNwCeUUmvBEHfgNcDJqVvi1KK15mVzjiTAQCROMGJE4ouGzKY87BDzU125i8BD0QRFIuCCIIyTMQVca92itd5h3u4H9gNWt6R/BP4XMHNt/CbJbf/5Ah0DUTYsMIYA94fjdgR+wRDf86y52QhwuG0gZ2sIRhNpfacFQRCyYVweuFJqMbAR2KqUeiNwRmu9e4zH3K6U2q6U2t7e3j7xlU4BoWiCP71irOmGdY2AsTlp5f6ev6Ay7fxWU8Ariry83NKXs3WIhSIIwkTIWsCVUqXAz4FPY9gqXwS+NNbjtNZ3a603aa031dXVTXSdU8Ke0z0A/Pv7NtmzKPvDcQYjRgS+bl45HpfC61YU+9y09Bq+98WLqzjY2k/ckTc+GYLRBEUSgQuCME6yEnCllBdDvO/XWj8ELAOWALuVUseBBcAOpVTjVC10Kth7xvC+Ny6stKsb+8OpCLyi2Mvi2hKqS3xUFfvs2YirGsuIxJP0hmKZn3ichGIi4IIgjJ8x88CVkT91D7Bfa30XgNZ6L1DvOOc4sElr3TFF65wSTneHKPV7qCnx2VF3fzhO0pzMU+LzcPHiak53B+kLx+2qzKpioyIymqMIPByTTUxBEMZPNhH4FcCtwHVKqV3mn5umeF1TwvbjXXzl1y/ZP5/uDrGgqgillN2Lu88RgRf73Hz1lnX85/svprbEEG2lsAt4YvHc7N2KhSIIwkQYMwLXWj8DjFrFoLVenKsF5ZqeYJTygBeXS/G2fzNK4L9w0xq8bhenu4PMrzSm4Fgd/k53h7jnmWMAlPg99hitGrMPSYnPYze0iiYm34ZWa21YKBKBC4IwTgq6EjOWSHLJ157gpu89TTKZipYHI3G01pwxI3AAr9tFkddtizekD7StMSe9F/vc9vFoDiLwSDyJ1oiAC4IwbgpawM90h4jG
kxxo7ed4Z6qKcjCaIJbQ9Efi1JamenBftqwm7fHO8uka00JJahwR+OQ9cKuMXywUQRDGS0E3szrhqJZ0Ft4MRuK2ZeKMfP/pXRfwx/1tPH+0k61DWsZarV47BiJ4TVsllgsBj4mAC4IwMQpWwPee7uVj971o/3ywNVUGPxCJE4kZ4uu0ScoDXt60cT5v2jifoVy/ugEwmhv53JaFMnkBtzZMxUIRBGG8FKyAf/U3L9viWFfmT4uoByNxu4WsP8vIt8jnZsvnrsGlFO0Dxgi0XFgo1jqklF4QhPFSsB54fzhORZGXb7zlPOZVBDjQmip9H4wYQ4QhPQIfi0U1JTRXF+c0ArcGR3hc0q5UEITxUbACHorGuXplHe+6ZCFVJakqSjAi8Eh84pGvvYmZAwG3Rre5RcAFQRgnBSvgwWiqQVS1WTlpMRiNE87ggWeLL4ebmFbVpwi4IAjjpaAF3NoYrCpJF/CBWRSBWxaKWya+CIIwTgpSwLXWBKNxSnzGHm21Q8CVgmAkkTELJVtymUaYEAtFEIQJUpACHoknSTqqG6scFkp5wMuxzsGcROCRXEbgIuCCIIyTghRwZzMqgMri1PT4t164gN/saeGIOW1+ch745EvpE+KBC4IwQQpUwI3WsJaFsrqxjKpiL3ffehFXLDfK5a3WsDPugSdEwAVBmBgFWcgztLpxaV0pO7/0WgCeO9IJQJeZVjiRCNztUrjUxLsR9gZjRBIJ6ssCdgTukk1MQRDGSYFG4IawlviHR9elfuMzq2vQEPCJVkD6PK4JWyhv/v6fueTvn0BrbXdJ9LhFwAVBGB8FKuCGhVLkHf4Fo9gU9c5Boxx+IhE4GD54thbKkfYBTji6IR41/fc9p3tThTwSgQuCME4KU8Aj6ZuYTpwRuMel7IEN48XncWXdC+X672zh6m89Zf9cZq5h75leu5DHJR64IAjjpDAFPDayhWKJencwNuHoG8YXgQ/F8r1D0YT0QhEEYcIUpoCbA4qLfBksFMexyXQA9HlcI+aBn+wMctnXn+CUox85QG8oxh8PnLOFPxRL2BaKbGIKgjBeClPArTzwDALtdil7eEKFIz98vFQUeekJRjPe99DO07T0hnnwhZNpxz/5wA4+8F/bbdEOxRL2JqakEQqCMF4KUsCtKTfFGSwUMIYVQ2pM2kSoK/OndTh0YpXudw3GbIsEDM87bZ3RhG2niIUiCMJ4KUgBH4zEcbuUXTE5FMsbrynxZ7w/G2pL/bT3RzLeZ1kz3YNRBs2MGICeYCztPKcHLpuYgiCMl4IUcKuVrBrBV7YqNGtKJx6B15b66RqMpEXYFpY33hWMMhCOD7vfIhRLSDdCQRAmTEEKeMjRCzwTpYHJWyi1pT6SGroz+OBh04PvGowyEBlZwIOOCNwthTyCIIyTghTwwWg8LdtkKMvrS4FUW9iJUFtm2C+ZbBTLgw/HEvSPEIE3VQQISwQuCMIkKEgBHysCv2ZlHTC5LBRrys9QXxuGCvjw+wHmVRYZFop0IxQEYYIUpIAbEfjIAv7adY385PbNvPuShRN+jWIzkyUUGx5hW5Pm+8NxOwJf2VCadk5VsY9gVNIIBUGYOGMKuFKqWSn1pFJqv1LqJaXUHebxbymlDiil9iilHlZKVU75arMkFE1kLOJxcunSmgmX0UOqonMwMrwjoSXgkXiSV871oxS8eeOCYY8POwp5xEIRBGG8ZKNgceCzWus1wGbgE0qptcDjwHqt9QbgFeDzU7fM8RGMJigZJQLPBVYxUCg6XMCdx7Ye7WJJTQnXrq4b9viQIwKXNEJBEMbLmAKutW7RWu8wb/cD+4H5Wuvfa60t/+B5YMFIzzHdOAcaTxVWBB6MDrdQLA8cYNvxLtbMK2dlfRnvv3yxfbzI5yYYjRNPainiEQRhQoxroINSajGwEdg65K4PAD8Z4TG3A7cDLFw4cc95PDgHGk8VVpZLMJbJQknvkbJuXjkul+LLb1zH1mNdXNBcQZHPTX8k
zosnuiX6FgRhQmStckqpUuDnwKe11n2O41/EsFnuz/Q4rfXdwN0AmzZtmvwQySwIjpGFkgsCXhdKjWChDBH1tU3l9u3H7rgSgH//01G0hq3HuqZ0nYIgFC5Z7eIppbwY4n2/1vohx/HbgNcD79FaT4s4j0UiqYnEk1NuoSilKPa67cZZTsKxBLWOKs918yqGnVNbNvEiIkEQBMguC0UB9wD7tdZ3OY6/DvjfwBu11sGRHj/dDB1oPJUU+TzDBPzls33sOd2b1melrmx4z5XJ9GERBEGA7CyUK4Bbgb1KqV3msS8A3wP8wONmz5HntdYfnYpFjoehA42nkmJzI9LJkwfbAHjD+U0E9rv5q+uWZ3xsbakIuCAIk2NMAddaPwNk2mV7NPfLmTyjDTTONYaAp0fgvSFj0s8nr1vBJ69bMeJjxUIRBGGyFFwl5mgDjXNNsc89bBOzNxijMosSfasUXxAEYaIUoICPPNA41xT7PMMslN5QjIqisQXc43YR8Bbcr18QhGkk7xTkqYNtPLjt5Ij3T6eFUlHs5URnkD5Hw6qeUDQrAQd4/K+vnqqlCYIwB8g7AX//f77AnQ/tzThIASA0jRbKX16+mM7BKI/uabGP9YbiWQu4XyJwQRAmQd4qyIHWvozHreZS02GhrJ9v5Hd3DhpDHQ629rO/pY+Kouz87UCGocuCIAjZkncCbqXf7Tndm/H+4BgDjXNJwOvG53HRFzIslA/fux2ASHx4cU8m/J68+/ULgjCLyDsFKfIZSx5pUELQHGE22kSeXFJZ5LWHOpSaPcKtv8dipKHLgiAI2ZB3CpI0+0SFosmM99uFPNNkT1QUeek1I3BrzNoXbl6T1WNHGrosCIKQDdMTpuaARFLTG4oRTRjCHR5iU8QSSR7cdpLeUIxin3vaJtw4BTwYiXP5shrKAxMf1SYIgpAteSHgX39sPz/YcjTt2NACmvufP8GXf/0yAPUZeo9MFRVFXlp6w/zupVa2n+jm1Wsapu21BUGY2+SFhZIpoh26UTjoEPSywPR9LlkR+Ed+/CIwPfnngiAIkCcCvqapbNixoRG4c0OwdBotjPIir52FAlCS5QamIAjCZMkLAV/dWD7s2NCpNz5HSl7ZNIpoeZGX/kiqnH6qZ3EKgiBY5EW42FQR4C8uWchPt5+yKzCHTr1xzpPINo0vFwwVbL9nfAL+f25eM20pj4IgFBZ5oRxKKb7+lvPoDUV5dG8rMFzAw/FURF46jR548ZAPCytLJls+dOXSXC5HEIQ5RF5YKBbOEvXIUAF3/DyTEfjQdQmCIEwVeSXgi2qK7dtDI3Dnz+XTGYEPsT+GevOCIAhTRV5YKBYffNUSSnxunj/Wxd4hvVDCjqyU6bRQhqYNvvOS5ml7bUEQ5jZ5FYF73S5uvWwx5QHPiBH4/Moiu0vgdOCMwB/9qyu5cGHVtL22IAhzm7yKwC38Hnea5w0QiiVZUlvCk39zzbSuxRmBV5fImDRBEKaPvIrALYp8wwU8HEvMSHvWEkcEXjeNJfyCIAh5KeAlPjexhE4rpw/HEhTNQBGNs/JyuhpoCYIgQJ4KeLk5sqwvlKqADEUT09ZC1sl0TP4RBEHIRF4KuDVzstfRgyQUmxkBl6k6giDMFHmpPnYEHk4X8JmYMWkNZVg/f3i/FkEQhKkkL7NQnBH4s0c62Heml1A0MWOtXP9853VUZjmJXhAEIVeMKeBKqWbgXqARSAJ3a62/q5SqBn4CLAaOA+/QWndP3VJTWP3B+0Ix7nhwF2D0AJ+pplDzK4tm5HUFQZjbZGOhxIHPaq3XAJuBTyil1gJ3Ak9orVcAT5g/TwtWBN7eH7GP9YfjMkxBEIQ5xZgCrrVu0VrvMG/3A/uB+cAtwI/M034EvGmK1jgMS8D/bciYNWnLKgjCXGJcm5hKqcXARmAr0KC1bgFD5IH6ER5zu1Jqu1Jqe3t7+ySXa+DzuCjyuukYiOBMvZaUPkEQ5hJZC7hSqhT4OfBprXVf
to/TWt+ttd6ktd5UV1c3kTVmpLbMKFt/58UL7WMlEoELgjCHyErAlVJeDPG+X2v9kHn4nFKqyby/CWibmiVmZmOz0TTqutX1mJl8FIsHLgjCHGJMAVdGovM9wH6t9V2Ou34F3Gbevg34Ze6XNzJ/d8s6PnfDKq5dVUddqdGDRCJwQRDmEtlE4FcAtwLXKaV2mX9uAr4BvEYpdQh4jfnztFFZ7OMT1y7H43bRWBEAxAMXBGFuMWbIqrV+BhipS9P1uV3OxGgsD7CH3rTGUoIgCIVOXpbSD0UicEEQ5iIFJuASgQuCMHcoCMV7w4Z5RGJJGsploIIgCHOHghDw5upi/vo1K2d6GYIgCNNKQVgogiAIcxERcEEQhDxFBFwQBCFPEQEXBEHIU0TABUEQ8hQRcEEQhDxFBFwQBCFPEQEXBEHIU5TWevpeTKl24MQEH14LdORwObOVuXKdMHeuVa6z8Jjua12ktR42EWdaBXwyKKW2a603zfQ6ppq5cp0wd65VrrPwmC3XKhaKIAhCniICLgiCkKfkk4DfPdMLmCbmynXC3LlWuc7CY1Zca9544IIgCEI6+RSBC4IgCA5EwAVBEPKUvBBwpdTrlFIHlVKHlVJ3zvR6JoNS6j+UUm1KqX2OY9VKqceVUofMv6sc933evO6DSqkbZmbV40cp1ayUelIptV8p9ZJS6g7zeEFdq1IqoJTappTabV7nV8zjBXWdFkopt1Jqp1LqEfPnQr3O40qpvUqpXUqp7eax2XetWutZ/QdwA0eApYAP2A2snel1TeJ6rgIuBPY5jn0TuNO8fSfwD+btteb1+oEl5u/BPdPXkOV1NgEXmrfLgFfM6ymoawUUUGre9gJbgc2Fdp2O6/0M8ADwiPlzoV7ncaB2yLFZd635EIFfAhzWWh/VWkeBB4FbZnhNE0Zr/Sega8jhW4Afmbd/BLzJcfxBrXVEa30MOIzx+5j1aK1btNY7zNv9wH5gPgV2rdpgwPzRa/7RFNh1AiilFgA3Az90HC646xyFWXet+SDg84FTjp9Pm8cKiQatdQsYwgfUm8cL4tqVUouBjRjRacFdq2kr7ALagMe11gV5ncA/Af8LSDqOFeJ1gvEh/Hul1ItKqdvNY7PuWvNhqLHKcGyu5D7m/bUrpUqBnwOf1lr3KZXpkoxTMxzLi2vVWieAC5RSlcDDSqn1o5yel9eplHo90Ka1flEpdU02D8lwbNZfp4MrtNZnlVL1wONKqQOjnDtj15oPEfhpoNnx8wLg7AytZao4p5RqAjD/bjOP5/W1K6W8GOJ9v9b6IfNwQV4rgNa6B3gKeB2Fd51XAG9USh3HsDGvU0rdR+FdJwBa67Pm323AwxiWyKy71nwQ8BeAFUqpJUopH/Au4FczvKZc8yvgNvP2bcAvHcffpZTyK6WWACuAbTOwvnGjjFD7HmC/1voux10Fda1KqToz8kYpVQS8GjhAgV2n1vrzWusFWuvFGP8H/6i1fi8Fdp0ASqkSpVSZdRt4LbCP2XitM73bm+WO8E0YWQxHgC/O9HomeS3/DbQAMYxP7g8CNcATwCHz72rH+V80r/sgcONMr38c1/kqjK+Re4Bd5p+bCu1agQ3ATvM69wFfMo8X1HUOueZrSGWhFNx1YmS87Tb/vGRpzmy8VimlFwRByFPywUIRBEEQMiACLgiCkKeIgAuCIOQpIuCCIAh5igi4IAhCniICLgiCkKeIgAuCIOQp/x/JaILIIWB+KQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "stockdata['Low'].plot()" ] }, { "cell_type": "markdown", "metadata": { "id": "0t25tjnBtXhx" }, "source": [ "## Trouble with your CSV files?\n", "For more info on how to format your 'read_csv' commands, or if you're running into problems related to the comma-versus-tab-versus-semicolon issue, look at the help function:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "bZmQD21ztXhz" }, "outputs": [ { "data": { "text/plain": [ "\u001b[0;31mSignature:\u001b[0m\n", "\u001b[0mpd\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread_csv\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mfilepath_or_buffer\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'FilePathOrBuffer'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0msep\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m<\u001b[0m\u001b[0mno_default\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdelimiter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mheader\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'infer'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mnames\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m<\u001b[0m\u001b[0mno_default\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mindex_col\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0musecols\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0msqueeze\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m 
\u001b[0mprefix\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m<\u001b[0m\u001b[0mno_default\u001b[0m\u001b[0;34m>\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mmangle_dupe_cols\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'DtypeArg | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mengine\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mconverters\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mtrue_values\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mfalse_values\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mskipinitialspace\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mskiprows\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mskipfooter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mnrows\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mna_values\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mkeep_default_na\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", 
"\u001b[0;34m\u001b[0m \u001b[0mna_filter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mskip_blank_lines\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mparse_dates\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0minfer_datetime_format\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mkeep_date_col\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdate_parser\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdayfirst\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcache_dates\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0miterator\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mchunksize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcompression\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'infer'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mthousands\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdecimal\u001b[0m\u001b[0;34m:\u001b[0m 
\u001b[0;34m'str'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'.'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mlineterminator\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mquotechar\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'\"'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mquoting\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdoublequote\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mescapechar\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcomment\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mencoding\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mencoding_errors\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str | None'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'strict'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdialect\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0merror_bad_lines\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mwarn_bad_lines\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mon_bad_lines\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m 
\u001b[0mdelim_whitespace\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mlow_memory\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mmemory_map\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mfloat_precision\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstorage_options\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'StorageOptions'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m\n", "Read a comma-separated values (csv) file into DataFrame.\n", "\n", "Also supports optionally iterating or breaking of the file\n", "into chunks.\n", "\n", "Additional help can be found in the online docs for\n", "`IO Tools `_.\n", "\n", "Parameters\n", "----------\n", "filepath_or_buffer : str, path object or file-like object\n", " Any valid string path is acceptable. The string could be a URL. Valid\n", " URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is\n", " expected. A local file could be: file://localhost/path/to/table.csv.\n", "\n", " If you want to pass in a path object, pandas accepts any ``os.PathLike``.\n", "\n", " By file-like object, we refer to objects with a ``read()`` method, such as\n", " a file handle (e.g. via builtin ``open`` function) or ``StringIO``.\n", "sep : str, default ','\n", " Delimiter to use. 
If sep is None, the C engine cannot automatically detect\n", " the separator, but the Python parsing engine can, meaning the latter will\n", " be used and automatically detect the separator by Python's builtin sniffer\n", " tool, ``csv.Sniffer``. In addition, separators longer than 1 character and\n", " different from ``'\\s+'`` will be interpreted as regular expressions and\n", " will also force the use of the Python parsing engine. Note that regex\n", " delimiters are prone to ignoring quoted data. Regex example: ``'\\r\\t'``.\n", "delimiter : str, default ``None``\n", " Alias for sep.\n", "header : int, list of int, default 'infer'\n", " Row number(s) to use as the column names, and the start of the\n", " data. Default behavior is to infer the column names: if no names\n", " are passed the behavior is identical to ``header=0`` and column\n", " names are inferred from the first line of the file, if column\n", " names are passed explicitly then the behavior is identical to\n", " ``header=None``. Explicitly pass ``header=0`` to be able to\n", " replace existing names. The header can be a list of integers that\n", " specify row locations for a multi-index on the columns\n", " e.g. [0,1,3]. Intervening rows that are not specified will be\n", " skipped (e.g. 2 in this example is skipped). Note that this\n", " parameter ignores commented lines and empty lines if\n", " ``skip_blank_lines=True``, so ``header=0`` denotes the first line of\n", " data rather than the first line of the file.\n", "names : array-like, optional\n", " List of column names to use. If the file contains a header row,\n", " then you should explicitly pass ``header=0`` to override the column names.\n", " Duplicates in this list are not allowed.\n", "index_col : int, str, sequence of int / str, or False, default ``None``\n", " Column(s) to use as the row labels of the ``DataFrame``, either given as\n", " string name or column index. 
If a sequence of int / str is given, a\n", " MultiIndex is used.\n", "\n", " Note: ``index_col=False`` can be used to force pandas to *not* use the first\n", " column as the index, e.g. when you have a malformed file with delimiters at\n", " the end of each line.\n", "usecols : list-like or callable, optional\n", " Return a subset of the columns. If list-like, all elements must either\n", " be positional (i.e. integer indices into the document columns) or strings\n", " that correspond to column names provided either by the user in `names` or\n", " inferred from the document header row(s). For example, a valid list-like\n", " `usecols` parameter would be ``[0, 1, 2]`` or ``['foo', 'bar', 'baz']``.\n", " Element order is ignored, so ``usecols=[0, 1]`` is the same as ``[1, 0]``.\n", " To instantiate a DataFrame from ``data`` with element order preserved use\n", " ``pd.read_csv(data, usecols=['foo', 'bar'])[['foo', 'bar']]`` for columns\n", " in ``['foo', 'bar']`` order or\n", " ``pd.read_csv(data, usecols=['foo', 'bar'])[['bar', 'foo']]``\n", " for ``['bar', 'foo']`` order.\n", "\n", " If callable, the callable function will be evaluated against the column\n", " names, returning names where the callable function evaluates to True. An\n", " example of a valid callable argument would be ``lambda x: x.upper() in\n", " ['AAA', 'BBB', 'DDD']``. Using this parameter results in much faster\n", " parsing time and lower memory usage.\n", "squeeze : bool, default False\n", " If the parsed data only contains one column then return a Series.\n", "prefix : str, optional\n", " Prefix to add to column numbers when no header, e.g. 'X' for X0, X1, ...\n", "mangle_dupe_cols : bool, default True\n", " Duplicate columns will be specified as 'X', 'X.1', ...'X.N', rather than\n", " 'X'...'X'. Passing in False will cause data to be overwritten if there\n", " are duplicate names in the columns.\n", "dtype : Type name or dict of column -> type, optional\n", " Data type for data or columns. 
E.g. {'a': np.float64, 'b': np.int32,\n", " 'c': 'Int64'}\n", " Use `str` or `object` together with suitable `na_values` settings\n", " to preserve and not interpret dtype.\n", " If converters are specified, they will be applied INSTEAD\n", " of dtype conversion.\n", "engine : {'c', 'python'}, optional\n", " Parser engine to use. The C engine is faster while the python engine is\n", " currently more feature-complete.\n", "converters : dict, optional\n", " Dict of functions for converting values in certain columns. Keys can either\n", " be integers or column labels.\n", "true_values : list, optional\n", " Values to consider as True.\n", "false_values : list, optional\n", " Values to consider as False.\n", "skipinitialspace : bool, default False\n", " Skip spaces after delimiter.\n", "skiprows : list-like, int or callable, optional\n", " Line numbers to skip (0-indexed) or number of lines to skip (int)\n", " at the start of the file.\n", "\n", " If callable, the callable function will be evaluated against the row\n", " indices, returning True if the row should be skipped and False otherwise.\n", " An example of a valid callable argument would be ``lambda x: x in [0, 2]``.\n", "skipfooter : int, default 0\n", " Number of lines at bottom of file to skip (Unsupported with engine='c').\n", "nrows : int, optional\n", " Number of rows of file to read. Useful for reading pieces of large files.\n", "na_values : scalar, str, list-like, or dict, optional\n", " Additional strings to recognize as NA/NaN. If dict passed, specific\n", " per-column NA values. 
By default the following values are interpreted as\n", " NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan',\n", " '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NA', 'NULL', 'NaN', 'n/a',\n", " 'nan', 'null'.\n", "keep_default_na : bool, default True\n", " Whether or not to include the default NaN values when parsing the data.\n", " Depending on whether `na_values` is passed in, the behavior is as follows:\n", "\n", " * If `keep_default_na` is True, and `na_values` are specified, `na_values`\n", " is appended to the default NaN values used for parsing.\n", " * If `keep_default_na` is True, and `na_values` are not specified, only\n", " the default NaN values are used for parsing.\n", " * If `keep_default_na` is False, and `na_values` are specified, only\n", " the NaN values specified `na_values` are used for parsing.\n", " * If `keep_default_na` is False, and `na_values` are not specified, no\n", " strings will be parsed as NaN.\n", "\n", " Note that if `na_filter` is passed in as False, the `keep_default_na` and\n", " `na_values` parameters will be ignored.\n", "na_filter : bool, default True\n", " Detect missing value markers (empty strings and the value of na_values). In\n", " data without any NAs, passing na_filter=False can improve the performance\n", " of reading a large file.\n", "verbose : bool, default False\n", " Indicate number of NA values placed in non-numeric columns.\n", "skip_blank_lines : bool, default True\n", " If True, skip over blank lines rather than interpreting as NaN values.\n", "parse_dates : bool or list of int or names or list of lists or dict, default False\n", " The behavior is as follows:\n", "\n", " * boolean. If True -> try parsing the index.\n", " * list of int or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3\n", " each as a separate date column.\n", " * list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as\n", " a single date column.\n", " * dict, e.g. 
{'foo' : [1, 3]} -> parse columns 1, 3 as date and call\n", " result 'foo'\n", "\n", " If a column or index cannot be represented as an array of datetimes,\n", " say because of an unparsable value or a mixture of timezones, the column\n", " or index will be returned unaltered as an object data type. For\n", " non-standard datetime parsing, use ``pd.to_datetime`` after\n", " ``pd.read_csv``. To parse an index or column with a mixture of timezones,\n", " specify ``date_parser`` to be a partially-applied\n", " :func:`pandas.to_datetime` with ``utc=True``. See\n", " :ref:`io.csv.mixed_timezones` for more.\n", "\n", " Note: A fast-path exists for iso8601-formatted dates.\n", "infer_datetime_format : bool, default False\n", " If True and `parse_dates` is enabled, pandas will attempt to infer the\n", " format of the datetime strings in the columns, and if it can be inferred,\n", " switch to a faster method of parsing them. In some cases this can increase\n", " the parsing speed by 5-10x.\n", "keep_date_col : bool, default False\n", " If True and `parse_dates` specifies combining multiple columns then\n", " keep the original columns.\n", "date_parser : function, optional\n", " Function to use for converting a sequence of string columns to an array of\n", " datetime instances. The default uses ``dateutil.parser.parser`` to do the\n", " conversion. 
Pandas will try to call `date_parser` in three different ways,\n", " advancing to the next if an exception occurs: 1) Pass one or more arrays\n", " (as defined by `parse_dates`) as arguments; 2) concatenate (row-wise) the\n", " string values from the columns defined by `parse_dates` into a single array\n", " and pass that; and 3) call `date_parser` once for each row using one or\n", " more strings (corresponding to the columns defined by `parse_dates`) as\n", " arguments.\n", "dayfirst : bool, default False\n", " DD/MM format dates, international and European format.\n", "cache_dates : bool, default True\n", " If True, use a cache of unique, converted dates to apply the datetime\n", " conversion. May produce significant speed-up when parsing duplicate\n", " date strings, especially ones with timezone offsets.\n", "\n", " .. versionadded:: 0.25.0\n", "iterator : bool, default False\n", " Return TextFileReader object for iteration or getting chunks with\n", " ``get_chunk()``.\n", "\n", " .. versionchanged:: 1.2\n", "\n", " ``TextFileReader`` is a context manager.\n", "chunksize : int, optional\n", " Return TextFileReader object for iteration.\n", " See the `IO Tools docs\n", " `_\n", " for more information on ``iterator`` and ``chunksize``.\n", "\n", " .. versionchanged:: 1.2\n", "\n", " ``TextFileReader`` is a context manager.\n", "compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'\n", " For on-the-fly decompression of on-disk data. If 'infer' and\n", " `filepath_or_buffer` is path-like, then detect compression from the\n", " following extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no\n", " decompression). If using 'zip', the ZIP file must contain only one data\n", " file to be read in. Set to None for no decompression.\n", "thousands : str, optional\n", " Thousands separator.\n", "decimal : str, default '.'\n", " Character to recognize as decimal point (e.g. 
use ',' for European data).\n", "lineterminator : str (length 1), optional\n", " Character to break file into lines. Only valid with C parser.\n", "quotechar : str (length 1), optional\n", " The character used to denote the start and end of a quoted item. Quoted\n", " items can include the delimiter and it will be ignored.\n", "quoting : int or csv.QUOTE_* instance, default 0\n", " Control field quoting behavior per ``csv.QUOTE_*`` constants. Use one of\n", " QUOTE_MINIMAL (0), QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or QUOTE_NONE (3).\n", "doublequote : bool, default ``True``\n", " When quotechar is specified and quoting is not ``QUOTE_NONE``, indicate\n", " whether or not to interpret two consecutive quotechar elements INSIDE a\n", " field as a single ``quotechar`` element.\n", "escapechar : str (length 1), optional\n", " One-character string used to escape other characters.\n", "comment : str, optional\n", " Indicates remainder of line should not be parsed. If found at the beginning\n", " of a line, the line will be ignored altogether. This parameter must be a\n", " single character. Like empty lines (as long as ``skip_blank_lines=True``),\n", " fully commented lines are ignored by the parameter `header` but not by\n", " `skiprows`. For example, if ``comment='#'``, parsing\n", " ``#empty\\na,b,c\\n1,2,3`` with ``header=0`` will result in 'a,b,c' being\n", " treated as the header.\n", "encoding : str, optional\n", " Encoding to use for UTF when reading/writing (ex. 'utf-8'). `List of Python\n", " standard encodings\n", " `_ .\n", "\n", " .. versionchanged:: 1.2\n", "\n", " When ``encoding`` is ``None``, ``errors=\"replace\"`` is passed to\n", " ``open()``. Otherwise, ``errors=\"strict\"`` is passed to ``open()``.\n", " This behavior was previously only the case for ``engine=\"python\"``.\n", "\n", " .. versionchanged:: 1.3.0\n", "\n", " ``encoding_errors`` is a new argument. 
``encoding`` has no longer an\n", " influence on how encoding errors are handled.\n", "\n", "encoding_errors : str, optional, default \"strict\"\n", " How encoding errors are treated. `List of possible values\n", " `_ .\n", "\n", " .. versionadded:: 1.3.0\n", "\n", "dialect : str or csv.Dialect, optional\n", " If provided, this parameter will override values (default or not) for the\n", " following parameters: `delimiter`, `doublequote`, `escapechar`,\n", " `skipinitialspace`, `quotechar`, and `quoting`. If it is necessary to\n", " override values, a ParserWarning will be issued. See csv.Dialect\n", " documentation for more details.\n", "error_bad_lines : bool, default ``None``\n", " Lines with too many fields (e.g. a csv line with too many commas) will by\n", " default cause an exception to be raised, and no DataFrame will be returned.\n", " If False, then these \"bad lines\" will be dropped from the DataFrame that is\n", " returned.\n", "\n", " .. deprecated:: 1.3.0\n", " The ``on_bad_lines`` parameter should be used instead to specify behavior upon\n", " encountering a bad line instead.\n", "warn_bad_lines : bool, default ``None``\n", " If error_bad_lines is False, and warn_bad_lines is True, a warning for each\n", " \"bad line\" will be output.\n", "\n", " .. deprecated:: 1.3.0\n", " The ``on_bad_lines`` parameter should be used instead to specify behavior upon\n", " encountering a bad line instead.\n", "on_bad_lines : {'error', 'warn', 'skip'}, default 'error'\n", " Specifies what to do upon encountering a bad line (a line with too many fields).\n", " Allowed values are :\n", "\n", " - 'error', raise an Exception when a bad line is encountered.\n", " - 'warn', raise a warning when a bad line is encountered and skip that line.\n", " - 'skip', skip bad lines without raising or warning when they are encountered.\n", "\n", " .. versionadded:: 1.3.0\n", "\n", "delim_whitespace : bool, default False\n", " Specifies whether or not whitespace (e.g. 
``' '`` or ``' '``) will be\n", " used as the sep. Equivalent to setting ``sep='\\s+'``. If this option\n", " is set to True, nothing should be passed in for the ``delimiter``\n", " parameter.\n", "low_memory : bool, default True\n", " Internally process the file in chunks, resulting in lower memory use\n", " while parsing, but possibly mixed type inference. To ensure no mixed\n", " types either set False, or specify the type with the `dtype` parameter.\n", " Note that the entire file is read into a single DataFrame regardless,\n", " use the `chunksize` or `iterator` parameter to return the data in chunks.\n", " (Only valid with C parser).\n", "memory_map : bool, default False\n", " If a filepath is provided for `filepath_or_buffer`, map the file object\n", " directly onto memory and access the data directly from there. Using this\n", " option can improve performance because there is no longer any I/O overhead.\n", "float_precision : str, optional\n", " Specifies which converter the C engine should use for floating-point\n", " values. The options are ``None`` or 'high' for the ordinary converter,\n", " 'legacy' for the original lower precision pandas converter, and\n", " 'round_trip' for the round-trip converter.\n", "\n", " .. versionchanged:: 1.2\n", "\n", "storage_options : dict, optional\n", " Extra options that make sense for a particular storage connection, e.g.\n", " host, port, username, password, etc. For HTTP(S) URLs the key-value pairs\n", " are forwarded to ``urllib`` as header options. For other URLs (e.g.\n", " starting with \"s3://\", and \"gcs://\") the key-value pairs are forwarded to\n", " ``fsspec``. Please see ``fsspec`` and ``urllib`` for more details.\n", "\n", " .. 
versionadded:: 1.2\n", "\n", "Returns\n", "-------\n", "DataFrame or TextParser\n", " A comma-separated values (csv) file is returned as two-dimensional\n", " data structure with labeled axes.\n", "\n", "See Also\n", "--------\n", "DataFrame.to_csv : Write DataFrame to a comma-separated values (csv) file.\n", "read_csv : Read a comma-separated values (csv) file into DataFrame.\n", "read_fwf : Read a table of fixed-width formatted lines into DataFrame.\n", "\n", "Examples\n", "--------\n", ">>> pd.read_csv('data.csv') # doctest: +SKIP\n", "\u001b[0;31mFile:\u001b[0m ~/opt/anaconda3/envs/dj21/lib/python3.7/site-packages/pandas/io/parsers/readers.py\n", "\u001b[0;31mType:\u001b[0m function\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "pd.read_csv?" ] }, { "cell_type": "markdown", "metadata": { "id": "RM4IoAuotXh5" }, "source": [ "## What if the data isn't in .csv format, but is online? \n", " \n", " There is a simple but very handy scraping tool that lets you grab tables from web pages and turn them into pandas DataFrames. You can then analyze them with the tools we just used, or save them as a csv file on your computer and load that file into Jupyter later. The tool is called read_html: you give it the URL of a web page, and it returns a list of all the tables it could find there. It won't work with every website (and not everything it scrapes will be relevant or useful to you), but it is really handy when it does work. Let's look, for example, at Wikipedia's page on the premier Dutch football league.\n", " \n", " First, load the URL into your browser in another tab to look at the original page."
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "gGVHM3fWtXh6" }, "outputs": [], "source": [ "alltables = pd.read_html('https://en.wikipedia.org/wiki/Eredivisie')" ] }, { "cell_type": "markdown", "metadata": { "id": "bQlrritQtXiF" }, "source": [ "Look at the following code carefully to see what we're doing here. We're introducing a new method (\"format\") which works on any string: it fills in a value at the position of the curly brackets. We are also using a function we already know from the first part of today's lesson: \"len\". " ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "-o1mn7JxtXiF" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "We have downloaded 19 tables\n" ] } ], "source": [ "print('We have downloaded {} tables'.format(len(alltables)))" ] }, { "cell_type": "markdown", "metadata": { "id": "IAD4vQvEtXiK" }, "source": [ "Here is another, perhaps simpler way to do this, although it is less versatile if you want to do fancier stuff someday. \n", "\n", "**The point here is that there are multiple ways to do many things in Python; we just want you to master one way and know why it's useful to you.**" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "id": "sO0KxOqytXiN" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "We have downloaded 19 tables.\n" ] } ], "source": [ "print('We have downloaded', len(alltables), 'tables.')" ] }, { "cell_type": "markdown", "metadata": { "id": "-jCpGItmtXiV" }, "source": [ "Let's look at, say, the third table in this set." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "iS3QZA-wtXiX" }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>Club</th><th>Winner</th><th>Runner-up</th><th>Winning years</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>Ajax</td><td>35</td><td>23</td><td>1917–18, 1918–19, 1930–31, 1931–32, 1933–34, 1...</td></tr>\n",
" <tr><th>1</th><td>PSV Eindhoven</td><td>24</td><td>14</td><td>1928–29, 1934–35, 1950–51, 1962–63, 1974–75, 1...</td></tr>\n",
" <tr><th>2</th><td>Feyenoord</td><td>15</td><td>21</td><td>1923–24, 1927–28, 1935–36, 1937–38, 1939–40, 1...</td></tr>\n",
" <tr><th>3</th><td>HVV Den Haag</td><td>10</td><td>1</td><td>1890–91, 1895–96, 1899–1900, 1900–01, 1901–02,...</td></tr>\n",
" <tr><th>4</th><td>Sparta Rotterdam</td><td>6</td><td>–</td><td>1908–09, 1910–11, 1911–12, 1912–13, 1914–15, 1...</td></tr>\n",
" <tr><th>5</th><td>RAP</td><td>5</td><td>3</td><td>1891–92, 1893–94, 1896–97, 1897–98, 1898–99</td></tr>\n",
" <tr><th>6</th><td>Go Ahead Eagles</td><td>4</td><td>5</td><td>1916–17, 1921–22, 1929–30, 1932–33</td></tr>\n",
" <tr><th>7</th><td>Koninklijke HFC</td><td>3</td><td>3</td><td>1889–90, 1892–93, 1894–95</td></tr>\n",
" <tr><th>8</th><td>Willem II</td><td>3</td><td>1</td><td>1915–16, 1951–52, 1954–55</td></tr>\n",
" <tr><th>9</th><td>HBS Craeyenhout</td><td>3</td><td>–</td><td>1903–04, 1905–06, 1924–25</td></tr>\n",
" <tr><th>10</th><td>AZ</td><td>2</td><td>3</td><td>1980–81, 2008–09</td></tr>\n",
" <tr><th>11</th><td>Heracles Almelo</td><td>2</td><td>1</td><td>1926–27, 1940–41</td></tr>\n",
" <tr><th>12</th><td>ADO Den Haag</td><td>2</td><td>–</td><td>1941–42, 1942–43</td></tr>\n",
" <tr><th>13</th><td>RCH</td><td>2</td><td>–</td><td>1922–23, 1952–53</td></tr>\n",
" <tr><th>14</th><td>NAC Breda</td><td>1</td><td>4</td><td>1920–21</td></tr>\n",
" <tr><th>15</th><td>FC Twente</td><td>1</td><td>3</td><td>2009–10</td></tr>\n",
" <tr><th>16</th><td>DWS</td><td>1</td><td>3</td><td>1963–64</td></tr>\n",
" <tr><th>17</th><td>Roda JC Kerkrade*</td><td>1</td><td>2</td><td>1955–56</td></tr>\n",
" <tr><th>18</th><td>Be Quick</td><td>1</td><td>2</td><td>1919–20</td></tr>\n",
" <tr><th>19</th><td>FC Eindhoven</td><td>1</td><td>2</td><td>1953–54</td></tr>\n",
" <tr><th>20</th><td>SC Enschede</td><td>1</td><td>1</td><td>1925–26</td></tr>\n",
" <tr><th>21</th><td>DOS</td><td>1</td><td>1</td><td>1957–58</td></tr>\n",
" <tr><th>22</th><td>FC Den Bosch</td><td>1</td><td>1</td><td>1947–48</td></tr>\n",
" <tr><th>23</th><td>De Volewijckers</td><td>1</td><td>–</td><td>1943–44</td></tr>\n",
" <tr><th>24</th><td>HFC Haarlem</td><td>1</td><td>–</td><td>1945–46</td></tr>\n",
" <tr><th>25</th><td>Limburgia</td><td>1</td><td>–</td><td>1949–50</td></tr>\n",
" <tr><th>26</th><td>SVV</td><td>1</td><td>–</td><td>1948–49</td></tr>\n",
" <tr><th>27</th><td>Quick Den Haag</td><td>1</td><td>–</td><td>1907–08</td></tr>\n",
" <tr><th>28</th><td>VV Concordia</td><td>1</td><td>–</td><td>1888–89</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Club Winner Runner-up \\\n", "0 Ajax 35 23 \n", "1 PSV Eindhoven 24 14 \n", "2 Feyenoord 15 21 \n", "3 HVV Den Haag 10 1 \n", "4 Sparta Rotterdam 6 – \n", "5 RAP 5 3 \n", "6 Go Ahead Eagles 4 5 \n", "7 Koninklijke HFC 3 3 \n", "8 Willem II 3 1 \n", "9 HBS Craeyenhout 3 – \n", "10 AZ 2 3 \n", "11 Heracles Almelo 2 1 \n", "12 ADO Den Haag 2 – \n", "13 RCH 2 – \n", "14 NAC Breda 1 4 \n", "15 FC Twente 1 3 \n", "16 DWS 1 3 \n", "17 Roda JC Kerkrade* 1 2 \n", "18 Be Quick 1 2 \n", "19 FC Eindhoven 1 2 \n", "20 SC Enschede 1 1 \n", "21 DOS 1 1 \n", "22 FC Den Bosch 1 1 \n", "23 De Volewijckers 1 – \n", "24 HFC Haarlem 1 – \n", "25 Limburgia 1 – \n", "26 SVV 1 – \n", "27 Quick Den Haag 1 – \n", "28 VV Concordia 1 – \n", "\n", " Winning years \n", "0 1917–18, 1918–19, 1930–31, 1931–32, 1933–34, 1... \n", "1 1928–29, 1934–35, 1950–51, 1962–63, 1974–75, 1... \n", "2 1923–24, 1927–28, 1935–36, 1937–38, 1939–40, 1... \n", "3 1890–91, 1895–96, 1899–1900, 1900–01, 1901–02,... \n", "4 1908–09, 1910–11, 1911–12, 1912–13, 1914–15, 1... \n", "5 1891–92, 1893–94, 1896–97, 1897–98, 1898–99 \n", "6 1916–17, 1921–22, 1929–30, 1932–33 \n", "7 1889–90, 1892–93, 1894–95 \n", "8 1915–16, 1951–52, 1954–55 \n", "9 1903–04, 1905–06, 1924–25 \n", "10 1980–81, 2008–09 \n", "11 1926–27, 1940–41 \n", "12 1941–42, 1942–43 \n", "13 1922–23, 1952–53 \n", "14 1920–21 \n", "15 2009–10 \n", "16 1963–64 \n", "17 1955–56 \n", "18 1919–20 \n", "19 1953–54 \n", "20 1925–26 \n", "21 1957–58 \n", "22 1947–48 \n", "23 1943–44 \n", "24 1945–46 \n", "25 1949–50 \n", "26 1948–49 \n", "27 1907–08 \n", "28 1888–89 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alltables[2]\n", "#why is this not 3? It is because python uses 0-based indexing, which means that data, values, rows, and items \n", "#in lists, which you've seen in your coding tutorials, all start at 0. 
So the first object in a list or index\n", "#is always in position \"0\", and the second in position \"1\", and so on. In this case, to get the 3rd table in\n", "#this new little set of tables, we have to specify \"2\" rather than \"3\"." ] }, { "cell_type": "markdown", "metadata": { "id": "75Lz644ZtXib" }, "source": [ "Now we can save this table to a csv file, which we will call 'test':" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "id": "Fa1RfATEtXid" }, "outputs": [], "source": [ "alltables[2].to_csv('test.csv')" ] }, { "cell_type": "markdown", "metadata": { "id": "MEzZ-eJgtXil" }, "source": [ "Now, see if you can go back, read in this test.csv file, and have a look at the dataset. \n", "\n", "If we had more time, we would try to figure out how to rename the columns, and play around with plotting the number of times each team won, for example. (Try this at home, and see if you can do it! The help command from earlier (pd.read_csv?) should help you figure out how to rename columns... or, when in doubt, just search online for help!)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bi1cgMsBtXin" }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "qIsaJ6BvtXiv" }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "id": "kXoltZD9tXi3" }, "source": [ "## JSON files\n", "\n", "Another type of file we frequently encounter online is the JSON file (pronounced \"jason\"). JSON files allow for nested data structures, a bit like databases.\n", "\n", "JSON is (basically) the same as a collection of Python dicts (dictionaries; we haven't talked about these yet in class, but you did learn about them in your coding tutorials. As a reminder, dicts are collections of key:value pairs, which means you have a category of something (key) and values within it (values)). I'll explain this more in class. 
Bottom line: it's very easy to look up things by their key, but not by their values. So, knowing our way around these dicts, and how they are nested within one another in a JSON file, is important.\n", "\n", "Let's download such a file and store it in the same directory as your Jupyter notebook.\n", "Download https://open.data.amsterdam.nl/EtenDrinken.json .\n", "\n", "First, see what happens if you load this link in your browser. You can get a feel for the structure of the dataset, if your browser is relatively fancy.\n", "\n", "\n", "Next: we could use pandas to put the JSON file into a table (see the pd.read_json command below), but as you will see, because the data is *nested*, we still have dicts within some of the cells.\n", "\n", "**Note:** The location (often called _path_) where you stored the file is important to remember when you load the data into your notebook. If, for example, you saved the JSON file to a folder called _datasets_ that is located one folder 'above' the folder your notebook is in, you have to tell Python to go up one folder ('../'), then into the datasets folder ('datasets/'), and from there open the file: '../datasets/EtenDrinken.json'. If you stored the data file in the same folder as your notebook, you can just load it by providing the file name! \n", "\n", "**Where am I?** If you are unsure where your notebook is running from, simply use the following shell command to print the path of the folder your notebook is running in:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/fhopp/Library/Mobile Documents/com~apple~CloudDocs/FRH/Science/ASCoR/Teaching/data_journalism/21/book_dj/content\n" ] } ], "source": [ "!pwd" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "id": "c3bHxPeetXi4" }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>trcid</th><th>title</th><th>details</th><th>types</th><th>location</th><th>urls</th><th>media</th><th>dates</th><th>lastupdated</th><th>eigenschappen</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>853c93aa-4599-41ad-9d79-638902925dc9</td><td>Eetsalon van Dobben B.V.</td><td>{'de': {'language': 'de', 'title': 'Eetsalon v...</td><td>[{'type': '', 'catid': '3.2.1'}]</td><td>{'name': '', 'city': 'AMSTERDAM', 'adress': 'K...</td><td>[http://www.eetsalonvandobben.nl, http://www.f...</td><td>[{'url': 'https://media.iamsterdam.com/ndtrc/I...</td><td>{'startdate': '27-01-2011', 'enddate': ''}</td><td>2016-09-19 13:58:20</td><td>{'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat...</td></tr>\n",
" <tr><th>1</th><td>a9dca5a2-aaeb-4d11-9025-f8ab29d6e042</td><td>Eetsalon van Dobben de Pijp</td><td>{'de': {'language': 'de', 'title': 'Eetsalon v...</td><td>[{'type': '', 'catid': '3.2.1'}]</td><td>{'name': 'Eetsalon van Dobben de Pijp', 'city'...</td><td>[]</td><td>[{'url': 'https://media.iamsterdam.com/ndtrc/I...</td><td>[]</td><td>2016-04-03 10:40:22</td><td>{'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat...</td></tr>\n",
" <tr><th>2</th><td>2ed5f9de-2850-4bcd-affe-4b19b4144906</td><td>Cora Delicatessen &amp; Broodjes</td><td>{'de': {'language': 'de', 'title': 'Cora Delic...</td><td>[{'type': '', 'catid': '3.2.1'}]</td><td>{'name': 'Cora Delicatessen &amp; Broodjes', 'city...</td><td>[http://www.cora-broodjes.nl/]</td><td>[{'url': 'https://media.iamsterdam.com/ndtrc/I...</td><td>[]</td><td>2015-06-23 10:08:59</td><td>[]</td></tr>\n",
" <tr><th>3</th><td>54d3daf4-610d-4328-a657-65a05c047eb9</td><td>Jonk Haringhandel</td><td>{'de': {'language': 'de', 'title': 'Jonk Harin...</td><td>[{'type': '', 'catid': '3.2.1'}]</td><td>{'name': 'Jonk Haringhandel', 'city': 'AMSTERD...</td><td>[]</td><td>[{'url': 'https://media.iamsterdam.com/ndtrc/I...</td><td>[]</td><td>2017-01-17 17:17:36</td><td>{'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat...</td></tr>\n",
" <tr><th>4</th><td>41de3334-b0b2-43be-b226-b39b106d96f6</td><td>Vlaamsch Broodhuys</td><td>{'de': {'language': 'de', 'title': 'Vlaamsch B...</td><td>[{'type': '', 'catid': '3.2.1'}]</td><td>{'name': 'Vlaamsch Broodhuys', 'city': 'AMSTER...</td><td>[http://www.vlaamschbroodhuys.nl/index.php/gb]</td><td>[{'url': 'https://media.iamsterdam.com/ndtrc/I...</td><td>{'startdate': '08-08-2013', 'enddate': ''}</td><td>2017-01-18 14:56:34</td><td>{'42.2': {'Catid': '42.2', 'Value': {}, 'Categ...</td></tr>\n",
" <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
" <tr><th>767</th><td>07c43eb7-dc3f-4643-933e-6d8c7cfc39af</td><td>Strandpaviljoen Timboektoe</td><td>{'de': {'language': 'de', 'title': 'Strandbar ...</td><td>[{'type': '', 'catid': '3.1.5'}]</td><td>{'name': 'Strandpaviljoen Timboektoe', 'city':...</td><td>[http://www.timboektoe.org, http://www.faceboo...</td><td>[{'url': 'https://media.ndtrc.nl/Images/07/07c...</td><td>[]</td><td>2017-07-25 14:04:21</td><td>{'34.2': {'Catid': '34.2', 'Value': {}, 'Categ...</td></tr>\n",
" <tr><th>768</th><td>680a178e-0941-4db0-bb6a-120710536630</td><td>Paviljoen Zeezicht</td><td>{'de': {'language': 'de', 'title': 'Strandbar ...</td><td>[{'type': '', 'catid': '3.1.5'}]</td><td>{'name': 'Paviljoen Zeezicht', 'city': 'IJMUID...</td><td>[http://www.paviljoenzeezicht.nl]</td><td>[{'url': 'https://media.ndtrc.nl/Images/201102...</td><td>[]</td><td>2017-07-26 12:15:29</td><td>{'34.2': {'Catid': '34.2', 'Value': {}, 'Categ...</td></tr>\n",
" <tr><th>769</th><td>eaca1623-baef-4e4b-9fcb-c19de652e042</td><td>Paviljoen Nova Zembla</td><td>{'de': {'language': 'de', 'title': 'Strandbar ...</td><td>[{'type': '', 'catid': '3.1.5'}]</td><td>{'name': '', 'city': 'IJMUIDEN', 'adress': 'Ke...</td><td>[http://www.paviljoennovazembla.nl/]</td><td>[{'url': 'https://media.ndtrc.nl/Images/201202...</td><td>[]</td><td>2017-07-26 12:14:43</td><td>{'10.11': {'Catid': '10.11', 'Value': 'True', ...</td></tr>\n",
" <tr><th>770</th><td>b75d4392-ae17-4820-a62a-1ead2a604ab2</td><td>Beach Inn</td><td>{'de': {'language': 'de', 'title': 'Beach Inn'...</td><td>[{'type': '', 'catid': '3.1.5'}]</td><td>{'name': 'Beach Inn', 'city': 'IJMUIDEN', 'adr...</td><td>[http://www.beachinn-events.nl/, http://www.fa...</td><td>[{'url': 'https://media.ndtrc.nl/Images/201303...</td><td>[]</td><td>2017-07-26 12:13:07</td><td>{'34.13': {'Catid': '34.13', 'Value': 'true'},...</td></tr>\n",
" <tr><th>771</th><td>7e7be9d0-1391-4875-b284-ea93ea87e0dd</td><td>Paviljoen Noordzee</td><td>{'de': {'language': 'de', 'title': 'Strandbar ...</td><td>[{'type': '', 'catid': '3.1.5'}]</td><td>{'name': '', 'city': 'IJMUIDEN', 'adress': 'Ke...</td><td>[http://www.paviljoennoordzee.nl]</td><td>[{'url': 'https://media.ndtrc.nl/Images/201101...</td><td>[]</td><td>2017-07-26 12:13:56</td><td>{'34.1': {'Catid': '34.1', 'Value': {}, 'Categ...</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>772 rows × 10 columns</p>\n",
"</div>
" ], "text/plain": [ " trcid title \\\n", "0 853c93aa-4599-41ad-9d79-638902925dc9 Eetsalon van Dobben B.V. \n", "1 a9dca5a2-aaeb-4d11-9025-f8ab29d6e042 Eetsalon van Dobben de Pijp \n", "2 2ed5f9de-2850-4bcd-affe-4b19b4144906 Cora Delicatessen & Broodjes \n", "3 54d3daf4-610d-4328-a657-65a05c047eb9 Jonk Haringhandel \n", "4 41de3334-b0b2-43be-b226-b39b106d96f6 Vlaamsch Broodhuys \n", ".. ... ... \n", "767 07c43eb7-dc3f-4643-933e-6d8c7cfc39af Strandpaviljoen Timboektoe \n", "768 680a178e-0941-4db0-bb6a-120710536630 Paviljoen Zeezicht \n", "769 eaca1623-baef-4e4b-9fcb-c19de652e042 Paviljoen Nova Zembla \n", "770 b75d4392-ae17-4820-a62a-1ead2a604ab2 Beach Inn \n", "771 7e7be9d0-1391-4875-b284-ea93ea87e0dd Paviljoen Noordzee \n", "\n", " details \\\n", "0 {'de': {'language': 'de', 'title': 'Eetsalon v... \n", "1 {'de': {'language': 'de', 'title': 'Eetsalon v... \n", "2 {'de': {'language': 'de', 'title': 'Cora Delic... \n", "3 {'de': {'language': 'de', 'title': 'Jonk Harin... \n", "4 {'de': {'language': 'de', 'title': 'Vlaamsch B... \n", ".. ... \n", "767 {'de': {'language': 'de', 'title': 'Strandbar ... \n", "768 {'de': {'language': 'de', 'title': 'Strandbar ... \n", "769 {'de': {'language': 'de', 'title': 'Strandbar ... \n", "770 {'de': {'language': 'de', 'title': 'Beach Inn'... \n", "771 {'de': {'language': 'de', 'title': 'Strandbar ... \n", "\n", " types \\\n", "0 [{'type': '', 'catid': '3.2.1'}] \n", "1 [{'type': '', 'catid': '3.2.1'}] \n", "2 [{'type': '', 'catid': '3.2.1'}] \n", "3 [{'type': '', 'catid': '3.2.1'}] \n", "4 [{'type': '', 'catid': '3.2.1'}] \n", ".. ... \n", "767 [{'type': '', 'catid': '3.1.5'}] \n", "768 [{'type': '', 'catid': '3.1.5'}] \n", "769 [{'type': '', 'catid': '3.1.5'}] \n", "770 [{'type': '', 'catid': '3.1.5'}] \n", "771 [{'type': '', 'catid': '3.1.5'}] \n", "\n", " location \\\n", "0 {'name': '', 'city': 'AMSTERDAM', 'adress': 'K... \n", "1 {'name': 'Eetsalon van Dobben de Pijp', 'city'... 
\n", "2 {'name': 'Cora Delicatessen & Broodjes', 'city... \n", "3 {'name': 'Jonk Haringhandel', 'city': 'AMSTERD... \n", "4 {'name': 'Vlaamsch Broodhuys', 'city': 'AMSTER... \n", ".. ... \n", "767 {'name': 'Strandpaviljoen Timboektoe', 'city':... \n", "768 {'name': 'Paviljoen Zeezicht', 'city': 'IJMUID... \n", "769 {'name': '', 'city': 'IJMUIDEN', 'adress': 'Ke... \n", "770 {'name': 'Beach Inn', 'city': 'IJMUIDEN', 'adr... \n", "771 {'name': '', 'city': 'IJMUIDEN', 'adress': 'Ke... \n", "\n", " urls \\\n", "0 [http://www.eetsalonvandobben.nl, http://www.f... \n", "1 [] \n", "2 [http://www.cora-broodjes.nl/] \n", "3 [] \n", "4 [http://www.vlaamschbroodhuys.nl/index.php/gb] \n", ".. ... \n", "767 [http://www.timboektoe.org, http://www.faceboo... \n", "768 [http://www.paviljoenzeezicht.nl] \n", "769 [http://www.paviljoennovazembla.nl/] \n", "770 [http://www.beachinn-events.nl/, http://www.fa... \n", "771 [http://www.paviljoennoordzee.nl] \n", "\n", " media \\\n", "0 [{'url': 'https://media.iamsterdam.com/ndtrc/I... \n", "1 [{'url': 'https://media.iamsterdam.com/ndtrc/I... \n", "2 [{'url': 'https://media.iamsterdam.com/ndtrc/I... \n", "3 [{'url': 'https://media.iamsterdam.com/ndtrc/I... \n", "4 [{'url': 'https://media.iamsterdam.com/ndtrc/I... \n", ".. ... \n", "767 [{'url': 'https://media.ndtrc.nl/Images/07/07c... \n", "768 [{'url': 'https://media.ndtrc.nl/Images/201102... \n", "769 [{'url': 'https://media.ndtrc.nl/Images/201202... \n", "770 [{'url': 'https://media.ndtrc.nl/Images/201303... \n", "771 [{'url': 'https://media.ndtrc.nl/Images/201101... \n", "\n", " dates lastupdated \\\n", "0 {'startdate': '27-01-2011', 'enddate': ''} 2016-09-19 13:58:20 \n", "1 [] 2016-04-03 10:40:22 \n", "2 [] 2015-06-23 10:08:59 \n", "3 [] 2017-01-17 17:17:36 \n", "4 {'startdate': '08-08-2013', 'enddate': ''} 2017-01-18 14:56:34 \n", ".. ... ... 
\n", "767 [] 2017-07-25 14:04:21 \n", "768 [] 2017-07-26 12:15:29 \n", "769 [] 2017-07-26 12:14:43 \n", "770 [] 2017-07-26 12:13:07 \n", "771 [] 2017-07-26 12:13:56 \n", "\n", " eigenschappen \n", "0 {'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat... \n", "1 {'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat... \n", "2 [] \n", "3 {'3.6': {'Catid': '3.6', 'Value': 'True', 'Cat... \n", "4 {'42.2': {'Catid': '42.2', 'Value': {}, 'Categ... \n", ".. ... \n", "767 {'34.2': {'Catid': '34.2', 'Value': {}, 'Categ... \n", "768 {'34.2': {'Catid': '34.2', 'Value': {}, 'Categ... \n", "769 {'10.11': {'Catid': '10.11', 'Value': 'True', ... \n", "770 {'34.13': {'Catid': '34.13', 'Value': 'true'},... \n", "771 {'34.1': {'Catid': '34.1', 'Value': {}, 'Categ... \n", "\n", "[772 rows x 10 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_json('EtenDrinken.json')\n", "# use pd.read_json('../datasets/EtenDrinken.json') instead if you stored the file in a 'datasets' folder one level up" ] }, { "cell_type": "markdown", "metadata": { "id": "dfWU9hn2tXi9" }, "source": [ "Sometimes, pandas can be an easy solution for dealing with JSON files, but in this case, it doesn't seem to be the best choice. \n", "\n", "So, let's read the JSON file into a list of dictionaries instead, since most of these columns seem to include dictionaries. We're going to call this new list of dictionaries \"eat\", because we know from the site that it has something to do with eating and drinking." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "id": "dTiSGTPQtXjB" }, "outputs": [], "source": [ "# open the file, parse its JSON content, and close the file again\n", "with open('EtenDrinken.json') as f:\n", "    eat = json.load(f)\n", "# note: this command does not produce any output,\n", "# but the data is now in a format we can more easily explore in Python." ] }, { "cell_type": "markdown", "metadata": { "id": "BnY_hANstXjH" }, "source": [ "### Playing around with nested JSON data and extracting meaningful information\n", "\n", "NOTE!! 
You don't need to be able to do all of this already, but it's mostly important that you try to understand the logic behind these various commands. We'll review a lot of this later on when we get to analysis, anyway." ] }, { "cell_type": "markdown", "metadata": { "id": "uxWbv0iqtXjI" }, "source": [ "Let's check what `eat` is and what is in there" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "id": "ceqfHlVUtXjK" }, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(eat)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "id": "zXxkjHZmtXjQ" }, "outputs": [ { "data": { "text/plain": [ "772" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(eat)" ] }, { "cell_type": "markdown", "metadata": { "id": "CKoeKasjtXjX" }, "source": [ "Maybe let's just look at the *first* restaurant" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "id": "sWBM9F4OtXjX" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'dates': {'enddate': '', 'startdate': '27-01-2011'},\n", " 'details': {'de': {'calendarsummary': 'Mo -Mi : 10:00 - 21:00 Uhr\\n'\n", " 'Do : 10:00 - 01:00 Uhr\\n'\n", " 'Fr , Sa : 10:00 - 02:00 Uhr\\n'\n", " 'So : 10:30 - 20:00 Uhr.',\n", " 'language': 'de',\n", " 'longdescription': '',\n", " 'shortdescription': '',\n", " 'title': 'Eetsalon van Dobben B.V.'},\n", " 'en': {'calendarsummary': 'Mo -We : 10:00 - 21:00 hour\\n'\n", " 'Th : 10:00 - 01:00 hour\\n'\n", " 'Fr , Sa : 10:00 - 02:00 hour\\n'\n", " 'Su: 10:30 - 20:00 hour.',\n", " 'language': 'en',\n", " 'longdescription': '',\n", " 'shortdescription': 'This typical Amsterdam deluxe '\n", " 'lunchroom is the meeting place of '\n", " 'artists, soccer players and many '\n", " 'other locals. 
Its Van Dobben '\n", " 'croquette is famed far beyond the '\n", " 'city limits.',\n", " 'title': 'Eetsalon van Dobben B.V.'},\n", " 'es': {'calendarsummary': 'Lu -Mi : 10:00 - 21:00 Hora\\n'\n", " 'Ju : 10:00 - 01:00 Hora\\n'\n", " 'Vi , Sa : 10:00 - 02:00 Hora\\n'\n", " 'Do : 10:30 - 20:00 Hora.',\n", " 'language': 'es',\n", " 'longdescription': '',\n", " 'shortdescription': '',\n", " 'title': 'Eetsalon van Dobben B.V.'},\n", " 'fr': {'calendarsummary': 'Lu -Me : 10:00 - 21:00 heure\\n'\n", " 'Je : 10:00 - 01:00 heure\\n'\n", " 'Ve , Sa : 10:00 - 02:00 heure\\n'\n", " 'Di : 10:30 - 20:00 heure.',\n", " 'language': 'fr',\n", " 'longdescription': '',\n", " 'shortdescription': '',\n", " 'title': 'Eetsalon van Dobben B.V.'},\n", " 'it': {'calendarsummary': 'Lu -Me : 10:00 - 21:00 Ora\\n'\n", " 'Gi : 10:00 - 01:00 Ora\\n'\n", " 'Ve , Sa : 10:00 - 02:00 Ora\\n'\n", " 'Do : 10:30 - 20:00 Ora.',\n", " 'language': 'it',\n", " 'longdescription': '',\n", " 'shortdescription': '',\n", " 'title': 'Eetsalon van Dobben B.V.'},\n", " 'nl': {'calendarsummary': 'Ma-wo: 10:00 - 21:00 uur\\n'\n", " 'do: 10:00 - 01:00 uur\\n'\n", " 'vr, za: 10:00 - 02:00 uur\\n'\n", " 'zo: 10:30 - 20:00 uur.',\n", " 'language': 'nl',\n", " 'longdescription': '',\n", " 'shortdescription': 'Deze typisch Amsterdamse luxe '\n", " 'lunchroom is een pleisterplaats voor '\n", " 'artiesten, voetballers en vele andere '\n", " 'Amsterdammers. 
Tot ver buiten de '\n", " 'stadsgrenzen is de Van Dobben croquet '\n", " 'een begrip.',\n", " 'title': 'Eetsalon van Dobben B.V.'}},\n", " 'eigenschappen': {'3.6': {'Category': 'Lid VVV',\n", " 'CategoryArea': 'Lidmaatschappen',\n", " 'Catid': '3.6',\n", " 'Value': 'True'},\n", " '34.1': {'Category': 'Type eetgelegenheid',\n", " 'CategoryArea': 'Eetgelegenheid',\n", " 'Catid': '34.1',\n", " 'Value': {}},\n", " '34.2': {'Category': 'Nationaliteit keuken',\n", " 'CategoryArea': 'Eetgelegenheid',\n", " 'Catid': '34.2',\n", " 'Value': {}}},\n", " 'lastupdated': '2016-09-19 13:58:20',\n", " 'location': {'adress': 'Korte Reguliersdwstr 5-9',\n", " 'city': 'AMSTERDAM',\n", " 'latitude': '52,3660560',\n", " 'longitude': '4,8953060',\n", " 'name': '',\n", " 'zipcode': '1017 BH'},\n", " 'media': [{'main': 'true',\n", " 'url': 'https://media.iamsterdam.com/ndtrc/Images/20110127/2e2b84ef-227d-42a9-ac47-992a26607175.jpg'}],\n", " 'title': 'Eetsalon van Dobben B.V.',\n", " 'trcid': '853c93aa-4599-41ad-9d79-638902925dc9',\n", " 'types': [{'catid': '3.2.1', 'type': ''}],\n", " 'urls': ['http://www.eetsalonvandobben.nl',\n", " 'http://www.facebook.com/Eetsalon-Van-Dobben-322959054713/']}\n" ] } ], "source": [ "pprint(eat[0])\n", "#pprint stands for 'pretty print'--it's not terribly pretty, \n", "#but nicer than if you do just a plain old print (try it out!)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "id": "LwJ1W9g7tXjd" }, "outputs": [], "source": [ "#use a normal print command here to see the added value of pprint.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "1WNqipLttXjl" }, "source": [ "We can now directly access the elements we are interested in:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "id": "kz-1ORvKtXjm" }, "outputs": [ { "data": { "text/plain": [ "'Eetsalon van Dobben B.V.'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eat[0]['details']['en']['title']" ] }, { "cell_type": 
"code", "execution_count": 22, "metadata": { "id": "gRj5WPOotXjr" }, "outputs": [ { "data": { "text/plain": [ "{'name': '',\n", " 'city': 'AMSTERDAM',\n", " 'adress': 'Korte Reguliersdwstr 5-9',\n", " 'zipcode': '1017 BH',\n", " 'latitude': '52,3660560',\n", " 'longitude': '4,8953060'}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eat[0]['location']" ] }, { "cell_type": "markdown", "metadata": { "id": "oIOm9vTMtXjx" }, "source": [ "We see that location is itself a dict with a number of key:value pairs. One of these is the zipcode. So if we want specifically the zipcode for the first restaurant, we have to enter both levels, essentially telling python to call up the first dict, and then look within that one for the second." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "id": "h5XGC1pttXj0" }, "outputs": [ { "data": { "text/plain": [ "'1017 BH'" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eat[0]['location']['zipcode']" ] }, { "cell_type": "markdown", "metadata": { "id": "cQfbqL9ctXj5" }, "source": [ "Let's say I want to figure out where the most restaurants are, by area, within Amsterdam. But I don't want to do this one-by-one.\n", "\n", "Once we know what we want, we can replace our specific restaurant `eat[0]` by a generic `restaurant` within a *loop*." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "id": "FZoIrzl1tXj6" }, "outputs": [], "source": [ "# Let's get all zipcodes.\n", "# First, we make an empty list.\n", "zipcodes = []\n", "\n", "# Then we loop over the restaurants, pull out each one's zipcode, and add it to the list with the list METHOD \"append\".\n", "for restaurant in eat:\n", "    zipcodes.append(restaurant['location']['zipcode'])" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "id": "j3JMw-d5tXj9" }, "outputs": [ { "data": { "text/plain": [ "772" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(zipcodes)" ] }, { "cell_type": "markdown", "metadata": { "id": "eIz8Gf-VtXkF" }, "source": [ "What do you think the purpose of this previous step is?\n", "\n", "\n", "Next, let's use `Counter` (which we imported above) to find the 20 most frequent zipcodes in this dataset. You could ask for 20, or 5, or 10, or 100 - whatever you want." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "id": "KU9NT5hNtXkG" }, "outputs": [ { "data": { "text/plain": [ "[('1072 LH', 6),\n", " ('2042 AD', 6),\n", " ('2051 EC', 6),\n", " ('1017 VN', 5),\n", " ('1017 BM', 5),\n", " ('1012 JS', 4),\n", " ('1017 CV', 4),\n", " ('1071 AA', 4),\n", " ('1017 DA', 4),\n", " ('2041 JA', 4),\n", " ('1976 GA', 4),\n", " ('1016 GB', 3),\n", " ('1072 CV', 3),\n", " ('1012 SJ', 3),\n", " ('1017 PX', 3),\n", " ('1017 NG', 3),\n", " ('1013 ES', 3),\n", " ('1012 CP', 3),\n", " ('1073 BM', 3),\n", " ('1092 BB', 3)]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Counter(zipcodes).most_common(20)" ] }, { "cell_type": "markdown", "metadata": { "id": "Q4JpFjRBtXkJ" }, "source": [ "For my little story, however, this data is too specific - the letters at the end of each zipcode make for too detailed a story. There is a way to cut off the letters and just use the four digits of each zipcode. 
Again, don't worry about knowing all this code; focus on understanding the logic, and think about how (eventually) you might want to apply it to your own datasets." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "id": "Q014vehYtXkK" }, "outputs": [ { "data": { "text/plain": [ "[('1017', 87),\n", " ('1012', 84),\n", " ('1072', 54),\n", " ('1016', 37),\n", " ('1015', 34),\n", " ('1013', 31),\n", " ('1071', 25),\n", " ('1018', 21),\n", " ('1073', 19),\n", " ('1053', 19),\n", " ('1091', 17),\n", " ('1011', 17),\n", " ('1054', 16),\n", " ('1052', 15),\n", " ('2011', 11),\n", " ('1075', 11),\n", " ('1074', 11),\n", " ('2042', 11),\n", " ('1092', 9),\n", " ('1078', 8)]" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zipcodes_without_letters = [z[0:4] for z in zipcodes]\n", "Counter(zipcodes_without_letters).most_common(20)" ] }, { "cell_type": "markdown", "metadata": { "id": "RqOM5TSgtXkS" }, "source": [ "## APIs\n", "\n", "Lastly, we will check out working with a JSON-based API. Some frequently used APIs (e.g., the Twitter API) have their own Python *wrapper*, which means that you can do something like `import twitter` and have some user-friendly commands. Also, many APIs require authentication (i.e., something like a username and a password).\n", "\n", "We do not want to require all of you to get such an account for the sole purpose of this meeting. We will therefore work with a public API provided by Statistics Netherlands (CBS): https://opendata.cbs.nl/.\n", "\n", "First, we go to https://opendata.cbs.nl/statline/portal.html?_la=en&_catalog=CBS and select a dataset. This kind of website is a great place to explore some potential datasets for your projects. If you explore a bit, you'll see there are a ton of datasets and a ton of APIs, as well as raw JSON files for you to download and work with. 
Take this illustration as one way to use an API when the raw data is not available for download.\n", "\n", "If there is a specific URL we want to access (like this one we have chosen ahead of time), we can do so as follows:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "id": "YwhSr1patXkS" }, "outputs": [], "source": [ "data = requests.get('https://opendata.cbs.nl/ODataApi/odata/37556eng/TypedDataSet').json()" ] }, { "cell_type": "markdown", "metadata": { "id": "WiHXtoPxtXkX" }, "source": [ "Let's try some things out to make sense of this data:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "id": "_2yp-obJtXkY" }, "outputs": [ { "data": { "text/html": [ "
121 rows × 2 columns

\n", "
" ], "text/plain": [ " odata.metadata \\\n", "0 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "1 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "2 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "3 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "4 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", ".. ... \n", "116 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "117 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "118 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "119 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "120 https://opendata.cbs.nl/ODataApi/OData/37556en... \n", "\n", " value \n", "0 {'ID': 0, 'Periods': '1899JJ00', 'TotalPopulat... \n", "1 {'ID': 1, 'Periods': '1900JJ00', 'TotalPopulat... \n", "2 {'ID': 2, 'Periods': '1901JJ00', 'TotalPopulat... \n", "3 {'ID': 3, 'Periods': '1902JJ00', 'TotalPopulat... \n", "4 {'ID': 4, 'Periods': '1903JJ00', 'TotalPopulat... \n", ".. ... \n", "116 {'ID': 116, 'Periods': '2015JJ00', 'TotalPopul... \n", "117 {'ID': 117, 'Periods': '2016JJ00', 'TotalPopul... \n", "118 {'ID': 118, 'Periods': '2017JJ00', 'TotalPopul... \n", "119 {'ID': 119, 'Periods': '2018JJ00', 'TotalPopul... \n", "120 {'ID': 120, 'Periods': '2019JJ00', 'TotalPopul... \n", "\n", "[121 rows x 2 columns]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(data)" ] }, { "cell_type": "markdown", "metadata": { "id": "40b_jGAltXkc" }, "source": [ "This shows us that there are 121 rows and 2 columns. The first column seems only to be about metadata and URLs, which isn't very interesting. The second column looks like a series of dicts that might be more interesting for us. 
Let's confirm what these two columns are:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "GqWf0lvhtXkc" }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['odata.metadata', 'value'])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.keys()" ] }, { "cell_type": "markdown", "metadata": { "id": "40rHZju8tXkf" }, "source": [ "Now let's focus only on the 'value' column, and make a new dataframe out of that." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "ME5WLmsXtXkg" }, "outputs": [], "source": [ "df = pd.DataFrame(data['value'])" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "id": "Z1TykF77tXkm" }, "outputs": [ { "data": { "text/html": [ "
121 rows × 190 columns

\n", "
" ], "text/plain": [ " ID Periods TotalPopulation_1 Males_2 Females_3 TotalPopulation_4 \\\n", "0 0 1899JJ00 NaN NaN NaN NaN \n", "1 1 1900JJ00 5104.0 2521.0 2583.0 5104.0 \n", "2 2 1901JJ00 5163.0 2550.0 2613.0 5163.0 \n", "3 3 1902JJ00 5233.0 2584.0 2649.0 5233.0 \n", "4 4 1903JJ00 5307.0 2622.0 2685.0 5307.0 \n", ".. ... ... ... ... ... ... \n", "116 116 2015JJ00 16901.0 8373.0 8528.0 16901.0 \n", "117 117 2016JJ00 16979.0 8417.0 8562.0 16979.0 \n", "118 118 2017JJ00 17082.0 8475.0 8606.0 17082.0 \n", "119 119 2018JJ00 17181.0 8527.0 8654.0 17181.0 \n", "120 120 2019JJ00 17282.0 8581.0 8701.0 17282.0 \n", "\n", " YoungerThan20Years_5 k_20To44Years_6 k_45To64Years_7 k_65To79Years_8 \\\n", "0 NaN NaN NaN NaN \n", "1 2264.0 1732.0 802.0 272.0 \n", "2 2286.0 1762.0 806.0 274.0 \n", "3 2314.0 1791.0 812.0 278.0 \n", "4 2344.0 1824.0 819.0 281.0 \n", ".. ... ... ... ... \n", "116 3828.0 5311.0 4754.0 2273.0 \n", "117 3818.0 5284.0 4792.0 2337.0 \n", "118 3817.0 5281.0 4824.0 2395.0 \n", "119 3811.0 5291.0 4840.0 2460.0 \n", "120 3792.0 5335.0 4841.0 2515.0 \n", "\n", " ... Divorces_179 DivorcesRelative_180 AverageAgeMales_181 \\\n", "0 ... NaN NaN NaN \n", "1 ... 0.6 0.7 NaN \n", "2 ... 0.6 0.7 NaN \n", "3 ... 0.6 0.7 NaN \n", "4 ... 0.6 0.7 NaN \n", ".. ... ... ... ... \n", "116 ... 34.2 10.1 46.9 \n", "117 ... 33.4 9.9 47.3 \n", "118 ... 32.8 9.7 47.4 \n", "119 ... 30.7 9.2 47.6 \n", "120 ... NaN NaN NaN \n", "\n", " AverageAgeFemales_182 AverageDurationOfTheMarriage_183 \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", ".. ... ... \n", "116 43.7 14.8 \n", "117 44.0 15.0 \n", "118 44.2 15.1 \n", "119 44.3 15.0 \n", "120 NaN NaN \n", "\n", " DueToDeathHusbandRelative_184 DueToDeathWifeRelative_185 \\\n", "0 NaN NaN \n", "1 16.2 13.4 \n", "2 14.8 12.1 \n", "3 14.6 11.8 \n", "4 14.1 11.4 \n", ".. ... ... 
\n", "116 11.6 5.7 \n", "117 11.7 5.8 \n", "118 11.7 5.9 \n", "119 11.9 6.1 \n", "120 NaN NaN \n", "\n", " BalanceOfChangesOfNationality_186 Naturalization_187 \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", ".. ... ... \n", "116 28.0 22.0 \n", "117 28.0 22.0 \n", "118 28.0 20.0 \n", "119 28.0 21.0 \n", "120 NaN NaN \n", "\n", " NaturalizationsRelative_188 \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", ".. ... \n", "116 25.3 \n", "117 23.0 \n", "118 19.5 \n", "119 19.2 \n", "120 NaN \n", "\n", "[121 rows x 190 columns]" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "markdown", "metadata": { "id": "M7jFB0FwtXkq" }, "source": [ "We can see that this now works as a 'simple' dataframe: there are rows and columns, and it doesn't look like there is any more nested information in here.\n", "\n", "But there are 190 columns! How can we know what's in this dataset, then? We can create a list using the '.columns' property associated with a dataframe. 
This allows us to transform the index into a list to see everything in it:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "id": "_85GejhjtXkq" }, "outputs": [ { "data": { "text/plain": [ "['ID',\n", " 'Periods',\n", " 'TotalPopulation_1',\n", " 'Males_2',\n", " 'Females_3',\n", " 'TotalPopulation_4',\n", " 'YoungerThan20Years_5',\n", " 'k_20To44Years_6',\n", " 'k_45To64Years_7',\n", " 'k_65To79Years_8',\n", " 'k_80YearsOrOlder_9',\n", " 'GreenPressure_10',\n", " 'GreyPressure_11',\n", " 'TotalPopulation_12',\n", " 'NeverMarried_13',\n", " 'Married_14',\n", " 'Widowed_15',\n", " 'Divorced_16',\n", " 'TotalPopulation_17',\n", " 'NorthNetherlands_18',\n", " 'EastNetherlands_19',\n", " 'WestNetherlands_20',\n", " 'SouthNetherlands_21',\n", " 'TotalPopulation_22',\n", " 'LessThan5000Inhabitants_23',\n", " 'k_5000To19999Inhabitants_24',\n", " 'k_20000To49999Inhabitants_25',\n", " 'k_50000To99999Inhabitants_26',\n", " 'k_100000InhabitantsOrMore_27',\n", " 'TotalNumberOfMunicipalities_28',\n", " 'LessThan5000Inhabitants_29',\n", " 'k_5000To19999Inhabitants_30',\n", " 'k_20000To49999Inhabitants_31',\n", " 'k_50000To99999Inhabitants_32',\n", " 'k_100000InhabitantsOrMore_33',\n", " 'TotalForeignNationalities_34',\n", " 'American_35',\n", " 'Belgian_36',\n", " 'British_37',\n", " 'German_38',\n", " 'Italian_39',\n", " 'Moroccan_40',\n", " 'Spanish_41',\n", " 'Turkish_42',\n", " 'FormerYugoslavian_43',\n", " 'PersonsWithSurinameseBackground_44',\n", " 'PersonsWithAntilleanBackground_45',\n", " 'TotalPrivateHouseholds_46',\n", " 'MalesAndFemales_47',\n", " 'Males_48',\n", " 'Females_49',\n", " 'MultiPersonHouseholds_50',\n", " 'AverageHouseholdsize_51',\n", " 'TotalPersonsInPrivateHouseholds_52',\n", " 'ChildrenInPrivateHouseholds_53',\n", " 'LiveBornChildren_54',\n", " 'Deaths_55',\n", " 'NaturalIncrease_56',\n", " 'Immigration_57',\n", " 'EmigrationIncludingAdministrativeC_58',\n", " 'NetMigration_59',\n", " 'TotalGrowth_60',\n", " 
'TotalGrowthRelative_61',\n", " 'LiveBornChildren_62',\n", " 'LiveBornChildrenRelative_63',\n", " 'SexRatio_64',\n", " 'AverageNumberOfChildrenPerFemale_65',\n", " 'TotalLiveBornChildren_66',\n", " 'YoungerThan20Years_67',\n", " 'k_20To24Years_68',\n", " 'k_25To29Years_69',\n", " 'k_30YearsOrOlder_70',\n", " 'k_1stChild_71',\n", " 'k_2ndChild_72',\n", " 'k_3rdChild_73',\n", " 'k_4thAndSubsequentChildren_74',\n", " 'LiveBornChildrenMotherNotMarried_75',\n", " 'Deaths_76',\n", " 'DeathsRelative_77',\n", " 'DeathsSexRatio_78',\n", " 'LifeExpectancyAtBirthMale_79',\n", " 'LifeExpectancyAtBirthFemale_80',\n", " 'k_28WeeksOrMoreRelative_81',\n", " 'k_24WeeksOrMoreRelative_82',\n", " 'PerinatalMortality24_83',\n", " 'PerinatalMortality24Relative_84',\n", " 'PerinatalMortality28_85',\n", " 'PerinatalMortality28Relative_86',\n", " 'Deaths4WeeksAfterBirth_87',\n", " 'Deaths4WeeksAfterBirthRelative_88',\n", " 'Deaths1YearAfterBirth_89',\n", " 'Deaths1YearAfterBirthRelative_90',\n", " 'k_1To4Years_91',\n", " 'k_5To14Years_92',\n", " 'k_15To44Years_93',\n", " 'k_45To64Years_94',\n", " 'k_65To79Years_95',\n", " 'k_80YearsOrOlder_96',\n", " 'PersonsMovedWithinMunicipalities_97',\n", " 'TotalPersons_98',\n", " 'TotalPersonsRelative_99',\n", " 'WithinTheSameProvinceRelative_100',\n", " 'FamiliesUntil2010Relative_101',\n", " 'TotalImmigration_102',\n", " 'Dutch_103',\n", " 'TotalNonDutch_104',\n", " 'EuropeanUnionExcludingDutch_105',\n", " 'Moroccan_106',\n", " 'Turkish_107',\n", " 'TotalEmigrationIncludingAdministra_108',\n", " 'Dutch_109',\n", " 'TotalNonDutch_110',\n", " 'EuropeanUnionExcludingDutch_111',\n", " 'Moroccan_112',\n", " 'Turkish_113',\n", " 'TotalEmigrationExcludingAdministra_114',\n", " 'Dutch_115',\n", " 'TotalNonDutch_116',\n", " 'EuropeanUnionExcludingDutch_117',\n", " 'Moroccan_118',\n", " 'Turkish_119',\n", " 'TotalImmigration_120',\n", " 'TheNetherlands_121',\n", " 'EuropeanUnionExcludingTheNetherl_122',\n", " 'Indonesia_123',\n", " 
'SurinameAndTheNetherlandsAntilles_124',\n", " 'Suriname_125',\n", " 'TheFormerNetherlandsAntilles_126',\n", " 'Morocco_127',\n", " 'Turkey_128',\n", " 'SpecificEmigrationCountries_129',\n", " 'TotalEmigrationIncludingAdministra_130',\n", " 'TheNetherlands_131',\n", " 'EuropeanUnionExcludingTheNetherl_132',\n", " 'Indonesia_133',\n", " 'SurinameAndTheNetherlandsAntilles_134',\n", " 'Suriname_135',\n", " 'TheFormerNetherlandsAntilles_136',\n", " 'Morocco_137',\n", " 'Turkey_138',\n", " 'SpecificEmigrationCountries_139',\n", " 'TotalEmigrationExcludingAdministra_140',\n", " 'TheNetherlands_141',\n", " 'EuropeanUnionExcludingTheNetherl_142',\n", " 'Indonesia_143',\n", " 'SurinameAndTheNetherlandsAntilles_144',\n", " 'Suriname_145',\n", " 'TheFormerNetherlandsAntilles_146',\n", " 'Morocco_147',\n", " 'Turkey_148',\n", " 'SpecificEmigrationCountries_149',\n", " 'TotalImmigration_150',\n", " 'EuropeanUnionExcludingTheNetherl_151',\n", " 'IndonesiaSurinameTheNetherlan_152',\n", " 'SurinameAndTheNetherlandsAntilles_153',\n", " 'Indonesia_154',\n", " 'Suriname_155',\n", " 'TheFormerNetherlandsAntilles_156',\n", " 'Morocco_157',\n", " 'Turkey_158',\n", " 'SpecificEmigrationCountries_159',\n", " 'TotalEmigrationExcludingAdministra_160',\n", " 'EuropeanUnionExcludingTheNetherl_161',\n", " 'IndonesiaSurinameTheNetherlan_162',\n", " 'SurinameAndTheNetherlandsAntilles_163',\n", " 'Indonesia_164',\n", " 'Suriname_165',\n", " 'TheFormerNetherlandsAntilles_166',\n", " 'Morocco_167',\n", " 'Turkey_168',\n", " 'SpecificEmigrationCountries_169',\n", " 'Marriages_170',\n", " 'MarriagesPer1000Inhabitants_171',\n", " 'MarriagesPer1000UnmarriedMen_172',\n", " 'k_1stMarriageForBothPartnersRelative_173',\n", " 'AverageAgeMarryingMales_174',\n", " 'AverageAgeMarryingFemales_175',\n", " 'MarriageDissolutions_176',\n", " 'MarriageDissolutionsPer1000Inhab_177',\n", " 'MarriageDissolutionsPer1000Marri_178',\n", " 'Divorces_179',\n", " 'DivorcesRelative_180',\n", " 'AverageAgeMales_181',\n", " 
'AverageAgeFemales_182',\n", " 'AverageDurationOfTheMarriage_183',\n", " 'DueToDeathHusbandRelative_184',\n", " 'DueToDeathWifeRelative_185',\n", " 'BalanceOfChangesOfNationality_186',\n", " 'Naturalization_187',\n", " 'NaturalizationsRelative_188']" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(df.columns)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "id": "87uvZlmwtXks" }, "outputs": [ { "data": { "text/plain": [ "Index(['ID', 'Periods', 'TotalPopulation_1', 'Males_2', 'Females_3',\n", " 'TotalPopulation_4', 'YoungerThan20Years_5', 'k_20To44Years_6',\n", " 'k_45To64Years_7', 'k_65To79Years_8',\n", " ...\n", " 'Divorces_179', 'DivorcesRelative_180', 'AverageAgeMales_181',\n", " 'AverageAgeFemales_182', 'AverageDurationOfTheMarriage_183',\n", " 'DueToDeathHusbandRelative_184', 'DueToDeathWifeRelative_185',\n", " 'BalanceOfChangesOfNationality_186', 'Naturalization_187',\n", " 'NaturalizationsRelative_188'],\n", " dtype='object', length=190)" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Two other ways to get information ABOUT the columns are shown here, but both abbreviate the list of columns, so we can't read all of it.\n", "df.columns\n", "df.keys()" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "id": "KXVHK9CttXky" }, "outputs": [ { "data": { "text/plain": [ "0 1899JJ00\n", "1 1900JJ00\n", "2 1901JJ00\n", "3 1902JJ00\n", "4 1903JJ00\n", " ... \n", "116 2015JJ00\n", "117 2016JJ00\n", "118 2017JJ00\n", "119 2018JJ00\n", "120 2019JJ00\n", "Name: Periods, Length: 121, dtype: object" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# So let's choose one specific column, 'Periods', and find out more about it.\n", "# What do you think this represents? 
What would we need to do to make sense of this/make it useful?\n", "df['Periods']" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "id": "s_IQvus_tXlN" }, "outputs": [], "source": [ "# It would be really nice if our row numbers (the 'index') were not just numbers from 0 to 120, but instead\n", "# corresponded to the value of 'Periods'. But we need to clean up 'Periods' to get just the first four characters\n", "# and to turn those from string (text) values into integers (numbers). Here is the command - again, focus on\n", "# the logic, not the complexity of it. '.map' is a method, 'lambda' defines a small function, and 'x' is an arbitrary label.\n", "df.index = df['Periods'].map(lambda x: int(x[:4]))" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "id": "Hzg99voJtXlR" }, "outputs": [ { "data": { "text/html": [ "
121 rows × 190 columns

\n", "
" ], "text/plain": [ " ID Periods TotalPopulation_1 Males_2 Females_3 \\\n", "Periods \n", "1899 0 1899JJ00 NaN NaN NaN \n", "1900 1 1900JJ00 5104.0 2521.0 2583.0 \n", "1901 2 1901JJ00 5163.0 2550.0 2613.0 \n", "1902 3 1902JJ00 5233.0 2584.0 2649.0 \n", "1903 4 1903JJ00 5307.0 2622.0 2685.0 \n", "... ... ... ... ... ... \n", "2015 116 2015JJ00 16901.0 8373.0 8528.0 \n", "2016 117 2016JJ00 16979.0 8417.0 8562.0 \n", "2017 118 2017JJ00 17082.0 8475.0 8606.0 \n", "2018 119 2018JJ00 17181.0 8527.0 8654.0 \n", "2019 120 2019JJ00 17282.0 8581.0 8701.0 \n", "\n", " TotalPopulation_4 YoungerThan20Years_5 k_20To44Years_6 \\\n", "Periods \n", "1899 NaN NaN NaN \n", "1900 5104.0 2264.0 1732.0 \n", "1901 5163.0 2286.0 1762.0 \n", "1902 5233.0 2314.0 1791.0 \n", "1903 5307.0 2344.0 1824.0 \n", "... ... ... ... \n", "2015 16901.0 3828.0 5311.0 \n", "2016 16979.0 3818.0 5284.0 \n", "2017 17082.0 3817.0 5281.0 \n", "2018 17181.0 3811.0 5291.0 \n", "2019 17282.0 3792.0 5335.0 \n", "\n", " k_45To64Years_7 k_65To79Years_8 ... Divorces_179 \\\n", "Periods ... \n", "1899 NaN NaN ... NaN \n", "1900 802.0 272.0 ... 0.6 \n", "1901 806.0 274.0 ... 0.6 \n", "1902 812.0 278.0 ... 0.6 \n", "1903 819.0 281.0 ... 0.6 \n", "... ... ... ... ... \n", "2015 4754.0 2273.0 ... 34.2 \n", "2016 4792.0 2337.0 ... 33.4 \n", "2017 4824.0 2395.0 ... 32.8 \n", "2018 4840.0 2460.0 ... 30.7 \n", "2019 4841.0 2515.0 ... NaN \n", "\n", " DivorcesRelative_180 AverageAgeMales_181 AverageAgeFemales_182 \\\n", "Periods \n", "1899 NaN NaN NaN \n", "1900 0.7 NaN NaN \n", "1901 0.7 NaN NaN \n", "1902 0.7 NaN NaN \n", "1903 0.7 NaN NaN \n", "... ... ... ... \n", "2015 10.1 46.9 43.7 \n", "2016 9.9 47.3 44.0 \n", "2017 9.7 47.4 44.2 \n", "2018 9.2 47.6 44.3 \n", "2019 NaN NaN NaN \n", "\n", " AverageDurationOfTheMarriage_183 DueToDeathHusbandRelative_184 \\\n", "Periods \n", "1899 NaN NaN \n", "1900 NaN 16.2 \n", "1901 NaN 14.8 \n", "1902 NaN 14.6 \n", "1903 NaN 14.1 \n", "... ... ... 
\n", "2015 14.8 11.6 \n", "2016 15.0 11.7 \n", "2017 15.1 11.7 \n", "2018 15.0 11.9 \n", "2019 NaN NaN \n", "\n", " DueToDeathWifeRelative_185 BalanceOfChangesOfNationality_186 \\\n", "Periods \n", "1899 NaN NaN \n", "1900 13.4 NaN \n", "1901 12.1 NaN \n", "1902 11.8 NaN \n", "1903 11.4 NaN \n", "... ... ... \n", "2015 5.7 28.0 \n", "2016 5.8 28.0 \n", "2017 5.9 28.0 \n", "2018 6.1 28.0 \n", "2019 NaN NaN \n", "\n", " Naturalization_187 NaturalizationsRelative_188 \n", "Periods \n", "1899 NaN NaN \n", "1900 NaN NaN \n", "1901 NaN NaN \n", "1902 NaN NaN \n", "1903 NaN NaN \n", "... ... ... \n", "2015 22.0 25.3 \n", "2016 22.0 23.0 \n", "2017 20.0 19.5 \n", "2018 21.0 19.2 \n", "2019 NaN NaN \n", "\n", "[121 rows x 190 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Now let's check our work:\n", "df" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "id": "Nwia7AgvtXlX" }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXoAAAEGCAYAAABrQF4qAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAA6BklEQVR4nO3deXib1ZX48e+VZEm2vK9x7Di7s5A9IWwl0AIlLMMWoNCNtnSh7Uw77bQFWjrdhg4tnW7TKS2/UsrQltISKDAQtlBIWkJC4iwkcWI7ix3H+27LsWRJ9/eHJEe25U3WZvl8nofH9qtX0r1B7/H1ee89V2mtEUIIkbgMsW6AEEKIyJJAL4QQCU4CvRBCJDgJ9EIIkeAk0AshRIIzxboBALm5uXrOnDmxboYQQkwpe/bsadFa5411XlwE+jlz5rB79+5YN0MIIaYUpVT1eM6T1I0QQiQ4CfRCCJHgJNALIUSCk0AvhBAJTgK9EEIkOAn0QgiR4CTQCyFEgpNAL0QcKa/v4s2K5lg3QyQYCfRCxJEHXz7Kpx7bzam23lg3RSQQCfRCxJHKpm6cbg8/fPlorJsiEogEeiHiRF+/m9r2M2TbzDy/v459pzpi3SSRICTQCxEnjjfb0Rq+duUiclPNfP+FcmSrTxEOEuiFiBPHmnsAWDkrky9dUcquk2387WhTjFslEoEEeiHiRFVTD0rB3FwbH1g3i8yUJF58tyHWzRIJYMxAr5T6rVKqSSl1MODYg0qpI0qpA0qpZ5RSmQGP3auUqlJKHVVKXRmhdguRcI419zArKwVrkhGT0cBFC3LZXtks6RsxaeMZ0f8O2Djk2KvAMq31CqACuBdAKbUUuA04x/ecXyqljGFrrRAJrKqphwX5qQM/X7Iwj8YuB0cbu2PYKpEIxgz0WuttQNuQY69orV2+H98Gin3fXw/8SWvt0FqfAKqA9WFsrxAJye3RHG+xMz/PNnDs4tJcALbJAioxSeHI0X8C2OL7vgg4FfBYre+YEGIUp9vP4HR5Bo3oCzOSKS1IZVtFSwxbJhLBpAK9UuobgAv4g/9QkNOCJhiVUp9WSu1WSu1ubpYRi5jeqpq96Zn5eamDjm9YmMeuE230Ol3BnibEuIQc6JVSdwDXAh/SZ+8W1QKzAk4rBuqCPV9r/bDWep3Wel1e3ph72wqR0I412YEggb40D6fbw87jbcGeJsS4hBTolVIbgbuB67TWgUU5ngNuU0pZlFJzgYXArsk3U4jEVtXUQ47NTJbNPOj4+rnZWEwGKXQmJsU01glKqSeAS4FcpVQt8C28s2wswKtKKYC3tdZ3aa0PKaX+DBzGm9L5vNbaHanGCzGVvVnRTJJBceGCXI419zA/P3XYOdYkI+fNy2FbpQR6EboxA73W+vYghx8Z5fz7gfsn0yghEl1fv5t//kMZPU4X37t+GVXNPVy1rDDouefNzWZbRTN2hwubZcxLVohhZGWsEDHw8qEGuh0uFs9I576/HqSjt3/QjJtARZnJANR39kWziSKBSKAXIgaeLjtNUWYyz37+Im5d512GsrwoI+i5hRlWABok0IsQyd+BQkRZY1cf2yub+dylCzCbDPxg0wo+d+kC5uTagp5fmOEd0dd1nolmM0UCkRG9EFH2172n8Wi4cY13LaFSasQgD1CQYQFkRC9CJ4FeiCjSWrO5rJbVJZnD5syPxGIykptqlhy9CJkEeiGi6FBdFxWNPWxaUzz2yQFmZFipl9SNCJEEeiGi6Kk9tZiNBq5dEXwq5UhmpCdL6kaETAK9EFHS7/bw3P46Ll+aT2aKeewnBJiZaZXUjQiZBHohouSNo8202Z3ctHpiaRvwpm46z/RLcTMREgn0QkTJ02W15NjMXLJo4kX8ZmbIoikROgn0QkRBR6+TreVNXL+qiCTjxC+7GbJoSkyCBHohouD5A/U43R5uWhPaPjz+1bF1HTLzRkycBHohomDznloWz0jjnJnpIT2/IF1G9CJ0EuiFiLB
jzT3sO9XBTWuK8JX1njBrkpEcm5k6CfQiBBLohYiwp8tqMSi4YdXktk+ekWGlQRZNiRBIoBcigjwezTNlp9lQmke+L/0SqsKMZJl1I0IigV6ICNpxvJW6zr4JlzwIpjBDFk2J0EigFyKCNu+pJc1q4oqlBZN+rcBFUzuOtXL5j9+k3e4MQytFopNAL0SE2B0uthxs4NoVhViTjJN+vZmZZ6dYfvu5Q1Q19XC0sXvSrysSnwR6ISJky8EGzvS7w5K2AW9hM4Cfb60aCPAy3VKMhwR6ISLkraoW8tIsrJ2dFZbX84/on9tfx7Ii73x82XVKjIcEeiEi5HB9F8tmpoc8d36ogoBZO9+5bhkZyUkyohfjInvGChEBDpebqqYe3rc4P2yvaU0yMis7mZXFmaydnUVhhpW6Dgn0YmwS6IWIgKqmHlwezZLC0EoejOTZz7+HVIv3sp2RYaWhS1I3YmySuhFx6S+7T/GRR3bGuhkhK6/33iwNd6DPtpkxm7yXbWGG7DolxkcCvYhLB2o72V7ZQndff6ybEpLy+i6sSQbm5toi9h6FGVZaepw4XO6IvYdIDBLoRVzyB6/q1t4YtyQ0h+u6WDQjHaMhPDdig/GXLm7sdETsPURikEAv4pLD5QGgpm3qBXqtNeUNXSwtTIvo+xT6dp2SKZZiLBLoRVxy9HsD/clWe4xbMnENXX109PazNMz5+aFk1ykxXhLoRVzyp25qpmDq5nBdFxD+G7FD+VM3UuhMjEUCvYhL/tTNVBzRl9d7A/3iCAd6m8VEutVEvaRuxBgk0Iu4NJCjn4Ij+vL6bkqyUwbmu0fSzEypUS/GJoFexCV/6qa+q4++/qk1ffBwfVfE8/N+MzKsMqIXY5JAL+JSX78HgwKtobZ96ozqe50uTrbaI56f9yvMsMrNWDEmCfQiLjlcbmbneBcbnWyZOoH+eLMdraG0IDUq71eYkSyLpsSYJNCLuOTo9wwEy+opNJfev8DL/0sq0mbIoikxDhLoRVxyuDzMSLeSZjFRPYVm3vhnCc3OSYnK+830LZqSPL0YjQR6EZccLjfWJCMlOSlTqgxCTWsvuakWbFGYcQNnR/Qy80aMZsxAr5T6rVKqSSl1MOBYtlLqVaVUpe9rVsBj9yqlqpRSR5VSV0aq4SJxaa1xuDxYTAbm5Nim3Ih+TpRG8yCLpsT4jGdE/ztg45Bj9wBbtdYLga2+n1FKLQVuA87xPeeXSqnJ74osppV+t0ZrsPhG9LXtZ3C5PbFu1rjUtPVSEsVA71801SCpGzGKMQO91nob0Dbk8PXAY77vHwNuCDj+J621Q2t9AqgC1oenqWK68M8g8Y7oU3B59JTYSamv3019Zx9zonQj1q8wI5nTU+DfR8ROqDn6Aq11PYDvq3+/tCLgVMB5tb5jwyilPq2U2q2U2t3c3BxiM0Qi6vMVNLOYDJRke4NmdVv8p29Otfln3ERvRA9wzsx0dp1olSmWYkThvhkbrPi2Dnai1vphrfU6rfW6vLy8MDdDTGVnR/RG5uR6g+bJKXBD9mSUp1b63bC6iK4+F1vLm6L6vmLqCDXQNyqlCgF8X/2fsFpgVsB5xUBd6M0T05G/zo0lyUBBmhWz0UDtFJhL779pPDs7uiP6ixbkUpBu4emy2qi+r5g6Qg30zwF3+L6/A3g24PhtSimLUmousBDYNbkmiunGMZC6MWIwKPLTLTR2xX8Ourq1l3SricyUpKi+r9GguGF1EW8cbaalRxZOieHGM73yCWAHsEgpVauUuhN4ALhCKVUJXOH7Ga31IeDPwGHgJeDzWmtJHIoJGUjdJHk/ngXpVhq74j+AVbf1MjvHhlKR2z5wJJvWFOPyaJ7dJ39Ai+HGXNWhtb59hIcuG+H8+4H7J9MoMb0NpG5M/kBv4WhDdyybNC7VrXaWF2XE5L1LC9JYXpTB02W13PmeuTFpg4hfsjJWxJ2zgd67BCM/zUpTnI/o+90eTrefifqMm0Cb1hRxqK6LIw1
dMWuDiE8S6EXccfSfnUcPkJ9uodvhwu5wxbJZo6rrOIPLo6M+4ybQNStmAvDa4caYtUHEJwn0Iu70+Ub0Vn+OPs27zL+pO35H9QNVK6M84yZQXpqFZUXpbKtoiVkbhiqraaeztz/WzZj2JNCLuHN2RO9N3RSk+wJ9HM+88U+tnJMbuxE9wIaFeZTVtNPdF/vg2tTVx6aH3uLGh/4xpTaPSUQS6EXcCXYzFqAxzkf01iQD+WmWmLZjQ2keLo/mrWOtMW0HwKG6LrT2VvTc9NBbcu8ghiTQi7hzdsGU72bsFBjRHzjdyezs2EytDLSmJAub2ci2itiXFTlc7w3sT37mfBSKW361g53HY/8LaDqSQC/iTmBRM4B0qwmLyRC3i6a2Vzaz60QbN68tjnVTMJsMXDA/l22VzWgdtPpI1Byu72JWdjJrZ2ez+XMXkp9m4SO/3cWWd+tj2q7pSAK9iDuO/sGpG6UUBenWiN6M7eh18o1n3h0oTDZebo/m/hfKmZWdzEcvnB2h1k3MJaW5nGo7E/P6QOX1XSyZ4d0kvSgzmafuupBlM9P53B/LeFGCfVRJoBdxx+HyYDYZBqVBCiJcBuFnWyv5w84avvP84Qk9b3NZLUcaurl74+KBm8extqHUWyQwlumbXqeLEy12ls5MHziWZTPzh0+eT3FWstTliTIJ9CLuOFzugdG8X3565BZNnWix8/iOagrSLbxW3siOMW5k7qlu4/EdJ3l8x0n+65WjrJqVyTXLCyPStlDMzrExOyclpoH+aEM3WsOSwvRBx5PNRlbPyuJwndyYjSYJ9CLu9PV7ho2OC9KsERvR//ClI5hNBp6660JmZlj5/ovleDzB89sut4dP/G4333z2EN989hBdZ1x889qlMb8JO9SGhXnsOB67GvXl9d6SFUuHBHqApTPTqevso6PXGe1mTVsS6EXcCT6it2B3uukJ8+rY3Sfb2HKwgbsumc+s7BS+unER757u5Ln9wYuD7a/tpPNMPz/YtJzd913O7vsuZ+3srKDnxtKG0jx6nW72VLfH5P0P13eSZjFRnJU87DH/KN//y0BEngR6EXccLs9A5Uo//1z6cE+x/K9XKihIt/DJi72FwK5fWcSyonR+8lpF0Fkr2yqaUQrev3QGuakWbJYx6wLGxAXzczAZVMxWyZbXd7OkMD3oXzpLCtOAs9MvReRJoBdxx9HvwRokdQOEtVzxqbZedhxv5aMXzCHF7A3YBoPitnNLqG7t5Vjz8O0Lt1U2s6I4kyybOWztiIRUi4m1s7Nikqf3eDRH6rsGAvpQ+WlWclMtlEugjxoJ9CLuOFzuYSP6gUVT3eEb0T9ddhqlvFvxBbpkhFkrnb397D/VwSULc8PWhkjaUJrH4foumqO8orimrRe70z1oxs1QSwrTJNBHkQR6EXccLs+wHP1AGYQwpW601jy9t5YL5uVQlDk4jzwrO4W5uTa2VQ4O9H+vasGjz05fjHf+X1jbK6M7qvcH8KEzbgItLUynsrGHfrcnWs2a1iTQi7jjDfSDUzepFhMpZuOoUyxPttjZd6pjXO+xu7qd6tZeNq0Jvpp1w8Jc3j7eSl//2Vkr2yqaSbOaWDUrc1zvEWtLC9PJsZmDpm/21rRTE6EFVYfruzAo72YoI7ZtZjpOt4djzT0RaYMYTAK9iDuO/uGzbpRS5KdZRi1s9h8vlPPRR3aOa0rh5j21pJiNbFw2I+jjlyzKo6/fw+6T3lkrWmu2VzZz0fxcTMapcdkYDIqLF+ayvbJl0HRRt0fz8d+9w33PHgz7e/a7Pbzwbj1LZ6ZjTRp5AdnZmTeSvomGqfGJFdOK0+UZKGgWKD999Ln0FY3ddPW52FreNOrr9/W7eeFAPVctKxxx1sz583IwGw0D6ZtjzT3UdfZNmbSN34bSPFrtzkEzXPbXdtDR28/OIX+xhMOfdtVwvNnOFy8rHfW8ebk2zCbDpKZYOl2emNfzmSok0Iu40xdkRA/euvQjTa/s63d
zylfzfPOekZfXezyaB7YcodvhYtOaohHPSzGbWDfHO2vF6fLwk1crAdhQOjVuxPpdvND7i+n1I2d/+flTOQ6Xh10n2sL2Xl19/fzktUrOm5vN5UvyRz3XZDRQWpAa8gpZh8vNNT/fzo2/fIvWnvgtXx0vJNCLuBPsZixAQZqFxi5H0FHciRY7WsPsnBTeqGimJcjF73C5+eKT+/jdWyf5+EVzuGB+zqjt2FCax5GGbj70m7d54d167rlqMcVZsdtBKhR5aRbOm5vNM3tPD/y7batoZvGMNMwmQ1inX/7qjWO02Z1845ol41opvLQwnfL6rpBG5Y/vqKayqYdDdZ3c/KsdEy5GN91IoBdR9fOtlbx0sGHUc4LdjAXviP5Mv5vuIKtj/Tf1vnxFKW6P5tl9g1e2aq357O/LeH5/HXdvXMy/j6NswQbfaLispoMHb17BXZfMH/X8eLVpbTEnWuzsPdVBZ28/+0518P6lBayfkz1sZlGoDtV18sjfT3DDqpmsKM4c13OWFKbTandOuCppZ28///16FRcvzOWJT51Pm93JTQ/JyH40EuhFVP3vjmpeGKNErcPlHtgvNlBJjnc0fTzIQqaqph6UgivPmcHyooxh6ZstBxt4/UgT912zhM9eOn9cI84lhWl87tL5PPqxc7ll3awxz49XVy2bgTXJwOY9tfzj2NkpohtKc6lo7KG+88ykXn/HsVZu+/XbZNvMfG3j4nE/b3lRBgD7xzlTyu8Xf6ukq6+fr1+9hHVzsnn4I2tp7nbExa5a8UoCvYgqu8NF7yj1atweTb9bBx3R+wtkBcvrHmu2U5yVjDXJyKY1RRyu7+JQXSfgvWn3wJYjLCpI4+MXzR13W5VSfG3j4il3A3aoNGsSG8+ZwfP763jtcOPAFFF/v7ZPsExC55l+frP9OL968xg/evkod/x2FzMyrDz9uQuZmTm8ts1IlhVlkGRUlNV0jPs5p9p6eeytam5ZWzwwc2dVSSYmg5IZPKOQQC+ixu3RnOl3Y3eOHOidA9sIDv9oFmclk2YxBb2gq5p6mJ+XCsB1q4pIs5j49P/u4VhzD4+/XU1NWy9fv2YJRkN8VZmMlk1ri+nqc/HMvtMDU0QXFaRRkG7hzQmmb/6ws5r/eKGcB7Yc4Rd/q2J1SSZ/uesCCjPGH+QBrElGls7MoKxm/IXXfvDSEQwG+PIViwaOWUxGFuSnSqAfRXxWZBIJyR/g7Y6Rp/QN3UYwkFKKJb4beIE8Hs3x5h4u8t1czbaZ+eOnzudjj+7i5ofewqPh4oW5AytFp6ML5+cyI91KQ9fZKaJKKS5emMerhxtxe/S4fwm+ebSZJYXpPP3ZCwGwJhlCLtO8elYmf3qnBpfbM+b6hLKadv7vQD1fuGwhMzKsgx5bWpjOP44N/suk1+kiOckYdyWkY0FG9CJq7L6UzWgj+oGNwUfYrclfIyVwAdDpjjM4XB7m56cOHFtenMHmz15ImjWJrr5+7r1qSTi6MGUZDYob1xSh1OApopeU5tF5pp+/V40vfdPjcLGnup1LSvNINhtJNk8ukK6ZnUVfv4cjDaPPp9da8/0XyslNtfCZDfOGPb6kMJ3GLgdtdm+N+4bOPtZ+7zVePtQYctsSiQR6ETX+QN87yojev4An2IgevBe03Xl2zjxAlW/GzYKAQA8wJ9fGc/98Ec9+/qJRC2xNF19430KeuuuCQVNEr1haQHFWMg9sOYJ7hM1WAu041orLo8O2nmBNSSbAmOmblw81sLu6nX97f2nQRW7+/7/+v/beONrEmX43O0/IDVqQQC+iqMcX4O2j3Ix1jJKjh+EXNMCxJl+gz0sddn5minnc0/0SXbLZyNrZ2YOOWZOMfG3jYsrru3hm7+kxX2NbRTMpZmPYNlspykwmP81CWZANUp7bX8c1P9/O1T/bzteeOkBpQSq3rA1em2jJkBv1/mmjkrf3kkAvoiYwdTPSIhlH/+ipm9KCNAxq8MybY809ZNvMcV8jPl7
904pCVs7K5EcvH+WMc/SSCNsqmzl/Xk7YNkJXSrGmJCvozJtH/3GCxi4HMzOTuWhBLj++ddWIefxsm5mCdG+Ne5fbw98rvamow3WhLchKNBLoRdT4A71Hnx25DzXazVjwjkDn5aVyOKBGSlVTT9DRvBgfpRTfuHoJDV19/Gb78RHPq261U93ay4Yw1+NfXZJJTVvvoNXM/tr/H1w/i9/csY6HPryWZb559yNZUpjO4fou9td20tXn4vx52XT1uajrjMxew1OJBHoRNYE3YUfa+9X/C2C0yodLh8y8OdZsZ36+LUytnJ7Wz83m4oW5PFU2cp0gf7mEcK8rWONLA+0NGNWHUvt/aWE6VU09bC1vRCn41MXem7blIdbTSSQS6EXU9ATchB3phuxYI3rwjtxOd5yhs7efNruTNrtzYA69CN25c7Kpbu0d8ZfwmxUtFGclMzc3vL9UlxdlYDKoQTdkQ6n9v6QwHZdH88SuGlYUZ3LevByUkr1pQQK9iKLAm7Ajjuj7R78ZC4M3l354mzfVsGjGyJtciPHxrzw+2jA8ML5xtIltlc1cuigv7PPSrUlGzinK4G9HmvB4NFprtoVQ+99/Q7a9t59LFuaSajExOztFbsgigV5EUWCg7x1hLv1Y8+jh7Mybuzcf4FdvHuMD62Zx0fypVT44Hi3x/bseHlIj/umyWj752G4W5KWOWWc+VJ+4aA5HGrp5dv9pqpp6qA+h9v/cXNtAjST/c/15++lOAr2ImsBRvH2E2R1jzaMHyE+zkptqpqatl3953wIe2LQcwzQtbRBOMzOspFtNg2Y0Pbe/ji//eT/r52bz5GfOJy/NEpH3/qcVM1lRnMGDLx3llcPeRU4TnatvNCgWzUgflPJZWpg+ajpqupASCCJqAkf0I82lPzuiH30Mct81S1EKrl818uYhYmKClZh48p0a5uXaePTj54ZtSmUwBoPi61cv4baH3+ZnWyuZl2cLqfb/F963gPbe/oGUz5KAdNTQNQTTyaRG9EqpLymlDimlDiqlnlBKWZVS2UqpV5VSlb6v4VlZIaaEY809A3OYh7I73Jh9AXzkQO8b0Y8y6wbghtVFEuQjYOnMdI42dOP2aHqdLt450c77FudHNMj7nT8vhyuWFuB0eQb2Apioy5YUcHPAoqqBdNQ0n3kTcqBXShUBXwDWaa2XAUbgNuAeYKvWeiGw1fezmCb+529VfOnP+4I+1uNwke/70793hNTN2QVTklWMhSWF6Zzpd3Oy1c7O42043Z6olmm+56rFFGZY+aeVM8PyegPpqEnsTZsIJns1mYBkpZQJSAHqgOuBx3yPPwbcMMn3EFNIa493uqMnSN2UXufZQD/aPHqjQZE0gdkWInz8M2/K67t4s6IZi8nA+rnRS3nMz0tlx72Xha3EglKKpTOHVzydbkK+mrTWp4EfATVAPdCptX4FKNBa1/vOqQdG3yVYJJQ2uxO3R9PV1z/ssR6Hm2ybBaNBjTLrJvjG4CI6FuSnDmzi4S93MNritalgSaE3HRVs8DFdTCZ1k4V39D4XmAnYlFIfnsDzP62U2q2U2t3cHL4NikVs+cvE+r8GsjtcpFqM2MzGEWvSj7QxuIgOa5KR+XmpvHa4iePN9im/uxbAwvw0zvS7Od0xuS0Tp7LJXFGXAye01s1a637gaeBCoFEpVQjg+9oU7Mla64e11uu01uvy8qb+h0l4tfc6B30NZHe4sFlM2CymkW/G9gffGFxEz5LCNI42enPal4SpHHEszc/zruT1byA/HU0m0NcA5yulUpR3qdxlQDnwHHCH75w7gGcn10QxVfT1uwdusrbZg6VuvIE+xWwc8WZsn8s96qpYEXn+KYkzM6wJUVrCv09BVdP0DfQhz6PXWu9USj0FlAEuYC/wMJAK/FkpdSfeXwa3hKOhIv4Fpmva7I5Bj7ncHhwuDzaziVSLacRdprwjegn0seRfebyhNPzlDmIhJ9VCVkoSx5rtsW5KzExqwZTW+lv
At4YcduAd3YtpZnCgHzyi9+fkbRYjKeZRUjcut6RuYmxFcSbz82zcuDpx1inMz0sd2KBmOpKVsSJsAvPyQ3P0Pb4RfKrFhM1ipK5jeGoH5GZsPMhITmLrv10a62aE1YL8VF49PH33j5UrSoTN4BH94EDvH8H7b8aOVtRMcvQi3ObnpdJqd9IeZDbYdCBXlAgbf3AvykweMdCnWkykmE2DatMHcrjcWCV1I8Js4IbsNJ15I4FehE273YlBweyclCCB3p+jN2EzG0ce0ffLiF6En3/20HTN08sVJcKmrddJZoqZ3FTL8Bz9QOrG6EvduIOuVPTm6GVEL8KrKCsZi8kwbadYSqAXYdNmd5JtM5NtM9PWM3LqxmbxBvLe/uHpm75+KYEgws9oUMzLSx33oql9pzp48d36QcdqWnv5w87qSDQv4mTWjQibNruT7BRvoO92uHC6PGfLEjvP3oxNMXs/dr0OF6mWwR9BmXUjImV+no39tR3jOve+v75LeX03pQVpLMhPRWvNV57az64TbayelTWw1mCqkCtKhE27vZ8sWxJZNjMAHQHpm4HUjW/BFATfZcrhco9Zi16IUCzIT6W2/czALmYjOdrQzcHTXbg9mge2HAHglcON7DrRBsDmstqItzXcJNCLsGnrdZJts5DjC/RtAYHe7nBhUGBNMpBiNg4cC6S1lhG9iJj5ealoDcfHWCH7dFktJoPik++Zy2vljWyvbOaBLUeYn2fj8iUFPLvvNC63J0qtDg+5okRYaK1ptzvJtiWRleIL9D2Bgd6NzWJCKYXNP6IfEuj73RqtZdMRERn+KZaj5eldbg/P7D3NpYvy+cqVi5iZYeUzj+/hRIudr1+9hFvXFdPS42Rb5dSquCtXlAiLrj4XLo8my5ejh+Ejen/Kxh/ohxY2828jONXrn4v4NDfXhlLwtyNNaB28Nv3fq1po6nZw89oirElGvrpxEb1ONxfMy+F9i/O5dFE+2TYzm/ecjnLrJ0duxoqw8K84zLaZybIlDToG3pux/gBv86Vuhu4yNd6NwYUIhTXJyMcvnMtv/3ECpRQPbFo+bCezzWWnyUhO4r2LvfslXb+yiI7efq5YWoBSCrNJcd3KmfxxVw2dvf1kpCTFoisTJoFehIV/9J5tM59N3QQUNuvxpW4AUgZG9CMFehnRi8j45rVLSE828dPXKqlpsw+UZPZ75VADt66bNfAZNBgUH79o7qBzNq0p5ndvneT5A3V8+PzZUWv7ZEigF2Hhz8dn28wkGQ2kW02DShX7d5cCSDX7c/SDUzdnfIFfVsaKSFFK8a+Xl1KQbuUnr1YMW0CVlWLmQ+eXjPoay4rSOWdmOg9vO84t64qnxMBEAr0IC/+I3j+az7aZaes9O6K3O1zk2FIASLEEn3XjD/xD59YLEW63ry/h9vWjB/SRKKW4e+NiPvrbXTy+o5pPXjwvzK0LPxk6ibDw5+NzUs8G+sAcfU/AzdgkowGzyTBsHn1ghUsh4tmG0jw2lObx369XDVovEq8k0IuwaLM7sZgMJPtmzGTbzLTaB8+6CQzgwQqb9QSUSRAi3n396sV09/Xzi9erYt2UMUmgF2Hhr3Pj33ouK2XwiN7ucA+kbABfqeIhqRtf4PcvqBIini2ekc4ta2fx2I6TNHb1xbo5o5JAL8Kivdc5kJ8Hf47eidYap8uD0+0ZuAkL3lF775CbsT2SoxdTzEcumE2/W/P28dZYN2VUEuhFWLTZnQP5efAGeqfLQ6/TPZCiCUzdpFiMwzYIlxy9mGoWz0gjxWxkb01HrJsyKgn0Iiza7INH9P7CZm12Z9Dcuy3IBuF2hwulJHUjpg6T0cCK4gzKatpj3ZRRSaAXYeHP0ftlp5wN9IG7S/nZLMZhJRB6HC5sZtNAnl+IqWB1SRaH67rGrIoZSxLoxaT1uz109bmCj+h7nYN2l/KzmU1BUzeB5wgxFawpycLl0Ryo7Yx1U0YkgV5MWodvYVR2QI7eX6q43e4ctLu
UX4rFOGxlrN3plvy8mHJWl2QCsDeO0zcS6MWkvXPSuyFDduCsG1/Q33WiLehNVpsleI5eZtyIqSY31cLsnJS4ztPLVSUm5Zm9tXz1LwdYUpjOxaW5A8fTrUl88LwS/rizhp2+nXmG3ox1uDy43B5MvgqCdl+OXoipZk1JFn+vakFrHZf3mGREL0L2+NvVfOnJ/ayfm82fP3M+6dbBJVvvv2EZn3/vfE60eHf0CZxNM7DLVMAN2cAKl0JMJatLMmnudlDbfibWTQlKrioRkr5+Nw++dIT3LMjlkY+tC1rBTynFV69czIyMZF4+2EBmQGonNaBUcUay9xeE3IwVU9WakiwAymramZWdEuPWDCeBPo509fXT0+diZmZyrJsypq3lTXT1ufjMJfPGLNP6kfNn85EhdbtTLMNLFQ+thyPEVLF4RhrJSUa2vNtAqsWExWTkvHnZgzY2OdrQTXFWckw+45K6iSP/+WI5lz74Bi8cqI91U8b0dFktM9KtXDg/d+yTg/DXpu/uC9ycRG7GiqnJZDRw7txsXjrUwJ2P7ebDj+zkzsd2D0w4eOTvJ7jyp9vY9NBbNHRGvy6OXFVx5ODpLpxuD//8RBktPedwx4VzYt2koJq7HbxR0cynN8zDaAjtxlNGsjeN03HGG+hdbg8Ol0duxoop638+uHrgflRZdTvf/b/D3P7/3mbt7Cwe/cdJLlqQw76aDjY99BaPfWL9wGbl0SBXVZzweDTHmnu4ff0sWnqcfOu5Q6Qnm7hxdXGsmzbMs/tO4/ZoNq0pCvk1snx7bfpreZ9dPSs5ejE1pVmTWFGcCcCK4kyKs1L4/B/LOFDbyYfOK+G71y+jvL6Ljz26iyt+8iZJBm9C5erlM/jpbasj2jYJ9HGioauPXqebc2ZmcNu5s7jqZ9v53x3VcRnoN5edZmVxBgvy00J+Df8q2nbfvrL+VbKSuhGJ4vKlBWz+7IVUNHZz4+oilFIsK8rgmc9dxF92n6LfowFvfj/S5KqKE/69K+fnpWIyGrh5bTH/ueUIx5p7mJ8X3j/xnnynhvx0K+9dlD/h5x6q66S8vovvXHfOpNqQnpyEUoEjeqlcKRLPsqIMlhVlDDo2KzuFL79/UVTbITdj48SxZm+g9+ftblhdhEHBM2Wnw/o+Rxu6uffpd/nhS0dDev7Pt1ZiMxu5buXMSbXDaFBkJCfR7iufILtLCRE5EujjRFVTDxnJSeT6SgcUpFt5z8I8ntl7Go/vT7xw+M8t5Xg0lNd30dQ9sbv/u0608fKhRj576fyBomWTkZViHrgZG6zCpRAiPCTQx4mqph7m59kGLZ/etKaI0x1nePtEeHav2V7ZzBtHm7lhlXc0vr2iZdzP9Xg0979wmBnpVu58T3h2vc9MSRpI3QSrcCmECA8J9HHiWLN92HSrK8+ZQZrFxOY9k0/fuD2a+18opzgrmQc2rSDHZmZbZfO4n/9/79azv7aTr1y5iOQwbQySmZxE+9AcvUyvFCLsJnVVKaUygd8AywANfAI4CjwJzAFOArdqreO3rFsc6Oztp6XHMeymqzXJyNXLC3n+QB3337gMa1LoAfbpslqONHTz37evxppk5OKFuWyrbMHj0RhGmAv/+NvVfOe5Q7h8qaOlhencuDr0KZVDZaWYqWj03puwB9luUAgRHpMd0f8MeElrvRhYCZQD9wBbtdYLga2+n8UoqobciA20cfkMep1udvkqQIbijNPNj145yspZmVy7ohCADaV5tNmdHKrrCvqc5m4HD7xYzvLiDL5w2UL+9fKF/Poja0NeIBVMZop5WOpGbsYKEX4hX1VKqXRgA/AxAK21E3Aqpa4HLvWd9hjwBnD3ZBqZ6I4FTK0c6vy5OZhNBrZVNLOhNC+k1//N9uM0djn4xQfXDNwDuHih97W2VTazvDhj2HN+trUCh8vDf92yknlhnt7pl5WShN3pxunyYHe4MCiwJkk2UYhwm8xVNQ9oBh5VSu1VSv1GKWUDCrTW9QC
+r0EnayulPq2U2q2U2t3cPP5ccSKqau7BbDIErXqXbDayfk72hPLpgZq6+3jozWNceU4B587JHjiel2ZhaWE6b1YMf92qpm6e2HWKD51XErEgD5Dpm7nT0evdV9Zmkf1ihYiEyQR6E7AGeEhrvRqwM4E0jdb6Ya31Oq31ury80EaqieJYUw/zcm0jpkUuXphLRWMP9Z0Tr3X909cqcbo83L1x8bDHNpTmUVbdPqiwGMADW46QkmTkC5ctnPD7TYS/DEJ7b7/sLiVEBE0m0NcCtVrrnb6fn8Ib+BuVUoUAvq9Nk2ti4qsaY/WrP2UzkemQAJWN3Tz5zsgj80sX5eHyaN44enZUf6Shi9fKm7jr0vnkpFom9H4TNVAGodeJ3SklioWIlJADvda6ATillPKv5b0MOAw8B9zhO3YH8OykWpjg+vrdnGrrZf4olewWz0gjP83CmxNM3/hH5l+8vDTo4+fOyaYww8rTZbUDxzbvqSXJqLh9fcmE3isUmQGFzWR3KSEiZ7JX1r8Af1BKmYHjwMfx/vL4s1LqTqAGuGWS75HQqlt78WiYn2cb8RylFBcvzOO18kbcHj2umS9vVbWw9UgT91y1mOwRVrEaDYobVxfx623HaeruIzvFzDN763jvovwRnxNOZ0f0/tSNLJYSIhImNcVBa73Pl2dfobW+QWvdrrVu1VpfprVe6Psa+rzAacBfv3pe7ug3PTeU5tJ5pp8DtR1jvqbHo7n/xXKKMpP52Bg17W9aU4zbo3luXx3bq1po6XFw05roVMwclLqRjcGFiBiZyxZjNW3eQF+SM/o+kxcvzEMp2HKwYczX/Ou+0xyq6+JrGxeNuchqQX4qK2dl8tSeWjbvqSUrJYn3LZ54VctQJJuNWEwGOnr76ZFtBIWIGAn0MXaytZeslKSBDbJHkm0zc/XyQh7edpxfvF6J1sELnfX1u3nw5aOsKM7gn1aMr8LkzWuKONLQzZaDDVy3ciZmU/Q+FpkpSbTbnbIxuBARJIE+xmpaeynJGTk/H+inH1jFTauL+NErFXzruUO4g1S1fOTvJ6jv7OMbVy8ZsbTBUNeumEmSUeH26KilbfyyUsy+HL3cjBUiUuTKirGTrXbWzs4a17lJRgM/umUleWkWfr3tOC09Dn5866qB9ExLj4OH3jjG+5cWcN68nHG3Ictm5prlhVQ197AiyCrZSMpMSaK5x4HT7SFVcvRCRIRcWTHkdHmo6zjDTRMoFGYwKO69egl5aRb+44VyWnt28YsPrsFmMfLjVyvo63dzz1XDF0eN5cFbVuL26KivTM1KMVN10lvzTkb0QkSGXFkxVNvunVo5e5ypm0CfvHgeeWkWvvKX/Zx7/2sDx++4YHZIZQuSjAYmURwzZJkpZlp6HIAUNBMiUuTKiqHqtl4AZo8x42Yk168qYnaOjZ3HvRuTpJiNbFobf5uJj8ZfBgFkRC9EpMiVFUPVvjn0oYzo/VbNymTVrMwwtSj6/HPpQXaXEiJSZNZNDFW39ZJiNg7sEzsdZQaM6CV1I0RkSKCPoerWXmbn2KZ1ad7BI3oJ9EJEggT6GKputTM7SA366STLFpCjl+mVQkSEBPoYcXs0p9rOMDt3egf6TMnRCxFxEuhjpKGrD6fbw+zs0G/EJgJJ3QgReRLoY8Q/42ZOiFMrE0W61RvcTQaFJYo1doSYTmQIFSP+OfRjVa1MdCajgXSrd6/Y6XxTWohIkkAfIydb7SQZFYUZybFuSsxl2cy43MGrcQohJk8CfYzUtPYyKztlXLtFJbrMFDNnnK5YN0OIhCVJ0Rh4Zm8trx5uZEVRdCtFxqulhWkszE+LdTOESFgyoo+y/7ftOPe/WM4F83L43g3LYt2cuPCfN62IdROESGgS6KNo855a7n+xnGtWFPLjW1diMcm8cSFE5Emgj5Jep4sfvnyEVbMy+fltqyU3L4SIGsnRh8GT79TwyzeqRj3nN9tP0Njl4L5rlkiQF0JElYzoJ6m+8wz//uw
hTAbFZzbMDxrEm7r7+NWbx7hq2QzWzcmOQSuFENOZjOgn6UcvV+BwebA73VQ0dgc95yevVuJ0ebh748S3+BNCiMmSQD8JB0938vTeWjaeMwOAspr2YefYHS6e2nOKD5w7izm507uujRAiNiTQh0hrzfdfLCczOYkf3LyCbJuZsuqOYee9fbyVfrfm6uWF0W+kEEIggT5kO0+08daxVr5w2UIykpNYU5LJ3lPDR/TbKppJTjKybk5WDFophBAS6EP2l921pFlM3L6+BIDVJVkcb7bT0escdN62yhbOn5ctc+aFEDEjgT4EdoeLLQfruWZFIdYkbwBfXZIJwN6ajoHzTrX1cqLFzobSvBi0UgghvCTQh+DlQw30Ot3ctKZ44NjK4kwMavAN2TcrmgEk0AshYkoCfQg2l9VSkp3CuQF5d5vFxOIZ6YMC/baKZooyk5kns22EEDEkgX6C6jrO8NaxVm5aUzRso4w1szPZf6oTt0fT7/bw1rFWNpTmyYYaQoiYkkDv0+MYuR661prTHWeobrXz+7er0RpuWl087LzVs7Locbj4e1ULrx1upMfh4pLS3Eg2WwghxiQlEIDfv13Nvz97kA+dN5tvX3fOoDIGTpeHuzcf4Jm9pweOrZ+THXQLQP8Uyjt+uwvw7oN6wXwJ9EKI2JrWgV5rzU9ereDnr1cxL9fG429X09Lj4CcfWIU1yYjd4eKu3+9he2ULn9kwj0UzvJtjrJ8bvF7N7Bwb//uJ9bT0OAAoyU4hIzkpav0RQohgpm2gd7k93PfXg/zpnVPcuq6Y79+4nN+9dZL/eKGcA7VvkpGcRJvdSXOPgx/evIJb180a1+vKDBshRLyZloH+jNPNvzyxl9fKG/nn9y7g395filKKT148j+KsZDaXnUZrmJWdzAfPm80lEryFEFPYpAO9UsoI7AZOa62vVUplA08Cc4CTwK1a6+G1AaLs5UMNVDR4q0u+frSJfac6+O715/DRC+YMOm/jskI2LpO6NEKIxBGOEf0XgXIg3ffzPcBWrfUDSql7fD/fHYb3CYnWmgdfPsov3zg2cMxmNvKL29dwzQoJ6EKIxDepQK+UKgauAe4Hvuw7fD1wqe/7x4A3iEKg73d7qG61DzqmNTy87Th/2VPL7etL+PZ1SzEqhUEpDLLLkxBimpjsiP6nwNeAtIBjBVrregCtdb1SKn+S7zGmpu4+Pv7oOxyq6wr6+BcvW8i/Xr5QFi4JIaalkAO9UupaoElrvUcpdWkIz/808GmAkpKSUJvBiRY7H/3tTlp7nHz3+nPISjEPenxGhpVzZfs+IcQ0NpkR/UXAdUqpqwErkK6U+j3QqJQq9I3mC4GmYE/WWj8MPAywbt06HUoDDtd18ZFHdqKBJz51PitnZYbyMkIIkdBCLoGgtb5Xa12stZ4D3Aa8rrX+MPAccIfvtDuAZyfdyhHkp1tYOjOdp+66QIK8EEKMIBLz6B8A/qyUuhOoAW6JwHsAkJtq4fE7z4vUywshREIIS6DXWr+Bd3YNWutW4LJwvK4QQojJk+qVQgiR4CTQCyFEgpNAL4QQCU4CvRBCJDgJ9EIIkeAk0AshRIKTQC+EEAlOaR1S9YHwNkKpZqB6Ei+RC7SEqTmxlkh9gcTqTyL1BRKrP4nUFxh/f2ZrrcfcGSkuAv1kKaV2a63Xxbod4ZBIfYHE6k8i9QUSqz+J1BcIf38kdSOEEAlOAr0QQiS4RAn0D8e6AWGUSH2BxOpPIvUFEqs/idQXCHN/EiJHL4QQYmSJMqIXQggxAgn0QgiR4OIy0CulfquUalJKHQw4tlIptUMp9a5S6nmlVHrAY/cqpaqUUkeVUlcGHF/rO79KKfVzFaPdwSfSH6XUFUqpPb7je5RS74un/kz0/43v8RKlVI9S6isBx2LeF187JvpZW+F77JDvcWu89GeCn7MkpdRjvuPlSql7A54TD32ZpZT6m69th5RSX/Qdz1ZKvaqUqvR9zQp4TtzGgYn2J+xxQGs
dd/8BG4A1wMGAY+8Al/i+/wTwPd/3S4H9gAWYCxwDjL7HdgEXAArYAlw1BfqzGpjp+34ZcDrgOTHvz0T6EvD4ZuAvwFfiqS8h/L8xAQeAlb6fc+LpszbBvnwQ+JPv+xTgJDAnjvpSCKzxfZ8GVPiu9R8C9/iO3wP8wPd9XMeBEPoT1jgQ9QtrAv8wc4Z8YLs4e/N4FnDY9/29wL0B573s+0coBI4EHL8d+HW892fIcxTQ6vvwxk1/JtIX4AbgQeDb+AJ9PPVlgp+1q4HfB3l+3PRnAn25HXge7y+vHF/gyY6nvgzp17PAFcBRoDDg3/2o7/spEQfG258h5046DsRl6mYEB4HrfN/fgvdDC1AEnAo4r9Z3rMj3/dDj8WKk/gTaBOzVWjuI7/4E7YtSygbcDXxnyPnx3BcY+f9NKaCVUi8rpcqUUl/zHY/n/ozUl6cAO1CPd2/nH2mt24jDviil5uAd4e4ECrTW9QC+r/m+06ZMHBhnfwJNOg5MpUD/CeDzSqk9eP/0cfqOB8tP6VGOx4uR+gOAUuoc4AfAZ/yHgrxGvPRnpL58B/iJ1rpnyPnx3BcYuT8m4D3Ah3xfb1RKXUZ892ekvqwH3MBMvKmOf1NKzSPO+qKUSsWb+vtXrXXXaKcGORZ3cWAC/fGfH5Y4EJbNwaNBa30EeD+AUqoUuMb3UC2DR8PFQJ3veHGQ43FhlP6glCoGngE+qrU+5jsct/0ZpS/nATcrpX4IZAIepVQf3g96XPYFxvysvam1bvE99iLenPjvidP+jNKXDwIvaa37gSal1D+AdcB24qQvSqkkvJ+VP2itn/YdblRKFWqt65VShUCT73jcx4EJ9iescWDKjOiVUvm+rwbgPuBXvoeeA25TSlmUUnOBhcAu359B3Uqp8313pT+KNy8WF0bqj1IqE3gBb77xH/7z47k/I/VFa32x1nqO1noO8FPg+1rrX8RzX2DUz9rLwAqlVIpSygRcgjfnHbf9GaUvNcD7lJcNOB9v7jcu+uJ770eAcq31jwMeeg64w/f9HQFti+s4MNH+hD0OxPqmxAg3Kp7Amzvsx/sb7E7gi3hvGFUAD+C7weQ7/xt477IfJeAONN4RykHfY78IfE689gfvxWgH9gX8lx8v/Zno/5uA532bwbNuYt6XED9rHwYO+dr+w3jqzwQ/Z6l4Z0IdAg4DX42zvrwHb0riQMB1cDXeG8dbgUrf1+yA58RtHJhof8IdB6QEghBCJLgpk7oRQggRGgn0QgiR4CTQCyFEgpNAL4QQCU4CvRBCJDgJ9CKhKaXcSql9SqmDSqm/KKVSJvDcmUqppyb4fm8opRJmk2qRGCTQi0R3Rmu9Smu9DO/y/7vG8ySllElrXae1vjmyzRMi8iTQi+lkO7BAKWVT3trt7yil9iqlrgdQSn3MN+p/HnhFKTVH+Wq7K6WsSqlHfXXA9yql3us7nqyU+pNS6oBS6kkg2XfcqJT6ne8viXeVUl+KUZ+FmDq1boSYDF/JgquAl/CuoHxda/0J31LzXUqp13ynXgCs0Fq3+aoM+n0eQGu9XCm1GO8vglLgs0Cv1nqFUmoFUOY7fxVQ5PtLwr+kXYiYkBG9SHTJSql9wG689V0ewVvk6x7f8TcAK1DiO/9V7S3XO9R7gMdhoFBYNd6yxRvwFjVDa30A7xJ3gOPAPKXUfyulNuKtCy9ETMiIXiS6M1rrVYEHfMWgNmmtjw45fh7e+iLBjLZd27A6IlrrdqXUSuBKvH8N3Iq3ZLAQUScjejEdvQz8iy/go5RaPY7nbMNbh95f7rcEb/GswOPLgBW+73MBg9Z6M/BNvOWMhYgJGdGL6eh7eMsmH/AF+5PAtWM855fAr5RS7wIu4GNaa4dS6iHgUaWUvyrhLt/5Rb7j/sHUvUNfUIhokeqVQgiR4CR1I4QQCU4CvRBCJDgJ9EIIkeAk0AshRIKTQC+EEAlOAr0QQiQ4CfRCCJHg/j+KhdOlU2OvsAAAAABJRU5ErkJ
ggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Now we can plot it, using that same command we used above - plotting a specific value (column) by the index.\n", "# In this case, the index is now the year, which provides a nice little visualization.\n", "df['Marriages_170'].plot()" ] }, { "cell_type": "markdown", "metadata": { "id": "_UFYNDedtXlf" }, "source": [ "## Play around - and some (non-graded) homework\n", "\n", "The most important thing is that you start playing around. You don't need to be able to create beautiful plots or anything fancy, but try to get datasets into a usable format and get some insights!\n", "\n", "As an exercise after class, why don't you try the following:\n", "\n", "1. Find a webpage that has a table on it.\n", "2. Use the read_html scraper tool we learned above, to scrape the tables and save these to a csv file.\n", "3. Read the csv file back into python, and try to find some basic descriptive statistics (max, min, mean, etc) and if you can, make a simple visualization out of it (histogram or plot).\n", "4. Save all of this to a new notebook--code, notes if you have questions or about what you're doing, and output.\n", "\n", "If you can do all of these things, great! If you can't bring your questions to next class.\n" ] } ], "metadata": { "colab": { "name": "02-accessing_data-2019.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 4 }