{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", " \"Open\n", " \n", "
\n", "
\n", "

\n", " 💡 Si el botó no funciona: Copia i enganxa aquesta URL al teu navegador:
\n", " \n", "https://colab.research.google.com/github/wisaaco/TallerPythonFEE/blob/main/notebooks/5_PandasGroups.ipynb\n", " \n", "

\n", "
" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Unitat 5. Agrupacions de dades\n", "\n", "Les agrupacions són operacions necessàries per a analitzar dades, ja que permeten extreure informació en funció de dades categòriques del nostre dataframe." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Començarem aquest aprenentatge amb les següents dades" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Unnamed: 0 Dni Nom CP Ciutat Genere Tipus certificat cat Punts\n", "-- ------------ --------- ---------------------------- ---- ---------- -------- ---------------------- -------\n", " 0 0 H61414629 María Dolores Arjona Jove 7800 Eivissa M B 73\n", " 1 1 S3138381C Núria Quirós 7511 Ruberts F A 40\n", " 2 2 J8698188C Miguel José María Gil Vargas 7340 Alaro M A 45\n", " 3 3 A48821615 Jordi Chaves Bustamante 7609 Bellavista F B 40\n", " 4 4 U0247281I Jana Rosa Collado Menéndez 7006 Palma M B 86\n", "------------------------------\n", "Index(['Unnamed: 0', 'Dni', 'Nom', 'CP', 'Ciutat', 'Genere',\n", " 'Tipus certificat cat', 'Punts'],\n", " dtype='object')\n", "------------------------------\n", "(1000, 8)\n" ] } ], "source": [ "import pandas as pd\n", "from tabulate import tabulate # Nova llibreria\n", "\n", "df = pd.read_csv(\"data/data_groups.csv\") # Atenció: Dades creades aleatòriament !!\n", "\n", "print(tabulate(df.head(), headers='keys'))\n", "\n", "print(\"-\"*30)\n", "print(df.columns)\n", "print(\"-\"*30)\n", "print(df.shape)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### A.4.0 Activitats \"d'escalfament\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "#A401 - Quines i quantes ciutats n'hi ha?\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "#A402 - Suprimeix la columna primera: \"Unnamed: 0\"\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "#A403 - Llista les files 3, 4 i 5 amb les columnes de Nom i Punts\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "#A404 - Crea dues columnes noves amb nom i llinatges de l'actual columna de \"Nom\".\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "#A405 - Quantes categories de \"Tipus de certificat cat\" ?" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Groups" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Per a agrupar només cal usar el mètode de `groupby`.
\n", "\n", "En el següent exemple agrupem les dades segons el sexe de la persona mitjançant el mètode `groupby` que retorna un `DataFrame` agrupat:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.groupby.generic.DataFrameGroupBy" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bySex = df.groupby('Genere')\n", "type(bySex)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "print(bySex)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bySex" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Una agupació té sentit quan se fà qualque tipus d'agregació de dades. Ara mateix la variable `bySex` només te informació computacional que separa els elements de cada tipus de grup. Hem de fer qualque operació, com per exemple, `describe()`" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0CPPunts
countmeanstdmin25%50%75%maxcountmean...75%maxcountmeanstdmin25%50%75%max
Genere
F483.0503.248447290.0762251.0245.0496.0757.5999.0483.07519.567288...7701.07870.0483.050.08488628.5485511.024.550.074.099.0
M517.0495.998066287.8771910.0253.0502.0744.0998.0517.07533.143133...7701.07870.0517.049.32108328.5042731.024.050.073.099.0
\n", "

2 rows × 24 columns

\n", "
" ], "text/plain": [ " Unnamed: 0 \\\n", " count mean std min 25% 50% 75% max \n", "Genere \n", "F 483.0 503.248447 290.076225 1.0 245.0 496.0 757.5 999.0 \n", "M 517.0 495.998066 287.877191 0.0 253.0 502.0 744.0 998.0 \n", "\n", " CP ... Punts \\\n", " count mean ... 75% max count mean std \n", "Genere ... \n", "F 483.0 7519.567288 ... 7701.0 7870.0 483.0 50.084886 28.548551 \n", "M 517.0 7533.143133 ... 7701.0 7870.0 517.0 49.321083 28.504273 \n", "\n", " \n", " min 25% 50% 75% max \n", "Genere \n", "F 1.0 24.5 50.0 74.0 99.0 \n", "M 1.0 24.0 50.0 73.0 99.0 \n", "\n", "[2 rows x 24 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Podem descriure la informació d'un \"[DataFrame]Groupby\"\n", "bySex.describe()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Només guarda quins elements pertanyen a cada categoria.\n", "L'atribut `groups` ens mostra quines mostres/files pertanyen a cada grup:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'F': [1, 3, 6, 7, 8, 9, 12, 23, 25, 26, 28, 29, 32, 33, 35, 36, 38, 42, 43, 44, 45, 48, 53, 54, 57, 58, 59, 63, 65, 70, 71, 72, 79, 80, 81, 82, 84, 87, 88, 92, 94, 95, 97, 98, 99, 103, 105, 106, 107, 108, 111, 112, 115, 116, 117, 118, 120, 125, 128, 129, 132, 134, 135, 136, 137, 138, 140, 141, 142, 143, 144, 147, 148, 150, 151, 153, 154, 155, 156, 161, 162, 163, 164, 165, 168, 170, 172, 174, 175, 176, 177, 178, 179, 181, 182, 184, 185, 188, 194, 196, ...], 'M': [0, 2, 4, 5, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 27, 30, 31, 34, 37, 39, 40, 41, 46, 47, 49, 50, 51, 52, 55, 56, 60, 61, 62, 64, 66, 67, 68, 69, 73, 74, 75, 76, 77, 78, 83, 85, 86, 89, 90, 91, 93, 96, 100, 101, 102, 104, 109, 110, 113, 114, 119, 121, 122, 123, 124, 126, 127, 130, 131, 133, 139, 145, 146, 149, 152, 157, 158, 159, 160, 166, 167, 169, 171, 173, 180, 183, 186, 187, 189, 190, 191, 192, 193, 195, 197, 199, 201, ...]}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bySex.groups # Quin tipus de variable és? Quina informació conté?" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Aquest atribut, (un dictionari) ens permet realitzar operacions de filtrat en funció dels grups:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0DniNomCPCiutatGenereTipus certificat catPunts
00H61414629María Dolores Arjona Jove7800EivissaMB73
22J8698188CMiguel José María Gil Vargas7340AlaroMA45
44U0247281IJana Rosa Collado Menéndez7006PalmaMB86
55V3761435AAntonio Murillo Suarez7701MaoMB53
1010A64863723Nil Mínguez Hervás7350BinissalemMB17
...........................
991991G28994028Juan Cervera Arribas7006PalmaMB83
993993J3906733EJosefa María Dolores Barreda Ibañez7609BellavistaMB17
994994C26665125Josefa Ferrando-Navas7340AlaroMA56
996996H72967045Álex Javier Campos Palomar7701MaoMB72
998998C61296679Èric Pinedo7009PalmaMA1
\n", "

517 rows × 8 columns

\n", "
" ], "text/plain": [ " Unnamed: 0 Dni Nom CP \\\n", "0 0 H61414629 María Dolores Arjona Jove 7800 \n", "2 2 J8698188C Miguel José María Gil Vargas 7340 \n", "4 4 U0247281I Jana Rosa Collado Menéndez 7006 \n", "5 5 V3761435A Antonio Murillo Suarez 7701 \n", "10 10 A64863723 Nil Mínguez Hervás 7350 \n", ".. ... ... ... ... \n", "991 991 G28994028 Juan Cervera Arribas 7006 \n", "993 993 J3906733E Josefa María Dolores Barreda Ibañez 7609 \n", "994 994 C26665125 Josefa Ferrando-Navas 7340 \n", "996 996 H72967045 Álex Javier Campos Palomar 7701 \n", "998 998 C61296679 Èric Pinedo 7009 \n", "\n", " Ciutat Genere Tipus certificat cat Punts \n", "0 Eivissa M B 73 \n", "2 Alaro M A 45 \n", "4 Palma M B 86 \n", "5 Mao M B 53 \n", "10 Binissalem M B 17 \n", ".. ... ... ... ... \n", "991 Palma M B 83 \n", "993 Bellavista M B 17 \n", "994 Alaro M A 56 \n", "996 Mao M B 72 \n", "998 Palma M A 1 \n", "\n", "[517 rows x 8 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dfM = df.loc[bySex.groups['M'].values] #Recorda que \"loc\" accedeix per index de fila\n", "dfM" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Obviament:
\n", "Segons el tipus d'agrupació, l'operació pot realitzar-se com una selecció lògica." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0DniNomCPCiutatGenereTipus certificat catPunts
00H61414629María Dolores Arjona Jove7800EivissaMB73
22J8698188CMiguel José María Gil Vargas7340AlaroMA45
44U0247281IJana Rosa Collado Menéndez7006PalmaMB86
55V3761435AAntonio Murillo Suarez7701MaoMB53
1010A64863723Nil Mínguez Hervás7350BinissalemMB17
...........................
991991G28994028Juan Cervera Arribas7006PalmaMB83
993993J3906733EJosefa María Dolores Barreda Ibañez7609BellavistaMB17
994994C26665125Josefa Ferrando-Navas7340AlaroMA56
996996H72967045Álex Javier Campos Palomar7701MaoMB72
998998C61296679Èric Pinedo7009PalmaMA1
\n", "

517 rows × 8 columns

\n", "
" ], "text/plain": [ " Unnamed: 0 Dni Nom CP \\\n", "0 0 H61414629 María Dolores Arjona Jove 7800 \n", "2 2 J8698188C Miguel José María Gil Vargas 7340 \n", "4 4 U0247281I Jana Rosa Collado Menéndez 7006 \n", "5 5 V3761435A Antonio Murillo Suarez 7701 \n", "10 10 A64863723 Nil Mínguez Hervás 7350 \n", ".. ... ... ... ... \n", "991 991 G28994028 Juan Cervera Arribas 7006 \n", "993 993 J3906733E Josefa María Dolores Barreda Ibañez 7609 \n", "994 994 C26665125 Josefa Ferrando-Navas 7340 \n", "996 996 H72967045 Álex Javier Campos Palomar 7701 \n", "998 998 C61296679 Èric Pinedo 7009 \n", "\n", " Ciutat Genere Tipus certificat cat Punts \n", "0 Eivissa M B 73 \n", "2 Alaro M A 45 \n", "4 Palma M B 86 \n", "5 Mao M B 53 \n", "10 Binissalem M B 17 \n", ".. ... ... ... ... \n", "991 Palma M B 83 \n", "993 Bellavista M B 17 \n", "994 Alaro M A 56 \n", "996 Mao M B 72 \n", "998 Palma M A 1 \n", "\n", "[517 rows x 8 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df.Genere==\"M\"]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "**********\n", "F\n", "**********\n", " Unnamed: 0 Dni Nom CP Ciutat Genere \\\n", "1 1 S3138381C Núria Quirós 7511 Ruberts F \n", "3 3 A48821615 Jordi Chaves Bustamante 7609 Bellavista F \n", "6 6 H16936148 Arlet del Rovira 7350 Binissalem F \n", "7 7 E47121991 Emma Tapia Salas 7340 Alaro F \n", "8 8 E15932486 Jana Barrios 7870 La Savina F \n", "\n", " Tipus certificat cat Punts \n", "1 A 40 \n", "3 B 40 \n", "6 A 54 \n", "7 C 60 \n", "8 A 35 \n", "------------------------------------------------------------\n", "**********\n", "M\n", "**********\n", " Unnamed: 0 Dni Nom CP Ciutat \\\n", "0 0 H61414629 María Dolores Arjona Jove 7800 Eivissa \n", "2 2 J8698188C Miguel José María Gil Vargas 7340 Alaro \n", "4 4 U0247281I Jana Rosa Collado Menéndez 7006 Palma \n", "5 5 V3761435A Antonio Murillo Suarez 7701 Mao \n", "10 10 A64863723 Nil Mínguez Hervás 7350 Binissalem \n", "\n", " Genere Tipus certificat cat Punts \n", "0 M B 73 \n", "2 M A 45 \n", "4 M B 86 \n", "5 M B 53 \n", "10 M B 17 \n", "------------------------------------------------------------\n" ] } ], "source": [ "#Cada dataframegroupby te dos elements: el id-value del grup i el df de elements d'aquest grup\n", "for identificador, group in df.groupby('Genere'): \n", " # Criteri de l'agrupacio: M o F \n", " print(\"*\"*10)\n", " print(identificador)\n", " print(\"*\"*10)\n", " # Dataframe específic\n", " print(group.head())\n", " \n", " print(\"-\"*60)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Agregacions\n", "\n", "\n", "El mètodo `aggregate` ens permet crear variables d'agregació sobre la taula obtinguda amb `groupby`.
Indicarem la informació que volem obtenir de cada columna amb un diccionari. Especifiquem la funció que aplicarem a les dades de cada grup en cada columna per a obtenir un únic valor.\n", "\n", "- Ref: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.aggregate.html\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Punts
Tipus certificat cat
A22502
B17289
C7399
D2500
\n", "
" ], "text/plain": [ " Punts\n", "Tipus certificat cat \n", "A 22502\n", "B 17289\n", "C 7399\n", "D 2500" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.groupby([\"Tipus certificat cat\"]).aggregate(\n", " { # Dictionari\n", " \"Punts\":sum # Columna : Operacion de agregació\n", " }\n", " ) # " ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Podem aplicar un gran nombre de funcions d'agregació:\n", "\n", "- [Funcions estadístiques](https://docs.scipy.org/doc/numpy/reference/routines.statistics.html): mean, std, ...\n", "\n", "- [Funcions matemàtiques](https://docs.scipy.org/doc/numpy/reference/routines.math.html): sum, prod, ...\n", "\n", "- Altres funcions: max, min, ...\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Punts
Tipus certificat cat
A99
B99
C99
D99
\n", "
" ], "text/plain": [ " Punts\n", "Tipus certificat cat \n", "A 99\n", "B 99\n", "C 99\n", "D 99" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.groupby([\"Tipus certificat cat\"]).aggregate({\"Punts\":max}) " ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Si volem incloure la mateixa columna múltiples vegades, hem de fer servir el mètode `agg`" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MaximoMinimMitj
Tipus certificat cat
A99150.115813
B99149.681034
C99148.045455
D99151.020408
\n", "
" ], "text/plain": [ " Maximo Minim Mitj\n", "Tipus certificat cat \n", "A 99 1 50.115813\n", "B 99 1 49.681034\n", "C 99 1 48.045455\n", "D 99 1 51.020408" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "## Una manera\n", "dftipus = df.groupby([\"Tipus certificat cat\"]).agg(Maximo=('Punts', max), Minim=('Punts', 'min'), Mitj=('Punts', np.mean))\n", "dftipus" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.frame.DataFrame" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(dftipus) #Nota: Una vegada feta una agregació damunt una agrupació obtenim un dataframe !!!" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MaximoMinimMitj
Tipus certificat cat
B99149.681034
C99148.045455
\n", "
" ], "text/plain": [ " Maximo Minim Mitj\n", "Tipus certificat cat \n", "B 99 1 49.681034\n", "C 99 1 48.045455" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftipus[dftipus.Mitj<50]" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1. Activitats \n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# 4.1.1 Quina estructura té aquest dataframe: shape, index, columns?\n" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# 4.1.2 Pots accedir al valor mínim del certificat \"C\"\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "# 4.1.3 Pots canviar el nom a les columnes a \"max punts\", \"min punts\" i \"mitjana punts\", respectivament." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Altres maneres de fer aggregacions" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Punts
meancountaminstd
Tipus certificat cat
A50.115813449128.518712
B49.681034348128.271817
C48.045455154127.907724
D51.02040849132.520180
\n", "
" ], "text/plain": [ " Punts \n", " mean count amin std\n", "Tipus certificat cat \n", "A 50.115813 449 1 28.518712\n", "B 49.681034 348 1 28.271817\n", "C 48.045455 154 1 27.907724\n", "D 51.020408 49 1 32.520180" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Una altra manera\n", "dftipus2 = df.groupby([\"Tipus certificat cat\"]).agg({\n", " 'Punts': ['mean', 'count', np.min, np.std], \n", " })\n", "\n", "dftipus2 " ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('Punts', 'mean'),\n", " ('Punts', 'count'),\n", " ('Punts', 'amin'),\n", " ('Punts', 'std')],\n", " )" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftipus2.columns ## ALERTA ! MULTIINDEX" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Tipus certificat cat\n", "A 50.115813\n", "B 49.681034\n", "C 48.045455\n", "D 51.020408\n", "Name: mean, dtype: float64" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Si volem accedir a una columna específica\n", "\n", "dftipus2[\"Punts\"][\"mean\"] ### Obliga a fer ús d'un conjunt seguit de \"keys\" de columnes" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Tipus certificat cat\n", "A 50.115813\n", "B 49.681034\n", "C 48.045455\n", "D 51.020408\n", "Name: (Punts, mean), dtype: float64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftipus2[(\"Punts\",\"mean\")] ### O fer us del concepte de tupla" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Agrupacions amb múltiples columnes\n", "\n", "En una agrupació amb diversos criteris, es creen totes les combinacions possibles entre els criteris donant lloc a un producte cartesià de les agregacions indicades." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0CPPunts
Tipus certificat catCiutat
AAlaro447.9298257340.00000050.789474
Ariany547.2037047529.00000052.537037
Bellavista499.1489367609.00000046.787234
Binissalem535.3750007350.00000044.325000
Eivissa504.9318187800.00000052.227273
La Savina468.1944447870.00000047.944444
Mao528.9189197701.86486547.027027
Palma471.9183677009.46938851.081633
Pedruscada475.3478267590.00000056.847826
Ruberts503.4871797511.00000049.128205
BAlaro485.3333337340.00000052.452381
Ariany496.4047627529.00000049.166667
Bellavista540.8518527609.00000044.185185
Binissalem560.9090917350.00000052.787879
Eivissa525.4848487800.00000047.393939
La Savina524.4705887870.00000044.441176
Mao497.0408167701.77551050.673469
Palma493.0625007008.62500055.375000
Pedruscada486.6060617590.00000050.484848
Ruberts391.9565227511.00000047.391304
CAlaro451.6250007340.00000039.687500
Ariany606.5833337529.00000041.833333
Bellavista503.1111117609.00000037.500000
Binissalem478.0588247350.00000067.823529
Eivissa525.7777787800.00000041.444444
La Savina508.3571437870.00000052.500000
Mao433.4285717701.85714352.428571
Palma371.3333337011.66666733.083333
Pedruscada554.0000007590.00000053.166667
Ruberts541.8235297511.00000051.235294
DAlaro203.2000007340.00000066.800000
Ariany446.6000007529.00000047.200000
Bellavista728.8333337609.00000056.333333
Binissalem470.0000007350.00000067.666667
Eivissa581.8000007800.00000035.800000
La Savina455.3333337870.00000043.000000
Mao467.0000007703.00000020.000000
Palma613.6666677008.00000030.333333
Pedruscada496.7777787590.00000066.777778
Ruberts636.8333337511.00000040.000000
\n", "
" ], "text/plain": [ " Unnamed: 0 CP Punts\n", "Tipus certificat cat Ciutat \n", "A Alaro 447.929825 7340.000000 50.789474\n", " Ariany 547.203704 7529.000000 52.537037\n", " Bellavista 499.148936 7609.000000 46.787234\n", " Binissalem 535.375000 7350.000000 44.325000\n", " Eivissa 504.931818 7800.000000 52.227273\n", " La Savina 468.194444 7870.000000 47.944444\n", " Mao 528.918919 7701.864865 47.027027\n", " Palma 471.918367 7009.469388 51.081633\n", " Pedruscada 475.347826 7590.000000 56.847826\n", " Ruberts 503.487179 7511.000000 49.128205\n", "B Alaro 485.333333 7340.000000 52.452381\n", " Ariany 496.404762 7529.000000 49.166667\n", " Bellavista 540.851852 7609.000000 44.185185\n", " Binissalem 560.909091 7350.000000 52.787879\n", " Eivissa 525.484848 7800.000000 47.393939\n", " La Savina 524.470588 7870.000000 44.441176\n", " Mao 497.040816 7701.775510 50.673469\n", " Palma 493.062500 7008.625000 55.375000\n", " Pedruscada 486.606061 7590.000000 50.484848\n", " Ruberts 391.956522 7511.000000 47.391304\n", "C Alaro 451.625000 7340.000000 39.687500\n", " Ariany 606.583333 7529.000000 41.833333\n", " Bellavista 503.111111 7609.000000 37.500000\n", " Binissalem 478.058824 7350.000000 67.823529\n", " Eivissa 525.777778 7800.000000 41.444444\n", " La Savina 508.357143 7870.000000 52.500000\n", " Mao 433.428571 7701.857143 52.428571\n", " Palma 371.333333 7011.666667 33.083333\n", " Pedruscada 554.000000 7590.000000 53.166667\n", " Ruberts 541.823529 7511.000000 51.235294\n", "D Alaro 203.200000 7340.000000 66.800000\n", " Ariany 446.600000 7529.000000 47.200000\n", " Bellavista 728.833333 7609.000000 56.333333\n", " Binissalem 470.000000 7350.000000 67.666667\n", " Eivissa 581.800000 7800.000000 35.800000\n", " La Savina 455.333333 7870.000000 43.000000\n", " Mao 467.000000 7703.000000 20.000000\n", " Palma 613.666667 7008.000000 30.333333\n", " Pedruscada 496.777778 7590.000000 66.777778\n", " Ruberts 636.833333 7511.000000 40.000000" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "dftc = df.groupby([\"Tipus certificat cat\",\"Ciutat\"]).mean(numeric_only = True)\n", "dftc" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Observacions\n", "- És necessari la mitja de \"CP\"?\n", " \n", "### 4.2 Activitat\n" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# 4.2.1 Fes que només surti la columna de \"Punts\" a l'agrupació anterior.\n" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "# 4.2.2 Pots seleccionar només les mostres de Mao." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "# 4.2.3 Com podries calcular la mida de cada grup?" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## MultIndex \n", "\n", "A vegades un índex no és suficient per expressar la meta-informació que identifica una o algunes columnes. Per exemple, una coordenada està formada per la latitud i longitud.\n", "\n", "Un `multindex` és una jerarquia d'índexs.\n", "\n", "Una agrupació amb diferents criteris origina un multiindex" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0CPPunts
Tipus certificat catCiutat
AAlaro447.9298257340.00000050.789474
Ariany547.2037047529.00000052.537037
Bellavista499.1489367609.00000046.787234
Binissalem535.3750007350.00000044.325000
Eivissa504.9318187800.00000052.227273
La Savina468.1944447870.00000047.944444
Mao528.9189197701.86486547.027027
Palma471.9183677009.46938851.081633
Pedruscada475.3478267590.00000056.847826
Ruberts503.4871797511.00000049.128205
BAlaro485.3333337340.00000052.452381
Ariany496.4047627529.00000049.166667
Bellavista540.8518527609.00000044.185185
Binissalem560.9090917350.00000052.787879
Eivissa525.4848487800.00000047.393939
La Savina524.4705887870.00000044.441176
Mao497.0408167701.77551050.673469
Palma493.0625007008.62500055.375000
Pedruscada486.6060617590.00000050.484848
Ruberts391.9565227511.00000047.391304
CAlaro451.6250007340.00000039.687500
Ariany606.5833337529.00000041.833333
Bellavista503.1111117609.00000037.500000
Binissalem478.0588247350.00000067.823529
Eivissa525.7777787800.00000041.444444
La Savina508.3571437870.00000052.500000
Mao433.4285717701.85714352.428571
Palma371.3333337011.66666733.083333
Pedruscada554.0000007590.00000053.166667
Ruberts541.8235297511.00000051.235294
DAlaro203.2000007340.00000066.800000
Ariany446.6000007529.00000047.200000
Bellavista728.8333337609.00000056.333333
Binissalem470.0000007350.00000067.666667
Eivissa581.8000007800.00000035.800000
La Savina455.3333337870.00000043.000000
Mao467.0000007703.00000020.000000
Palma613.6666677008.00000030.333333
Pedruscada496.7777787590.00000066.777778
Ruberts636.8333337511.00000040.000000
\n", "
" ], "text/plain": [ " Unnamed: 0 CP Punts\n", "Tipus certificat cat Ciutat \n", "A Alaro 447.929825 7340.000000 50.789474\n", " Ariany 547.203704 7529.000000 52.537037\n", " Bellavista 499.148936 7609.000000 46.787234\n", " Binissalem 535.375000 7350.000000 44.325000\n", " Eivissa 504.931818 7800.000000 52.227273\n", " La Savina 468.194444 7870.000000 47.944444\n", " Mao 528.918919 7701.864865 47.027027\n", " Palma 471.918367 7009.469388 51.081633\n", " Pedruscada 475.347826 7590.000000 56.847826\n", " Ruberts 503.487179 7511.000000 49.128205\n", "B Alaro 485.333333 7340.000000 52.452381\n", " Ariany 496.404762 7529.000000 49.166667\n", " Bellavista 540.851852 7609.000000 44.185185\n", " Binissalem 560.909091 7350.000000 52.787879\n", " Eivissa 525.484848 7800.000000 47.393939\n", " La Savina 524.470588 7870.000000 44.441176\n", " Mao 497.040816 7701.775510 50.673469\n", " Palma 493.062500 7008.625000 55.375000\n", " Pedruscada 486.606061 7590.000000 50.484848\n", " Ruberts 391.956522 7511.000000 47.391304\n", "C Alaro 451.625000 7340.000000 39.687500\n", " Ariany 606.583333 7529.000000 41.833333\n", " Bellavista 503.111111 7609.000000 37.500000\n", " Binissalem 478.058824 7350.000000 67.823529\n", " Eivissa 525.777778 7800.000000 41.444444\n", " La Savina 508.357143 7870.000000 52.500000\n", " Mao 433.428571 7701.857143 52.428571\n", " Palma 371.333333 7011.666667 33.083333\n", " Pedruscada 554.000000 7590.000000 53.166667\n", " Ruberts 541.823529 7511.000000 51.235294\n", "D Alaro 203.200000 7340.000000 66.800000\n", " Ariany 446.600000 7529.000000 47.200000\n", " Bellavista 728.833333 7609.000000 56.333333\n", " Binissalem 470.000000 7350.000000 67.666667\n", " Eivissa 581.800000 7800.000000 35.800000\n", " La Savina 455.333333 7870.000000 43.000000\n", " Mao 467.000000 7703.000000 20.000000\n", " Palma 613.666667 7008.000000 30.333333\n", " Pedruscada 496.777778 7590.000000 66.777778\n", " Ruberts 636.833333 7511.000000 40.000000" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftc = df.groupby([\"Tipus certificat cat\",\"Ciutat\"]).mean(numeric_only = True) # evitant el futureWarning...\n", "dftc" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('A', 'Alaro'),\n", " ('A', 'Ariany'),\n", " ('A', 'Bellavista'),\n", " ('A', 'Binissalem'),\n", " ('A', 'Eivissa'),\n", " ('A', 'La Savina'),\n", " ('A', 'Mao'),\n", " ('A', 'Palma'),\n", " ('A', 'Pedruscada'),\n", " ('A', 'Ruberts'),\n", " ('B', 'Alaro'),\n", " ('B', 'Ariany'),\n", " ('B', 'Bellavista'),\n", " ('B', 'Binissalem'),\n", " ('B', 'Eivissa'),\n", " ('B', 'La Savina'),\n", " ('B', 'Mao'),\n", " ('B', 'Palma'),\n", " ('B', 'Pedruscada'),\n", " ('B', 'Ruberts'),\n", " ('C', 'Alaro'),\n", " ('C', 'Ariany'),\n", " ('C', 'Bellavista'),\n", " ('C', 'Binissalem'),\n", " ('C', 'Eivissa'),\n", " ('C', 'La Savina'),\n", " ('C', 'Mao'),\n", " ('C', 'Palma'),\n", " ('C', 'Pedruscada'),\n", " ('C', 'Ruberts'),\n", " ('D', 'Alaro'),\n", " ('D', 'Ariany'),\n", " ('D', 'Bellavista'),\n", " ('D', 'Binissalem'),\n", " ('D', 'Eivissa'),\n", " ('D', 'La Savina'),\n", " ('D', 'Mao'),\n", " ('D', 'Palma'),\n", " ('D', 'Pedruscada'),\n", " ('D', 'Ruberts')],\n", " names=['Tipus certificat cat', 'Ciutat'])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftc.index" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0CPPunts
Ciutat
Alaro447.9298257340.00000050.789474
Ariany547.2037047529.00000052.537037
Bellavista499.1489367609.00000046.787234
Binissalem535.3750007350.00000044.325000
Eivissa504.9318187800.00000052.227273
La Savina468.1944447870.00000047.944444
Mao528.9189197701.86486547.027027
Palma471.9183677009.46938851.081633
Pedruscada475.3478267590.00000056.847826
Ruberts503.4871797511.00000049.128205
\n", "
" ], "text/plain": [ " Unnamed: 0 CP Punts\n", "Ciutat \n", "Alaro 447.929825 7340.000000 50.789474\n", "Ariany 547.203704 7529.000000 52.537037\n", "Bellavista 499.148936 7609.000000 46.787234\n", "Binissalem 535.375000 7350.000000 44.325000\n", "Eivissa 504.931818 7800.000000 52.227273\n", "La Savina 468.194444 7870.000000 47.944444\n", "Mao 528.918919 7701.864865 47.027027\n", "Palma 471.918367 7009.469388 51.081633\n", "Pedruscada 475.347826 7590.000000 56.847826\n", "Ruberts 503.487179 7511.000000 49.128205" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftc.loc[\"A\"] # primer index" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "ename": "KeyError", "evalue": "'Mao'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexes/base.py:3802\u001b[0m, in \u001b[0;36mIndex.get_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 3801\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m-> 3802\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_engine\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget_loc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcasted_key\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 3803\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/_libs/index.pyx:138\u001b[0m, in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/_libs/index.pyx:165\u001b[0m, in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n", "File \u001b[0;32mpandas/_libs/hashtable_class_helper.pxi:5745\u001b[0m, in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n", "File \u001b[0;32mpandas/_libs/hashtable_class_helper.pxi:5753\u001b[0m, in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n", "\u001b[0;31mKeyError\u001b[0m: 'Mao'", "\nThe above exception was the direct cause of the following exception:\n", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[37], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdftc\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mloc\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mMao\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;66;03m# dependent index ??? COM ES POT ACCEDIR?\u001b[39;00m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexing.py:1073\u001b[0m, in \u001b[0;36m_LocationIndexer.__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1070\u001b[0m axis \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39maxis \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;241m0\u001b[39m\n\u001b[1;32m 1072\u001b[0m maybe_callable \u001b[38;5;241m=\u001b[39m com\u001b[38;5;241m.\u001b[39mapply_if_callable(key, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj)\n\u001b[0;32m-> 1073\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_getitem_axis\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmaybe_callable\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexing.py:1312\u001b[0m, in \u001b[0;36m_LocIndexer._getitem_axis\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 1310\u001b[0m \u001b[38;5;66;03m# fall thru to straight lookup\u001b[39;00m\n\u001b[1;32m 1311\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_validate_key(key, axis)\n\u001b[0;32m-> 1312\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_label\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexing.py:1260\u001b[0m, in \u001b[0;36m_LocIndexer._get_label\u001b[0;34m(self, label, axis)\u001b[0m\n\u001b[1;32m 1258\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m_get_label\u001b[39m(\u001b[38;5;28mself\u001b[39m, label, axis: \u001b[38;5;28mint\u001b[39m):\n\u001b[1;32m 1259\u001b[0m \u001b[38;5;66;03m# GH#5567 this will fail if the label is not present in the axis.\u001b[39;00m\n\u001b[0;32m-> 1260\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mobj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mxs\u001b[49m\u001b[43m(\u001b[49m\u001b[43mlabel\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/generic.py:4049\u001b[0m, in \u001b[0;36mNDFrame.xs\u001b[0;34m(self, key, axis, level, drop_level)\u001b[0m\n\u001b[1;32m 4046\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_consolidate_inplace()\n\u001b[1;32m 4048\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(index, MultiIndex):\n\u001b[0;32m-> 4049\u001b[0m loc, new_index \u001b[38;5;241m=\u001b[39m \u001b[43mindex\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_loc_level\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 4050\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m drop_level:\n\u001b[1;32m 4051\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m lib\u001b[38;5;241m.\u001b[39mis_integer(loc):\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexes/multi.py:3160\u001b[0m, in \u001b[0;36mMultiIndex._get_loc_level\u001b[0;34m(self, key, level)\u001b[0m\n\u001b[1;32m 3158\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m indexer, maybe_mi_droplevels(indexer, ilevels)\n\u001b[1;32m 3159\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 3160\u001b[0m indexer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_level_indexer\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mlevel\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 3161\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m 3162\u001b[0m \u001b[38;5;28misinstance\u001b[39m(key, \u001b[38;5;28mstr\u001b[39m)\n\u001b[1;32m 3163\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlevels[level]\u001b[38;5;241m.\u001b[39m_supports_partial_string_indexing\n\u001b[1;32m 3164\u001b[0m ):\n\u001b[1;32m 3165\u001b[0m \u001b[38;5;66;03m# check to see if we did an exact lookup vs sliced\u001b[39;00m\n\u001b[1;32m 3166\u001b[0m check \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlevels[level]\u001b[38;5;241m.\u001b[39mget_loc(key)\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexes/multi.py:3263\u001b[0m, in \u001b[0;36mMultiIndex._get_level_indexer\u001b[0;34m(self, key, level, indexer)\u001b[0m\n\u001b[1;32m 3259\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mslice\u001b[39m(i, j, step)\n\u001b[1;32m 3261\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 3263\u001b[0m idx \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_loc_single_level_index\u001b[49m\u001b[43m(\u001b[49m\u001b[43mlevel_index\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mkey\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 3265\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m level \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m0\u001b[39m \u001b[38;5;129;01mor\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_lexsort_depth \u001b[38;5;241m==\u001b[39m \u001b[38;5;241m0\u001b[39m:\n\u001b[1;32m 3266\u001b[0m \u001b[38;5;66;03m# Desired level is not sorted\u001b[39;00m\n\u001b[1;32m 3267\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(idx, \u001b[38;5;28mslice\u001b[39m):\n\u001b[1;32m 3268\u001b[0m \u001b[38;5;66;03m# test_get_loc_partial_timestamp_multiindex\u001b[39;00m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexes/multi.py:2849\u001b[0m, in \u001b[0;36mMultiIndex._get_loc_single_level_index\u001b[0;34m(self, level_index, key)\u001b[0m\n\u001b[1;32m 2847\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m\n\u001b[1;32m 2848\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 2849\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mlevel_index\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget_loc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/.pyenv/versions/my3110/lib/python3.11/site-packages/pandas/core/indexes/base.py:3804\u001b[0m, in \u001b[0;36mIndex.get_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 3802\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_engine\u001b[38;5;241m.\u001b[39mget_loc(casted_key)\n\u001b[1;32m 3803\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n\u001b[0;32m-> 3804\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m(key) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01merr\u001b[39;00m\n\u001b[1;32m 3805\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m 3806\u001b[0m \u001b[38;5;66;03m# If we have a listlike key, _check_indexing_error will raise\u001b[39;00m\n\u001b[1;32m 3807\u001b[0m \u001b[38;5;66;03m# InvalidIndexError. Otherwise we fall through and re-raise\u001b[39;00m\n\u001b[1;32m 3808\u001b[0m \u001b[38;5;66;03m# the TypeError.\u001b[39;00m\n\u001b[1;32m 3809\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_check_indexing_error(key)\n", "\u001b[0;31mKeyError\u001b[0m: 'Mao'" ] } ], "source": [ "dftc.loc[\"Mao\"] # dependent index ??? COM ES POT ACCEDIR?" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Unnamed: 0 528.918919\n", "CP 7701.864865\n", "Punts 47.027027\n", "Name: (A, Mao), dtype: float64" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dftc.loc[(\"A\",\"Mao\")]" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0CPPunts
Tipus certificat catCiutat
AMao528.9189197701.86486547.027027
BMao497.0408167701.77551050.673469
CMao433.4285717701.85714352.428571
DMao467.0000007703.00000020.000000
\n", "
" ], "text/plain": [ " Unnamed: 0 CP Punts\n", "Tipus certificat cat Ciutat \n", "A Mao 528.918919 7701.864865 47.027027\n", "B Mao 497.040816 7701.775510 50.673469\n", "C Mao 433.428571 7701.857143 52.428571\n", "D Mao 467.000000 7703.000000 20.000000" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Una manera més elegant és:\n", "dftc.loc[pd.IndexSlice[:, 'Mao'],:] # Quins parametres n'hi ha? LOC i IndexSlice\n", "# https://pandas.pydata.org/docs/reference/api/pandas.IndexSlice.html" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Tipus certificat cat Ciutat\n", "B Mao 50.673469\n", "C Mao 52.428571\n", "Name: Punts, dtype: float64" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dfBCMao = dftc.loc[pd.IndexSlice[[\"B\",\"C\"], 'Mao'],\"Punts\"]\n", "dfBCMao" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Una altra opció és `aplanar` o `esborrar` l'índex" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Tipus certificat cat Ciutat Unnamed: 0 CP Punts\n", "0 A Alaro 447.929825 7340.000000 50.789474\n", "1 A Ariany 547.203704 7529.000000 52.537037\n", "2 A Bellavista 499.148936 7609.000000 46.787234\n", "3 A Binissalem 535.375000 7350.000000 44.325000\n", "4 A Eivissa 504.931818 7800.000000 52.227273\n", "5 A La Savina 468.194444 7870.000000 47.944444\n", "6 A Mao 528.918919 7701.864865 47.027027\n", "7 A Palma 471.918367 7009.469388 51.081633\n", "8 A Pedruscada 475.347826 7590.000000 56.847826\n", "9 A Ruberts 503.487179 7511.000000 49.128205\n", "10 B Alaro 485.333333 7340.000000 52.452381\n", "11 B Ariany 496.404762 7529.000000 49.166667\n", "12 B Bellavista 540.851852 7609.000000 44.185185\n", "13 B Binissalem 560.909091 7350.000000 52.787879\n", "14 B Eivissa 525.484848 7800.000000 47.393939\n", "15 B La Savina 524.470588 7870.000000 44.441176\n", "16 B Mao 497.040816 7701.775510 50.673469\n", "17 B Palma 493.062500 7008.625000 55.375000\n", "18 B Pedruscada 486.606061 7590.000000 50.484848\n", "19 B Ruberts 391.956522 7511.000000 47.391304\n", "20 C Alaro 451.625000 7340.000000 39.687500\n", "21 C Ariany 606.583333 7529.000000 41.833333\n", "22 C Bellavista 503.111111 7609.000000 37.500000\n", "23 C Binissalem 478.058824 7350.000000 67.823529\n", "24 C Eivissa 525.777778 7800.000000 41.444444\n", "25 C La Savina 508.357143 7870.000000 52.500000\n", "26 C Mao 433.428571 7701.857143 52.428571\n", "27 C Palma 371.333333 7011.666667 33.083333\n", "28 C Pedruscada 554.000000 7590.000000 53.166667\n", "29 C Ruberts 541.823529 7511.000000 51.235294\n", "30 D Alaro 203.200000 7340.000000 66.800000\n", "31 D Ariany 446.600000 7529.000000 47.200000\n", "32 D Bellavista 728.833333 7609.000000 56.333333\n", "33 D Binissalem 470.000000 7350.000000 67.666667\n", "34 D Eivissa 581.800000 7800.000000 35.800000\n", "35 D La Savina 455.333333 7870.000000 43.000000\n", "36 D Mao 467.000000 7703.000000 20.000000\n", "37 D Palma 613.666667 7008.000000 30.333333\n", "38 D Pedruscada 496.777778 7590.000000 66.777778\n", "39 D Ruberts 636.833333 7511.000000 40.000000\n" ] } ], "source": [ "dftc = df.groupby([\"Tipus certificat cat\",\"Ciutat\"]).mean(numeric_only = True).reset_index()\n", "print(dftc)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Tipus certificat cat Ciutat Unnamed: 0 CP Punts\n", "6 A Mao 528.918919 7701.864865 47.027027\n", "16 B Mao 497.040816 7701.775510 50.673469\n", "26 C Mao 433.428571 7701.857143 52.428571\n", "36 D Mao 467.000000 7703.000000 20.000000\n" ] } ], "source": [ "print(dftc[dftc[\"Ciutat\"]==\"Mao\"])" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6 47.027027\n", "16 50.673469\n", "26 52.428571\n", "36 20.000000\n", "Name: Punts, dtype: float64\n" ] } ], "source": [ "print(dftc[dftc[\"Ciutat\"]==\"Mao\"].Punts) # Quina informació trobeu?\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Activitat 4.3" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A) Agrupa els segënts dades per el nom de la escola.\n", "- Quina escola té més nins?\n", "- Quina escola té els nins més alts?" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
schoolclassnamedate_Of_Birthageheightweightaddress
S1s001VAlberto Franco15/05/20021217335street1
S2s002VGino Mcneill17/05/20021219232street2
S3s003VIRyan Parkes16/02/19991318633street3
S4s001VIEesha Hinton25/09/19981316730street1
S5s002VGino Mcneill11/05/20021415131street2
\n", "
" ], "text/plain": [ " school class name date_Of_Birth age height weight address\n", "S1 s001 V Alberto Franco 15/05/2002 12 173 35 street1\n", "S2 s002 V Gino Mcneill 17/05/2002 12 192 32 street2\n", "S3 s003 VI Ryan Parkes 16/02/1999 13 186 33 street3\n", "S4 s001 VI Eesha Hinton 25/09/1998 13 167 30 street1\n", "S5 s002 V Gino Mcneill 11/05/2002 14 151 31 street2" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = {\n", " 'school': ['s001', 's002', 's003', 's001', 's002', 's004'],\n", " 'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],\n", " 'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],\n", " 'date_Of_Birth': ['15/05/2002', '17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],\n", " 'age': [12, 12, 13, 13, 14, 12],\n", " 'height': [173, 192, 186, 167, 151, 159],\n", " 'weight': [35, 32, 33, 30, 31, 32],\n", " 'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']\n", "}\n", "\n", "df = pd.DataFrame(data, index=['S1', 'S2', 'S3', 'S4','S5','S6'])\n", "df.head()\n" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A) Donat el següent llistat de vendes:\n", "- Quin comprador ha gastat més?\n", "- Quin vendedor ha fet més vendes?" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Producto Precio Comprador Vendedor\n", "0 Producto 1 65 Juan Laura\n", "1 Producto 2 43 Pedro Laura\n", "2 Producto 3 30 Maria Laura\n", "3 Producto 4 41 Ana Elena\n", "4 Producto 5 29 Juan Laura\n", "5 Producto 6 46 Ana Elena\n", "6 Producto 7 54 Ana Miguel\n", "7 Producto 8 36 Juan Carlos\n", "8 Producto 9 17 Juan Elena\n", "9 Producto 10 59 Maria Elena\n", "10 Producto 11 47 Maria Elena\n", "11 Producto 12 49 Pedro Carlos\n", "12 Producto 13 65 Maria Carlos\n", "13 Producto 14 42 Juan Carlos\n", "14 Producto 15 10 Pedro Miguel\n" ] } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "# Crear datos aleatorios\n", "n = 15\n", "productos = ['Producto ' + str(i) for i in range(1, n+1)]\n", "precios = np.random.randint(10, 100, n)\n", "compradores = np.random.choice(['Juan', 'Pedro', 'Maria', 'Ana'], n)\n", "vendedores = np.random.choice(['Carlos', 'Laura', 'Miguel', 'Elena'], n)\n", "\n", "# Crear DataFrame\n", "df_ventas = pd.DataFrame({\n", " 'Producto': productos,\n", " 'Precio': precios,\n", " 'Comprador': compradores,\n", " 'Vendedor': vendedores\n", "})\n", "\n", "print(df_ventas)\n" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A) Usant el fitxer 1/data/WHO.csv, Quin és el volum total de CO2 emès per cada continent? Qui continent emet més CO2?" ] }, { "cell_type": "code", "execution_count": 198, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "B) Usant el fitxer 1/data/WHO.csv, Quants països hi ha per continent? Quin continent té més països?" ] }, { "cell_type": "code", "execution_count": 197, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "C) Usant el fitxer 1/data/presupuesto_gastos_2023.csv, Calcula la despesa total per \"SAILAREN DESKRIBAPENA_EU/DESCRIPCION DEPARTAMENT_CAS\" ? Ordena el resultats per despesa (descendent)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "D) Fitxer \"data/data_groups_cursos.csv\", El fitxer conté cursos realitzats per persones. Cada mostra correspon a un curs.\n", "Es demana obtenir un llistat de número de cursos realitzat per cada persona, una valoració de punts segons el curs de la següent manera:\n", "- 'A': 3 punts\n", "- 'B': 2 punts\n", "- 'C': 1 punt\n", "- 'D': 0.5 punt\n", "- \"E\": 0.5 punt\n", "- \"F\": 1 punt\n", " " ] }, { "cell_type": "code", "execution_count": 199, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "D.2 (amb el anterior cas D), tonar a calcula la valoració de punts considerant la combinació de cursos (independentment del seu ordre):\n", "- A, B i C : +10 punts\n", "- A i F: +3 punts\n", "- E i F: -10 punts\n", "- C i F: +5 punts" ] }, { "cell_type": "code", "execution_count": 201, "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sèries temporals" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Les sèries temporals són mostres de valors preses al llarg d'un temps amb un mostreig generalment equidistant. Per exemple, informació econòmica, demografia, meteorològica; registres de seguretat, activitat, etc.\n", "\n", "La llibreria Pandas gestiona les sèries temporals usant l'índex: una data (*datetime*,*timedelta*):\n", "\n", "L'índex d'un dataframe és el pilar bàsic d'accés als valors, per la qual cosa el seu ús simplifica processos de filtratge, selecció, interpolació, etc.\n", "\n", "Enllaç a la documentació: [TimeSeries](https://pandas.pydata.org/docs/user_guide/timeseries.html)\n", "\n", "En aquesta secció començarem a treballar amb sèries temporals, començarem inspeccionant un conjunt de dades:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
datetemperaturemintemperaturemaxprecipitationsnowfallsnowdepthavgwindspeedfastest2minwinddirfastest2minwindspeedfastest5secwinddir...drizzlesnowfreezingrainsmokehazethunderhighwindhailblowingsnowdustfreezingfog
45522019-05-2468.093.00.000.00.05.82280.014.09330.0...NoNoNoYesYesNoNoNoNoNo
45532019-05-2672.093.90.050.00.05.82240.017.00240.0...NoNoNoNoYesNoNoNoNoNo
45542019-06-0871.179.00.320.00.09.40100.018.1280.0...NoNoNoNoYesNoNoNoNoNo
45552019-05-0660.177.00.000.00.05.8240.014.0950.0...NoNoNoNoNoNoNoNoNoNo
45562019-05-1264.975.90.000.00.011.63230.021.92220.0...NoNoNoNoNoNoNoNoNoNo
\n", "

5 rows × 28 columns

\n", "
" ], "text/plain": [ " date temperaturemin temperaturemax precipitation snowfall \\\n", "4552 2019-05-24 68.0 93.0 0.00 0.0 \n", "4553 2019-05-26 72.0 93.9 0.05 0.0 \n", "4554 2019-06-08 71.1 79.0 0.32 0.0 \n", "4555 2019-05-06 60.1 77.0 0.00 0.0 \n", "4556 2019-05-12 64.9 75.9 0.00 0.0 \n", "\n", " snowdepth avgwindspeed fastest2minwinddir fastest2minwindspeed \\\n", "4552 0.0 5.82 280.0 14.09 \n", "4553 0.0 5.82 240.0 17.00 \n", "4554 0.0 9.40 100.0 18.12 \n", "4555 0.0 5.82 40.0 14.09 \n", "4556 0.0 11.63 230.0 21.92 \n", "\n", " fastest5secwinddir ... drizzle snow freezingrain smokehaze thunder \\\n", "4552 330.0 ... No No No Yes Yes \n", "4553 240.0 ... No No No No Yes \n", "4554 80.0 ... No No No No Yes \n", "4555 50.0 ... No No No No No \n", "4556 220.0 ... No No No No No \n", "\n", " highwind hail blowingsnow dust freezingfog \n", "4552 No No No No No \n", "4553 No No No No No \n", "4554 No No No No No \n", "4555 No No No No No \n", "4556 No No No No No \n", "\n", "[5 rows x 28 columns]" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.read_csv(\"data/rdu-weather-history.csv\",sep=\";\") \n", "\n", "df.tail()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['date', 'temperaturemin', 'temperaturemax', 'precipitation', 'snowfall', 'snowdepth', 'avgwindspeed', 'fastest2minwinddir', 'fastest2minwindspeed', 'fastest5secwinddir', 'fastest5secwindspeed', 'fog', 'fogheavy', 'mist', 'rain', 'fogground', 'ice', 'glaze', 'drizzle', 'snow', 'freezingrain', 'smokehaze', 'thunder', 'highwind', 'hail', 'blowingsnow', 'dust', 'freezingfog']\n" ] } ], "source": [ "print(list(df.columns))" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
datetemperaturemintemperaturemaxprecipitationsnowfallsnowdepthavgwindspeedfastest2minwinddirfastest2minwindspeedfastest5secwinddir...drizzlesnowfreezingrainsmokehazethunderhighwindhailblowingsnowdustfreezingfog
date
2015-04-082015-04-0862.184.00.000.00.05.8240.029.9730.0...NoNoNoYesNoNoNoNoNoNo
2015-04-202015-04-2063.078.10.280.00.011.86180.021.92170.0...NoNoNoNoYesNoNoNoNoNo
2015-04-262015-04-2645.054.00.020.00.05.8250.012.9740.0...NoNoNoNoNoNoNoNoNoNo
2015-04-282015-04-2839.069.10.000.00.02.6840.012.0840.0...NoNoNoNoNoNoNoNoNoNo
2015-05-032015-05-0346.979.00.000.00.02.68200.012.08210.0...NoNoNoNoNoNoNoNoNoNo
\n", "

5 rows × 28 columns

\n", "
" ], "text/plain": [ " date temperaturemin temperaturemax precipitation \\\n", "date \n", "2015-04-08 2015-04-08 62.1 84.0 0.00 \n", "2015-04-20 2015-04-20 63.0 78.1 0.28 \n", "2015-04-26 2015-04-26 45.0 54.0 0.02 \n", "2015-04-28 2015-04-28 39.0 69.1 0.00 \n", "2015-05-03 2015-05-03 46.9 79.0 0.00 \n", "\n", " snowfall snowdepth avgwindspeed fastest2minwinddir \\\n", "date \n", "2015-04-08 0.0 0.0 5.82 40.0 \n", "2015-04-20 0.0 0.0 11.86 180.0 \n", "2015-04-26 0.0 0.0 5.82 50.0 \n", "2015-04-28 0.0 0.0 2.68 40.0 \n", "2015-05-03 0.0 0.0 2.68 200.0 \n", "\n", " fastest2minwindspeed fastest5secwinddir ... drizzle snow \\\n", "date ... \n", "2015-04-08 29.97 30.0 ... No No \n", "2015-04-20 21.92 170.0 ... No No \n", "2015-04-26 12.97 40.0 ... No No \n", "2015-04-28 12.08 40.0 ... No No \n", "2015-05-03 12.08 210.0 ... No No \n", "\n", " freezingrain smokehaze thunder highwind hail blowingsnow dust \\\n", "date \n", "2015-04-08 No Yes No No No No No \n", "2015-04-20 No No Yes No No No No \n", "2015-04-26 No No No No No No No \n", "2015-04-28 No No No No No No No \n", "2015-05-03 No No No No No No No \n", "\n", " freezingfog \n", "date \n", "2015-04-08 No \n", "2015-04-20 No \n", "2015-04-26 No \n", "2015-04-28 No \n", "2015-05-03 No \n", "\n", "[5 rows x 28 columns]" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pandas import DatetimeIndex\n", "\n", "df.index = DatetimeIndex(df[\"date\"]) # una sèrie de strings\n", "## Aquesta transformació és \"una mica intel·ligent\": \"03-12-1920\" \"12-03-1920\" \"1920-12-03\"... \n", "\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Int64Index([2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2015,\n", " ...\n", " 2011, 2011, 2011, 2011, 2019, 2019, 2019, 2019, 2019, 2019],\n", " dtype='int64', name='date', length=4557)" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index.year " ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
datetemperaturemintemperaturemaxprecipitationsnowfallsnowdepthavgwindspeedfastest2minwinddirfastest2minwindspeedfastest5secwinddir...drizzlesnowfreezingrainsmokehazethunderhighwindhailblowingsnowdustfreezingfog
date
2015-12-032015-12-0339.057.90.000.00.04.25310.012.97340.0...NoNoNoNoNoNoNoNoNoNo
2015-12-042015-12-0433.154.00.000.00.02.0130.010.0730.0...NoNoNoNoNoNoNoNoNoNo
2015-12-092015-12-0942.164.00.000.00.04.70230.016.11230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-102015-12-1044.166.90.000.00.03.58220.012.08220.0...NoNoNoNoNoNoNoNoNoNo
2015-12-182015-12-1837.059.00.160.00.06.26300.016.11310.0...NoNoNoNoNoNoNoNoNoNo
2015-12-242015-12-2466.077.00.040.00.012.75220.023.04200.0...NoNoNoNoNoNoNoNoNoNo
2015-12-252015-12-2564.973.90.000.00.08.05230.017.00230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-262015-12-2663.073.90.010.00.02.91190.08.95210.0...NoNoNoNoNoNoNoNoNoNo
2015-12-022015-12-0252.064.00.040.00.07.61170.017.00180.0...NoNoNoNoNoNoNoNoNoNo
2015-12-172015-12-1751.159.00.890.00.02.01300.014.09310.0...NoNoNoNoNoNoNoNoNoNo
2015-12-192015-12-1926.248.00.000.00.06.71240.021.03240.0...NoNoNoNoNoNoNoNoNoNo
2015-12-212015-12-2136.057.00.000.00.02.01200.06.93200.0...NoNoNoNoNoNoNoNoNoNo
2015-12-222015-12-2251.164.90.660.00.04.92160.016.11180.0...NoNoNoNoNoNoNoNoNoNo
2015-12-282015-12-2851.168.00.150.00.08.9580.021.9280.0...NoNoNoNoNoNoNoNoNoNo
2015-12-012015-12-0143.053.10.040.00.02.01290.06.93150.0...NoNoNoNoNoNoNoNoNoNo
2015-12-082015-12-0836.061.00.000.00.02.24130.010.07130.0...NoNoNoNoNoNoNoNoNoNo
2015-12-112015-12-1150.071.10.000.00.06.26230.012.97210.0...NoNoNoNoNoNoNoNoNoNo
2015-12-152015-12-1548.072.00.000.00.05.59240.018.12240.0...NoNoNoNoNoNoNoNoNoNo
2015-12-162015-12-1643.064.90.000.00.04.70100.012.08120.0...NoNoNoNoNoNoNoNoNoNo
2015-12-202015-12-2027.154.00.000.00.02.01160.08.05160.0...NoNoNoNoNoNoNoNoNoNo
2015-12-232015-12-2361.073.01.680.00.09.17230.029.08220.0...NoNoNoNoYesNoNoNoNoNo
2015-12-302015-12-3063.066.91.790.00.05.14290.014.99290.0...NoNoNoNoYesNoNoNoNoNo
2015-12-122015-12-1253.175.00.000.00.04.47230.012.08230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-272015-12-2764.075.90.000.00.09.40230.021.92230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-292015-12-2953.173.90.100.00.04.25220.017.00230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-052015-12-0529.155.90.000.00.03.8050.014.9930.0...NoNoNoNoNoNoNoNoNoNo
2015-12-062015-12-0629.160.10.000.00.01.12340.06.04330.0...NoNoNoNoNoNoNoNoNoNo
2015-12-072015-12-0741.057.90.000.00.01.34220.06.93140.0...NoNoNoNoNoNoNoNoNoNo
2015-12-132015-12-1350.073.00.000.00.02.91220.010.07210.0...NoNoNoNoNoNoNoNoNoNo
2015-12-142015-12-1462.175.00.040.00.010.74230.023.04230.0...NoNoNoNoNoNoNoNoNoNo
2015-12-312015-12-3151.166.00.470.00.04.47220.014.09200.0...NoNoNoNoNoNoNoNoNoNo
\n", "

31 rows × 28 columns

\n", "
" ], "text/plain": [ " date temperaturemin temperaturemax precipitation \\\n", "date \n", "2015-12-03 2015-12-03 39.0 57.9 0.00 \n", "2015-12-04 2015-12-04 33.1 54.0 0.00 \n", "2015-12-09 2015-12-09 42.1 64.0 0.00 \n", "2015-12-10 2015-12-10 44.1 66.9 0.00 \n", "2015-12-18 2015-12-18 37.0 59.0 0.16 \n", "2015-12-24 2015-12-24 66.0 77.0 0.04 \n", "2015-12-25 2015-12-25 64.9 73.9 0.00 \n", "2015-12-26 2015-12-26 63.0 73.9 0.01 \n", "2015-12-02 2015-12-02 52.0 64.0 0.04 \n", "2015-12-17 2015-12-17 51.1 59.0 0.89 \n", "2015-12-19 2015-12-19 26.2 48.0 0.00 \n", "2015-12-21 2015-12-21 36.0 57.0 0.00 \n", "2015-12-22 2015-12-22 51.1 64.9 0.66 \n", "2015-12-28 2015-12-28 51.1 68.0 0.15 \n", "2015-12-01 2015-12-01 43.0 53.1 0.04 \n", "2015-12-08 2015-12-08 36.0 61.0 0.00 \n", "2015-12-11 2015-12-11 50.0 71.1 0.00 \n", "2015-12-15 2015-12-15 48.0 72.0 0.00 \n", "2015-12-16 2015-12-16 43.0 64.9 0.00 \n", "2015-12-20 2015-12-20 27.1 54.0 0.00 \n", "2015-12-23 2015-12-23 61.0 73.0 1.68 \n", "2015-12-30 2015-12-30 63.0 66.9 1.79 \n", "2015-12-12 2015-12-12 53.1 75.0 0.00 \n", "2015-12-27 2015-12-27 64.0 75.9 0.00 \n", "2015-12-29 2015-12-29 53.1 73.9 0.10 \n", "2015-12-05 2015-12-05 29.1 55.9 0.00 \n", "2015-12-06 2015-12-06 29.1 60.1 0.00 \n", "2015-12-07 2015-12-07 41.0 57.9 0.00 \n", "2015-12-13 2015-12-13 50.0 73.0 0.00 \n", "2015-12-14 2015-12-14 62.1 75.0 0.04 \n", "2015-12-31 2015-12-31 51.1 66.0 0.47 \n", "\n", " snowfall snowdepth avgwindspeed fastest2minwinddir \\\n", "date \n", "2015-12-03 0.0 0.0 4.25 310.0 \n", "2015-12-04 0.0 0.0 2.01 30.0 \n", "2015-12-09 0.0 0.0 4.70 230.0 \n", "2015-12-10 0.0 0.0 3.58 220.0 \n", "2015-12-18 0.0 0.0 6.26 300.0 \n", "2015-12-24 0.0 0.0 12.75 220.0 \n", "2015-12-25 0.0 0.0 8.05 230.0 \n", "2015-12-26 0.0 0.0 2.91 190.0 \n", "2015-12-02 0.0 0.0 7.61 170.0 \n", "2015-12-17 0.0 0.0 2.01 300.0 \n", "2015-12-19 0.0 0.0 6.71 240.0 \n", "2015-12-21 0.0 0.0 2.01 200.0 \n", "2015-12-22 0.0 0.0 4.92 160.0 \n", "2015-12-28 0.0 0.0 8.95 80.0 \n", "2015-12-01 0.0 0.0 2.01 290.0 \n", "2015-12-08 0.0 0.0 2.24 130.0 \n", "2015-12-11 0.0 0.0 6.26 230.0 \n", "2015-12-15 0.0 0.0 5.59 240.0 \n", "2015-12-16 0.0 0.0 4.70 100.0 \n", "2015-12-20 0.0 0.0 2.01 160.0 \n", "2015-12-23 0.0 0.0 9.17 230.0 \n", "2015-12-30 0.0 0.0 5.14 290.0 \n", "2015-12-12 0.0 0.0 4.47 230.0 \n", "2015-12-27 0.0 0.0 9.40 230.0 \n", "2015-12-29 0.0 0.0 4.25 220.0 \n", "2015-12-05 0.0 0.0 3.80 50.0 \n", "2015-12-06 0.0 0.0 1.12 340.0 \n", "2015-12-07 0.0 0.0 1.34 220.0 \n", "2015-12-13 0.0 0.0 2.91 220.0 \n", "2015-12-14 0.0 0.0 10.74 230.0 \n", "2015-12-31 0.0 0.0 4.47 220.0 \n", "\n", " fastest2minwindspeed fastest5secwinddir ... drizzle snow \\\n", "date ... \n", "2015-12-03 12.97 340.0 ... No No \n", "2015-12-04 10.07 30.0 ... No No \n", "2015-12-09 16.11 230.0 ... No No \n", "2015-12-10 12.08 220.0 ... No No \n", "2015-12-18 16.11 310.0 ... No No \n", "2015-12-24 23.04 200.0 ... No No \n", "2015-12-25 17.00 230.0 ... No No \n", "2015-12-26 8.95 210.0 ... No No \n", "2015-12-02 17.00 180.0 ... No No \n", "2015-12-17 14.09 310.0 ... No No \n", "2015-12-19 21.03 240.0 ... No No \n", "2015-12-21 6.93 200.0 ... No No \n", "2015-12-22 16.11 180.0 ... No No \n", "2015-12-28 21.92 80.0 ... No No \n", "2015-12-01 6.93 150.0 ... No No \n", "2015-12-08 10.07 130.0 ... No No \n", "2015-12-11 12.97 210.0 ... No No \n", "2015-12-15 18.12 240.0 ... No No \n", "2015-12-16 12.08 120.0 ... No No \n", "2015-12-20 8.05 160.0 ... No No \n", "2015-12-23 29.08 220.0 ... No No \n", "2015-12-30 14.99 290.0 ... No No \n", "2015-12-12 12.08 230.0 ... No No \n", "2015-12-27 21.92 230.0 ... No No \n", "2015-12-29 17.00 230.0 ... No No \n", "2015-12-05 14.99 30.0 ... No No \n", "2015-12-06 6.04 330.0 ... No No \n", "2015-12-07 6.93 140.0 ... No No \n", "2015-12-13 10.07 210.0 ... No No \n", "2015-12-14 23.04 230.0 ... No No \n", "2015-12-31 14.09 200.0 ... No No \n", "\n", " freezingrain smokehaze thunder highwind hail blowingsnow dust \\\n", "date \n", "2015-12-03 No No No No No No No \n", "2015-12-04 No No No No No No No \n", "2015-12-09 No No No No No No No \n", "2015-12-10 No No No No No No No \n", "2015-12-18 No No No No No No No \n", "2015-12-24 No No No No No No No \n", "2015-12-25 No No No No No No No \n", "2015-12-26 No No No No No No No \n", "2015-12-02 No No No No No No No \n", "2015-12-17 No No No No No No No \n", "2015-12-19 No No No No No No No \n", "2015-12-21 No No No No No No No \n", "2015-12-22 No No No No No No No \n", "2015-12-28 No No No No No No No \n", "2015-12-01 No No No No No No No \n", "2015-12-08 No No No No No No No \n", "2015-12-11 No No No No No No No \n", "2015-12-15 No No No No No No No \n", "2015-12-16 No No No No No No No \n", "2015-12-20 No No No No No No No \n", "2015-12-23 No No Yes No No No No \n", "2015-12-30 No No Yes No No No No \n", "2015-12-12 No No No No No No No \n", "2015-12-27 No No No No No No No \n", "2015-12-29 No No No No No No No \n", "2015-12-05 No No No No No No No \n", "2015-12-06 No No No No No No No \n", "2015-12-07 No No No No No No No \n", "2015-12-13 No No No No No No No \n", "2015-12-14 No No No No No No No \n", "2015-12-31 No No No No No No No \n", "\n", " freezingfog \n", "date \n", "2015-12-03 No \n", "2015-12-04 No \n", "2015-12-09 No \n", "2015-12-10 No \n", "2015-12-18 No \n", "2015-12-24 No \n", "2015-12-25 No \n", "2015-12-26 No \n", "2015-12-02 No \n", "2015-12-17 No \n", "2015-12-19 No \n", "2015-12-21 No \n", "2015-12-22 No \n", "2015-12-28 No \n", "2015-12-01 No \n", "2015-12-08 No \n", "2015-12-11 No \n", "2015-12-15 No \n", "2015-12-16 No \n", "2015-12-20 No \n", "2015-12-23 No \n", "2015-12-30 No \n", "2015-12-12 No \n", "2015-12-27 No \n", "2015-12-29 No \n", "2015-12-05 No \n", "2015-12-06 No \n", "2015-12-07 No \n", "2015-12-13 No \n", "2015-12-14 No \n", "2015-12-31 No \n", "\n", "[31 rows x 28 columns]" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# any\n", "df.loc[\"12/2015\"]" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"2015\"].temperaturemin.min() > df.loc[\"2016\"].temperaturemin.min()" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "date date\n", "2007 1 73.9\n", " 2 73.9\n", " 3 87.1\n", " 4 87.1\n", " 5 91.9\n", " ... \n", "2019 2 79.0\n", " 3 75.9\n", " 4 86.0\n", " 5 96.1\n", " 6 93.0\n", "Name: temperaturemax, Length: 150, dtype: float64\n" ] } ], "source": [ "# que estem fent aqui?\n", "dfmax = df[\"temperaturemax\"].groupby([df.index.year,df.index.month]).max()\n", "print(dfmax) #penseu l'index" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "[![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
\n", "Isaac Lera and Gabriel Moya
\n", "Universitat de les Illes Balears
\n", "isaac.lera@uib.edu, gabriel.moya@uib.edu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "my3110", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }