{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", " \n", "___" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# SF Salaries Exercise \n", "\n", "Welcome to a quick exercise for you to practice your pandas skills! We will be using the [SF Salaries Dataset](https://www.kaggle.com/kaggle/sf-salaries) from Kaggle! Just follow along and complete the tasks outlined in bold below. The tasks will get harder and harder as you go along." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** Import pandas as pd.**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** Read Salaries.csv as a dataframe called sal.**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** Check the head of the DataFrame. **" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
01NATHANIEL FORDGENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY167411.180.00400184.25NaN567595.43567595.432011NaNSan FranciscoNaN
12GARY JIMENEZCAPTAIN III (POLICE DEPARTMENT)155966.02245131.88137811.38NaN538909.28538909.282011NaNSan FranciscoNaN
23ALBERT PARDINICAPTAIN III (POLICE DEPARTMENT)212739.13106088.1816452.60NaN335279.91335279.912011NaNSan FranciscoNaN
34CHRISTOPHER CHONGWIRE ROPE CABLE MAINTENANCE MECHANIC77916.0056120.71198306.90NaN332343.61332343.612011NaNSan FranciscoNaN
45PATRICK GARDNERDEPUTY CHIEF OF DEPARTMENT,(FIRE DEPARTMENT)134401.609737.00182234.59NaN326373.19326373.192011NaNSan FranciscoNaN
\n", "
" ], "text/plain": [ " Id EmployeeName JobTitle \\\n", "0 1 NATHANIEL FORD GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY \n", "1 2 GARY JIMENEZ CAPTAIN III (POLICE DEPARTMENT) \n", "2 3 ALBERT PARDINI CAPTAIN III (POLICE DEPARTMENT) \n", "3 4 CHRISTOPHER CHONG WIRE ROPE CABLE MAINTENANCE MECHANIC \n", "4 5 PATRICK GARDNER DEPUTY CHIEF OF DEPARTMENT,(FIRE DEPARTMENT) \n", "\n", " BasePay OvertimePay OtherPay Benefits TotalPay TotalPayBenefits \\\n", "0 167411.18 0.00 400184.25 NaN 567595.43 567595.43 \n", "1 155966.02 245131.88 137811.38 NaN 538909.28 538909.28 \n", "2 212739.13 106088.18 16452.60 NaN 335279.91 335279.91 \n", "3 77916.00 56120.71 198306.90 NaN 332343.61 332343.61 \n", "4 134401.60 9737.00 182234.59 NaN 326373.19 326373.19 \n", "\n", " Year Notes Agency Status \n", "0 2011 NaN San Francisco NaN \n", "1 2011 NaN San Francisco NaN \n", "2 2011 NaN San Francisco NaN \n", "3 2011 NaN San Francisco NaN \n", "4 2011 NaN San Francisco NaN " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** Use the .info() method to find out how many entries there are.**" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 148654 entries, 0 to 148653\n", "Data columns (total 13 columns):\n", "Id 148654 non-null int64\n", "EmployeeName 148654 non-null object\n", "JobTitle 148654 non-null object\n", "BasePay 148045 non-null float64\n", "OvertimePay 148650 non-null float64\n", "OtherPay 148650 non-null float64\n", "Benefits 112491 non-null float64\n", "TotalPay 148654 non-null float64\n", "TotalPayBenefits 148654 non-null float64\n", "Year 148654 non-null int64\n", "Notes 0 non-null float64\n", "Agency 148654 non-null object\n", "Status 0 non-null float64\n", "dtypes: float64(8), int64(2), object(3)\n", "memory usage: 14.7+ MB\n" ] } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**What is the average BasePay ?**" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "66325.44884050643" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What is the highest amount of OvertimePay in the dataset ? **" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "245131.88" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What is the job title of JOSEPH DRISCOLL ? Note: Use all caps, otherwise you may get an answer that doesn't match up (there is also a lowercase Joseph Driscoll). **" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "24 CAPTAIN, FIRE SUPPRESSION\n", "Name: JobTitle, dtype: object" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** How much does JOSEPH DRISCOLL make (including benefits)? **" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "24 270324.91\n", "Name: TotalPayBenefits, dtype: float64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What is the name of highest paid person (including benefits)?**" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
01NATHANIEL FORDGENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY167411.180.0400184.25NaN567595.43567595.432011NaNSan FranciscoNaN
\n", "
" ], "text/plain": [ " Id EmployeeName JobTitle \\\n", "0 1 NATHANIEL FORD GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY \n", "\n", " BasePay OvertimePay OtherPay Benefits TotalPay TotalPayBenefits \\\n", "0 167411.18 0.0 400184.25 NaN 567595.43 567595.43 \n", "\n", " Year Notes Agency Status \n", "0 2011 NaN San Francisco NaN " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What is the name of lowest paid person (including benefits)? Do you notice something strange about how much he or she is paid?**" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IdEmployeeNameJobTitleBasePayOvertimePayOtherPayBenefitsTotalPayTotalPayBenefitsYearNotesAgencyStatus
148653148654Joe LopezCounselor, Log Cabin Ranch0.00.0-618.130.0-618.13-618.132014NaNSan FranciscoNaN
\n", "
" ], "text/plain": [ " Id EmployeeName JobTitle BasePay OvertimePay \\\n", "148653 148654 Joe Lopez Counselor, Log Cabin Ranch 0.0 0.0 \n", "\n", " OtherPay Benefits TotalPay TotalPayBenefits Year Notes \\\n", "148653 -618.13 0.0 -618.13 -618.13 2014 NaN \n", "\n", " Agency Status \n", "148653 San Francisco NaN " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What was the average (mean) BasePay of all employees per year? (2011-2014) ? **" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Year\n", "2011 63595.956517\n", "2012 65436.406857\n", "2013 69630.030216\n", "2014 66564.421924\n", "Name: BasePay, dtype: float64" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** How many unique job titles are there? **" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2159" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** What are the top 5 most common jobs? **" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Transit Operator 7036\n", "Special Nurse 4389\n", "Registered Nurse 3736\n", "Public Svc Aide-Public Works 2518\n", "Police Officer 3 2421\n", "Name: JobTitle, dtype: int64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** How many Job Titles were represented by only one person in 2013? (e.g. Job Titles with only one occurence in 2013?) **" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "202" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** How many people have the word Chief in their job title? (This is pretty tricky) **" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "477" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** Bonus: Is there a correlation between length of the Job Title string and Salary? **" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
title_lenTotalPayBenefits
title_len1.000000-0.036878
TotalPayBenefits-0.0368781.000000
\n", "
" ], "text/plain": [ " title_len TotalPayBenefits\n", "title_len 1.000000 -0.036878\n", "TotalPayBenefits -0.036878 1.000000" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Great Job!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }