Exercise
Pivot Tables in Google Sheets
Summary
1. Module: Understanding Data
1
2. Objective: Learn about summarising data using pivot tables in Google Sheets
3. Time Allotment: 2 hours
Steps
Conceptual Exercise
Let’s start off by looking at the data table we prepared:
Participant’s
Name
Organisation
Type of
Organisation
(Media, CSO,
Academic, Think
Tank, Political)
Number of rows
in your dataset
Number of
columns in your
dataset
Maung Than
Chaung
MIMU
CSO
3000
10
Ma Thay Mhyin
7Day
Media
300
20
...
...
...
...
...
Let’s say we want to find out how many participants were from CSOs. In the previous exercise, we
looked at how Google Sheets’ filter function can be used to filter just for CSO participants and
then we use COUNT to calculate how many participants are from CSOs. Now let’s say we want to
1
This lesson was adapted from the World Bank's Introduction to Data Literacy training manual by Eva Constantaras,
and adapted by Yan Naung Oak, Open Development Cambodia and Open Development Initiative, and is licensed
under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. For full terms of use,
see here.
1
repeat this for Media, Think Tanks and other types of organisations. If we filter for each type one
at a time, it can get very tedious and time consuming.
Wouldn’t it be nice if there was a to automatically calculate the number of participants from each
type of organisation, so that we can get the results in a table like below:
Type of Organisation
Count of participants
Media
5
CSO
8
Think Tank
4
The answer is “Yes!”. We can use the pivot table function in Google Sheets and other
spreadsheet software to get these kinds of useful summary calculations and much more.
For example, instead of counting the number of participants, we can automatically calculate the
sum of rows of data in the datasets for each type of organisation:
Type of Organisation
Total number of rows
Media
5,000
CSO
30,000
Think Tank
9,000
Pivot tables are one of the most useful features for data analysis.
First attempt to practice using the Pivot Table with this dataset of conference attendees.
1. Count of “participants” from each country
2. Count of attendance by country in August 2019
2
Main Exercise
Next, let’s move on to doing some data analysis on a familiar dataset. In this exercise, we will be
analysing data from the Hydropower Dams in Cambodia dataset from Open Development
Cambodia. We have prepared a more ready to use version of this dataset and it has been
uploaded to Google Sheets here. We will be looking at the questions below:
1. For each type of status (operational, potential site, under study, etc), calculate their
total combined capacity.
2. What is the total capacity for small scale projects?
3. What is the median capacity for all the projects?
4. What is the total capacity for large scale projects that are invested by Korea?
5. What province has the highest number of potential sites?
6. What province will have the largest total capacity if all the projects are completed?
7. What is the total capacity of projects that are operational by 2013 in Koh Kong
province?
Steps to Follow
Click on the top left corner of the spreadsheet table in between column A and row 1. This will
select the entire sheet.
3
Now, select Data -> Pivot table from the top menu bar.
4
Next click on “Create” in the pop-up that appears to create a new pivot table in a separate sheet.
5
On the new pivot table sheet, there is a panel on the right that lets you choose rows and columns
and filters to summarize by.
6
The first question asks “For each type of status (operational, potential site, under study, etc),
calculate their total combined capacity.
In order to answer this, we want to summarise the data according to the “status” column. In the
“Rows” section of the panel on the right, click on “Add”. Choose “status, and you will see that the
row labels in the pivot table will be populated with the various values of the status column.
7
Next, in the values section, click “Add” and choose “capacity”. By default, the “summarize by” will
be SUM, which means for each cell in the pivot table, it will sum the capacity values of all the
rows in the original data associated with the appropriate value for “status”, as shown below.
8
The next question asks “What is the total capacity for small scale projects?”
Similar to the previous question, we go back to our dataset, create a new pivot table, and choose
size_scale” for rows, and “capacity” for values.
9
The next question asks “The next question asks “What is the median capacity for all the
projects?”
Here, we do not need to choose any rows in the Pivot table editor panel on the right. We can go
straights to add Values. We choose “capacity” in the “Values” section and change the
“Summarize by” to MEDIAN.
10
The next question asks “What is the total capacity for large scale projects that are invested by
Korea?”
In this case, we want to choose both Rows and Columns to summarize with across two variables.
Choose “size_scale” for Rows, “country” for columns, and “capacity” for Values.
11
The next question asks “What province has the highest number of potential sites?”
Here we choose “province” for Rows, and “capacity” for Values, and change the “Summarize by”
to COUNTA
12
The next question asks “What province will have the largest total capacity if all the projects are
completed?”
Here we choose “province” for Rows, and “capacity” for Values, and change the “Summarize by”
to SUM
13
The next question asks “What is the total capacity of projects that are operational by 2013 in Koh
Kong province?”
Again we have to choose both Rows and Columns here to summarize with across two variables.
Choose “province” for Rows, “year” for columns, and “capacity” for Values.
14
Further Practice
Now we will practice the same skills for a dataset of private schools in Cambodia.
Do the following:
1. How many schools of each kind are there in each province?
2. Which provinces have schools with the highest median number of students?
3. Which provinces have schools with the highest median percentage of female students?
4. Which type of school has the highest median percentage of female sta?
5. Which province has schools with the lowest median student to staff ratio?
15