MetroTransit Usage Patterns - Who Is and Who Isn’t Using MetroTransit?

Introduction

Who Is and Who Isn’t Using MetroTransit?

We wanted to analyze how demographic patterns correlate with public transportation usage trends because we believe that there might be areas or demographics with higher demands but less accessibility to public transit or alternately, areas with access but people are not using it as much. From here, we hope that some of our findings will help maximize efficiency in terms of reallocating resources to areas with more traffic.

As we proceeded with our research, these were the questions that helped us form a cohesive path in terms of data gathering and visualization:

What is the correlation between income and ridership in the Twin Cities?
How do college students influence ridership?

Data Collection

Because we were looking at demographics, we wanted first to have a database containing the zip codes of the areas that we are researching. Data: Zip Code Databse
- Source: http://federalgovernmentzipcodes.us
- Measures: zip codes for all states in the United States
- Limitations: some of the formatting are not uniform so we had to clean up the data. Additionally, the longitudes and latitudes of each zip codes are those at the city-level
- Dimension: 81831x20
As we looked further into trends in public transportation usage, we came across this data on the Minnesota Compass on access to workplace. Data: Access to Workplace
- Source: http://www.mncompass.org/transportation/access-to-destinations#7-12147-d (The Minnesota Compass)
- Measures: the average number of jobs reachable within 30 minutes by public transit, walking, or cycling in the Twin Cities seven-county area
- Limitations: data is from 2014; the data was formatted in a very messy way and had to be cleaned up
- Dimension: 8x2
With the above data, we wanted to see with these numbers of jobs that are accessible by public transporation, walking, or cycling, how many people actually commute to work by these means and we referenced the census for this. Data: Means of Transportation to Work by Selected Characteristics in the Twin Cities seven-county area
- Source: https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_16_1YR_S0802&prodType=table (Census Bureau)
- Measures: the number and percentage of workers who use public transportation to commute in the Twin Cities seven-region area (we are using the data for 2014)
- Limitations: data is very complicated to extract and therefore our group had to construct the data frame from scratch
- Dimension: 101x64
We also searched for a source that would have information concerning income according to zip codes to analyze the relationship between income and ridership. Data: Median Income per Household according to Zip Codes
- Source: Zip Atlas
- Measures: the median income per household
- Limitations: data needs to be purchased in order to be downloaded
- Dimension: we referenced the figures for median income from the website to construct a data frame
With the second research question, we were working with locations and in order to map them, we required their longitudes and latitudes. Data: Location Information
- Source: Google Places API
- Measures: coordinates for nearby points of interest
- Limitations: only provides 20 results per search
- Dimensions: latitude and longitude for ~74 points of interest [74x3]
We incorporated our existing data with data from MetroTransit because we are looking at how these factors might influence ridership: Data: MetroTransit Stops
- Source: MetroTransit
- Measures: location of each bus stop
- Limitations: no stop-level usage data
- Dimensions: 14919x12
Data: MetroTransit Bus Stop Boardings
- Source: MetroTransit
- Measures: the number of daily boardings for each bus stop
- Limitations: none
- Dimensions: 13115944x6

Analysis

library(urltools)
library(jsonlite)
library(readr)
library(tidyverse)
library(ggmap)
library(ggplot2)
library(lubridate)
library(geosphere)
library(tigris)
library(leaflet)
library(htmltools)
library(reshape2)
options(tigris_use_cache = TRUE)

# loading the data on daily boarding and bus stops
stop_usage <- read_csv("daily_boarding.csv")
stops <- read_csv("bus_stops.csv")

Research Question 1: What is the Correlation between Ridership and Income in Saint Paul and Minneapolis?

First, we wanted to examine the trend in public transportation usage to determine whether public transit is a popular means of transportation among the population. We then gathered this data from the Minnesota Compass on the average number of jobs reachable within 30 minutes by public transit, walking, or cycling in the Twin Cities seven-county area, and we examined it to see how it could be incorporated into our research.

# getting the data
distance_county <- read.csv("http://api.mncompass.org/embed/b12115-2.csv", 
                           header=TRUE, sep=",", row.names=NULL)

# cleaning the data
shift <- function(x, n){
  c(x[-(seq(n))], rep(NA, n))}

distance_county <- transform(distance_county, Average.number.of.jobs.reachable..br...within.30.minutes.by.public.transit.or.walking = shift(distance_county$Average.number.of.jobs.reachable..br...within.30.minutes.by.public.transit.or.walking, 1))  

distance_county <- distance_county %>%
  filter(distance_county$row.names != "Average number of jobs that can be reached from each block within 30 minutes")

distance_county <- distance_county[-c(1,2,10, 11,12,13,14,15,16,17,18,19,20,21),]

names(distance_county) <- c("county", "within30")

cleanwithin30 <- c(4441, 1908, 10897, 51618, 47213, 469, 1703)

distance_county <-
  distance_county %>%
  mutate(within30=cleanwithin30)

distance_county

##       county within30
## 1      Anoka     4441
## 2     Carver     1908
## 3     Dakota    10897
## 4   Hennepin    51618
## 5     Ramsey    47213
## 6      Scott      469
## 7 Washington     1703

With the above figures, we collected data from the Census for the corresponding regions to see how many people actually commuted to work via public transit, walking, or cycling.

# constructing a data frame

# the county names
county <- c('Anoka', 'Carver', 'Dakota', 'Hennepin', 'Ramsey', 'Scott', 'Washington')

# the number of people in each county who uses public transportation to work
people <- c(5858, 1103, 7465, 49100, 19162, 1198, 3273)

# adding the average number of jobs that can be reached from each block within 30 minutes
commute_county <- data.frame(county, people, cleanwithin30)
names(commute_county) <- c("County", "Commute by Public Transport/Walking/Biking", "Has a Job Accessible within 30 Minutes by Public Transport/Walking/Biking")

commute_county

##       County Commute by Public Transport/Walking/Biking
## 1      Anoka                                       5858
## 2     Carver                                       1103
## 3     Dakota                                       7465
## 4   Hennepin                                      49100
## 5     Ramsey                                      19162
## 6      Scott                                       1198
## 7 Washington                                       3273
##   Has a Job Accessible within 30 Minutes by Public Transport/Walking/Biking
## 1                                                                      4441
## 2                                                                      1908
## 3                                                                     10897
## 4                                                                     51618
## 5                                                                     47213
## 6                                                                       469
## 7                                                                      1703

Then, we compared these two figures for each county to see whether people were commuting to work by public transit, walking, or cycling when their workplace was accessible.

# constructing the visualization
commute.long<-melt(commute_county)

ggplot(commute.long, aes(County, value, fill=variable)) +
  geom_bar(stat="identity", position="dodge") +
  labs(x="County", y="Number of People", fill="Variable") +
  theme(axis.title=element_text(size=10)) + 
  theme(axis.text.x=element_text(size=7)) +
  theme(axis.text.y=element_text(size=8)) +
  theme(legend.title=element_text(size=8)) +
  theme(legend.text=element_text(size=7))

Here, we observed that:

With the exception of Ramsey county, these two numbers for each county did not differ too much from each other.
Three out of four counties, Dakota, Hennepin, and Ramsey, had more people with an accessible workplace than the number of people actually commuting by these methods.
Hennepin and Ransey county had significantly more traffic than the other five counties.

Consequently, we narrowed down our research to Hennepin and Ramsey county, specifically Minneapolis and Saint Paul, because with more data, we believed that we could better analyze how the population here interacts with MetroTransit. At the same time, as jobs and income are closely related, we began to investigate whether differences in incomes between areas would lead to a change in ridership.

# gathering and constructing the necessary data

# getting data on all zip codes and filter those for Minnesota
zip_code_database <- read_csv("free-zipcode-database.csv")
zip_code_database_mn <-
  zip_code_database %>%
  filter(State == "MN")

# filtering the zip codes in Saint Paul and Minneapolis and cleaning up the data
zip_code_saint_paul <-
  zip_code_database_mn %>%
  filter(City %in% c("SAINT PAUL", "W SAINT PAUL", "WEST SAINT PAUL", "N SAINT PAUL", "NORTH SAINT PAUL", "SO SAINT PAUL"), ZipCodeType == "STANDARD")

zip_code_minneapolis <-
  zip_code_database_mn %>%
  filter(City == "MINNEAPOLIS", ZipCodeType == "STANDARD")

# constructing a vector for the median income per household by zip codes in Saint Paul and Minneapolis
income_saint_paul <- c(31155, 34756, 29558, 38237, 53879, 36560, 39552, 39552, 39552, 43277, 50614, 50614, 50614, 59373, NA, 54365, 50616, 26895, 72877, 46863, 40132, 50325, 50325, 50325, 45666, 73901, 54358, 60298, 82004, 70253, 73754, 68941, 75091, 55632, 93218, NA, NA)

income_minneapolis <- c(40716, 30391, 30702, 20923, 40368, 40867, 37462, 34216, 55424,64084, 28434, 38818, 33774, 28426, 30156, 55252, 52127, 41344, 63513,47295,39526, 51885, 45598, 93481, 44538, 47145, 55685, 43970, 43511, 39620, 57197, 48370, 52128, 54887, 42487, 72361, 58483, 70011, 87302, 63068, 84451, 62347, 65043, 63662, 91431, 75029, 60226, 67617, NA, 14360, 66250)

# adding the vectors to the datasets and merging the datasets into a single data frame
zip_code_saint_paul <-
  zip_code_saint_paul %>%
  mutate(average_income_per_household=income_saint_paul)

zip_code_minneapolis <-
  zip_code_minneapolis %>%
  mutate(average_income_per_household=income_minneapolis)

income_data <- rbind(zip_code_minneapolis, zip_code_saint_paul)
head(income_data)

## # A tibble: 6 x 21
##   RecordNumber Zipcode ZipCodeType City  State LocationType   Lat  Long
##          <dbl> <chr>   <chr>       <chr> <chr> <chr>        <dbl> <dbl>
## 1        29401 55401   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## 2        29402 55402   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## 3        29403 55403   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## 4        29404 55404   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## 5        29405 55405   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## 6        29406 55406   STANDARD    MINN~ MN    PRIMARY       45.0 -93.3
## # ... with 13 more variables: Xaxis <dbl>, Yaxis <dbl>, Zaxis <dbl>,
## #   WorldRegion <lgl>, Country <chr>, LocationText <chr>, Location <chr>,
## #   Decommisioned <lgl>, TaxReturnsFiled <dbl>, EstimatedPopulation <dbl>,
## #   TotalWages <dbl>, Notes <chr>, average_income_per_household <dbl>

# getting the shapes for each zip code in Minnesota
suppressMessages(mn_zips<- zctas(state="mn"))

# joining the shapes to each zip code from the income data
mn_zips_joined <- 
  geo_join(mn_zips, income_data, by_sp="ZCTA5CE10", by_df="Zipcode", how="inner")

As MetroTransit provides coverage throughout the entire area, we directed our research towards areas with bus stops that are less frequently used than others.

# filtering the stops that had less than 10 boardings in 2016
under_used <- stop_usage %>% 
  filter(calendar_date>=as.Date("2016-01-01")&calendar_date<as.Date("2016-12-31")) %>%
  group_by(site_id) %>% summarise(count=sum(daily_boardings)) %>% 
  filter(count<10) %>%
  inner_join(stops,by=c("site_id")) %>% select(site_id,site_latitude,site_longitude,count)

# constructing the visualization
leaflet(under_used) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addCircleMarkers(data=under_used, ~site_longitude, ~site_latitude, stroke=FALSE, radius=5)

# constructing the visualization
pal <- colorBin("OrRd", domain=mn_zips_joined$average_income_per_household, 4)

leaflet(mn_zips_joined) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addPolygons(fillColor = ~pal(average_income_per_household),
              weight = 2, 
              opacity = 1,
              color = "white",
              dashArray = "3",
              fillOpacity = 0.7, highlight = highlightOptions(weight = 5, 
                                                              color = "#666", 
                                                              dashArray = "", 
                                                              fillOpacity = 0.7, 
                                                              bringToFront = TRUE)) %>%
  addLegend(pal = pal, 
            values = ~average_income_per_household, 
            opacity = 0.7, 
            position = "bottomright", 
            title="Income",
            labFormat = labelFormat(prefix = "$"))

When we visualized the two datasets, we observed a similar spatial pattern so we decided to combine the bus stops and the income maps to better compare the trend.

# constructing the visualization
pal <- colorBin("OrRd", domain = mn_zips_joined$average_income_per_household, 4)

leaflet(mn_zips_joined) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addPolygons(fillColor = ~pal(average_income_per_household),
              weight = 2, 
              opacity = 1,
              color = "white",
              dashArray = "3",
              fillOpacity = 0.7, highlight = highlightOptions(weight = 5, 
                                                              color = "#666", 
                                                              dashArray = "", 
                                                              fillOpacity = 0.7, 
                                                              bringToFront = TRUE)) %>%
  addLegend(pal = pal, 
            values = ~average_income_per_household, 
            opacity = 0.7, 
            position = "bottomright", 
            title="Income",
            labFormat = labelFormat(prefix = "$")) %>%
  addMarkers(data=under_used,~site_longitude,~site_latitude,clusterOptions=markerClusterOptions())

We observed that:

Regarding median income, the figure is lower for areas in the center of each cities and it is higher as we move further away from the center.
Similarly, regarding bus stop boardings, the center of both cities had very little stops that were frequented less than 10 times in 2016. However, the outskirts of the cities had a lot of stops that were underused.

Research Question 2: How do college students influence transit?

In order to begin our research, we made use of the Google Places API to get the locations of several colleges in the Twin Cities area.

# importing retrieved API information to a dataset

# Note, this was originally constructed using Google Places API, which has since changed # its terms of use and was no longer feasible to use for this project.

college_locations <- data.frame(
  place = c("University of Northwester St. Paul","Hamline University","Macalester College","Bell Museum of Natural History", "Walden University"),
  coords.lon = c(-93.16857,-93.16481,-93.16873,-93.23340,-93.26569),
  coords.lat = c(45.03156,44.96597,44.93976,44.97783,44.98129)
  )

head(college_locations)

##                                place coords.lon coords.lat
## 1 University of Northwester St. Paul  -93.16857   45.03156
## 2                 Hamline University  -93.16481   44.96597
## 3                 Macalester College  -93.16873   44.93976
## 4     Bell Museum of Natural History  -93.23340   44.97783
## 5                  Walden University  -93.26569   44.98129

Using the coordinates we retrieved from Google Places, we created a visualization of the average daily ridership over the course of 2016 in two areas - one around Macalester, and one around West Saint Paul. We chose these areas because the Macalester area has colleges around, whereas West Saint Paul does not. Both areas are of similar income levels.

# getting the average daily ridership around Macalester for every day in the year 2016
macalester_ridership <-
  stop_usage %>%
  filter(year(calendar_date)==2016) %>%
  inner_join(stops,by="site_id") %>%
  select(site_id, calendar_date, site_latitude, site_longitude, daily_boardings) %>%
  filter(site_latitude >= 44.92 & 
           site_latitude <= 44.96 & 
           site_longitude >= -93.19 &
           site_longitude <= -93.15) %>%
  group_by(calendar_date) %>%
  summarise(ridership=mean(daily_boardings)) %>%
  mutate(around_mac=TRUE)

# getting the average daily ridership for an area of West Saint Paul for every day in the year 2016
w_st_paul_ridership <-
  stop_usage %>%
  filter(year(calendar_date)==2016) %>%
  inner_join(stops,by="site_id") %>%
  select(site_id, calendar_date, site_latitude, site_longitude, daily_boardings) %>%
  filter(site_latitude >= 44.9 &
           site_latitude <= 44.93 &
           site_longitude >= -93.12 &
           site_longitude <= -93.08) %>%
  group_by(calendar_date) %>%
  summarise(ridership=mean(daily_boardings)) %>%
  mutate(around_mac=FALSE)

# combining the data sets - around_mac=TRUE indicates ridership at stops near Macalester
combined_ridership <-
  rbind(macalester_ridership, w_st_paul_ridership)

head(combined_ridership)

## # A tibble: 6 x 3
##   calendar_date       ridership around_mac
##   <dttm>                  <dbl> <lgl>     
## 1 2016-01-01 06:00:00      9.99 TRUE      
## 2 2016-01-02 06:00:00     14.3  TRUE      
## 3 2016-01-03 06:00:00      9.82 TRUE      
## 4 2016-01-04 06:00:00     16.7  TRUE      
## 5 2016-01-05 06:00:00     15.4  TRUE      
## 6 2016-01-06 06:00:00     14.1  TRUE

# creating a scatterplot on the average daily ridership over time - colored by location (around_mac)
ggplot(combined_ridership,aes(x=calendar_date, y=ridership, color=around_mac)) + 
  geom_point() +
  geom_smooth() +
  labs(x="Time", y="Ridership", color="Area around Macalester College")

We observed that what is important here is not so much the overall ridership numbers as the curvature of the trend lines:

There is a slight dip during the summer near Macalester, then ridership began to increase starting from the State Fair, going all the way up towards the time around winter break.
This pattern is much less pronounced in West Saint Paul, where there are no colleges nearby.

Next, we decided to take a different approach to visualizing this trend - a map. We processed the data to get the average school-year (winter) ridership and average summer ridership at each stop. We then took the ratio of summer to winter ridership, and colored each stop on the map accordingly. Purple colors on the map indicate a smaller ratio and therefore more wintertime ridership, whereas green represents a larger ratio and therefore higher summer ridership.

# creating dataset of average summer and winter ridership for each stop around Macalester
mac_seasonal_ridership <-
  stop_usage %>%
  filter(year(calendar_date)==2016) %>%
  mutate(summer=month(calendar_date) %in% c(6,7,8,9)) %>%
  group_by(site_id, summer) %>%
  summarise(boardings=mean(daily_boardings)) %>%
  inner_join(stops, by="site_id") %>%
  select(site_id, summer, boardings, site_latitude, site_longitude) %>%
  filter(site_latitude >= 44.92 &
           site_latitude <= 44.96 &
           site_longitude >= -93.19 &
           site_longitude <= -93.15) %>%
  spread(summer,boardings) %>%
  mutate(winter=`FALSE`, summer=`TRUE`) %>%
  select(site_id, winter, summer, site_latitude, site_longitude) %>%
  mutate(diff=summer/winter)

head(mac_seasonal_ridership)

## # A tibble: 6 x 6
## # Groups:   site_id [6]
##   site_id winter summer site_latitude site_longitude  diff
##     <dbl>  <dbl>  <dbl>         <dbl>          <dbl> <dbl>
## 1    3083  0.861  0.869          44.9          -93.2 1.01 
## 2    3084 13.2   11.8            44.9          -93.2 0.892
## 3    3085 10.8   10.1            44.9          -93.2 0.932
## 4    3086  2.20   1.25           44.9          -93.2 0.571
## 5    3087  2.45   2.23           44.9          -93.2 0.911
## 6    3088  1.43   0.836          44.9          -93.2 0.586

# defining a color bin to color the map by
pal2 <- colorBin("PRGn", domain=mac_seasonal_ridership$diff, 6)

# creating the visualization
leaflet(mac_seasonal_ridership) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addCircleMarkers(~site_longitude, ~site_latitude, color= ~pal2(diff)) %>%
  addMarkers(data=college_locations, ~coords.lon, ~coords.lat, popup = ~htmlEscape(place)) %>%
  addLegend(pal=pal2, 
            values=~diff, 
            opacity=0.7, 
            title="Purple=Winter, Green=Summer", 
            position="bottomright") %>%
  setView(-93.16873,44.93976,13)

We noticed that:

There is a cluster of purple circles around the colleges (especially Macalester College) and throughout Snelling Avenue, suggesting an increase in the usage of public transit during school time, which we believe might be influenced by college students.
In the northern area around Snelling and near University Avenue, the same phenomenon is observed. In this area there are strip malls with a variety of retail and fast food chains as well as access to the light rail and University Avenue which makes us believe that the purple circles could again be caused by college students going to these locations when school is in session to shop, eat or access the light rail.

This image of the Macalester area does not mean much unless we compare it to other regions. First, we looked at West Saint Paul, an area where there are not any colleges around.

# creating a dataset for the average summer and winter ridership for each stop in West Saint Paul
w_st_paul_seasonal_ridership <-
  stop_usage %>%
  filter(year(calendar_date) == 2016) %>%
  mutate(summer=month(calendar_date) %in% c(6, 7, 8, 9)) %>%
  group_by(site_id, summer) %>%
  summarise(boardings=mean(daily_boardings)) %>%
  inner_join(stops, by="site_id") %>%
  select(site_id, summer, boardings, site_latitude, site_longitude) %>%
  filter(site_latitude >= 44.9 &
           site_latitude <= 44.93 &
           site_longitude >= -93.12 &
           site_longitude <= -93.08) %>%
  spread(summer,boardings) %>%
  mutate(winter=`FALSE`, summer=`TRUE`) %>%
  select(site_id, winter, summer, site_latitude, site_longitude) %>%
  mutate(diff=summer/winter)

#creating the visualization
pal4 <- colorBin("PRGn", domain = w_st_paul_seasonal_ridership$diff, 5)

leaflet(w_st_paul_seasonal_ridership) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addCircleMarkers(~site_longitude, ~site_latitude, color= ~pal4(diff)) %>%
  addMarkers(data=college_locations, ~coords.lon, ~coords.lat, popup = ~htmlEscape(place)) %>%
  addLegend(pal=pal4, 
            values=~diff, 
            opacity=0.7, 
            title="Purple=Winter, Green=Summer", 
            position="bottomright") %>%
  setView(-93.1016,44.9161,13)

The overall trend here is that ridership is skewed towards the school year, but compared to the area where there are colleges and univerities, we are seeing much lighter colors and even some green ones indicative of higher summer ridership.

Next we looked at Roseville, in the region south of the University of Northwestern Saint Paul.

# creating the dataset for the average summer and winter ridership for each stop in Roseville
roseville_seasonal_ridership <-
  stop_usage %>%
  filter(year(calendar_date)==2016) %>%
  mutate(summer=month(calendar_date) %in% c(6, 7, 8, 9)) %>%
  group_by(site_id, summer) %>%
  summarise(boardings=mean(daily_boardings)) %>%
  inner_join(stops, by="site_id") %>%
  select(site_id, summer, boardings, site_latitude, site_longitude) %>%
  filter(site_latitude >= 44.99 &
           site_latitude <= 45.03 &
           site_longitude >= -93.18 &
           site_longitude <= -93.13) %>%
  spread(summer, boardings) %>%
  mutate(winter=`FALSE`, summer=`TRUE`) %>%
  select(site_id, winter, summer, site_latitude, site_longitude) %>%
  mutate(diff=summer/winter)

# creating the visualization
pal3 <- colorBin("PRGn", domain=roseville_seasonal_ridership$diff, 5)

leaflet(roseville_seasonal_ridership) %>%
  addTiles('http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>%
  addProviderTiles(provider=providers$CartoDB.Positron) %>%
  addCircleMarkers(~site_longitude, ~site_latitude, color= ~pal3(diff)) %>%
  addMarkers(data=college_locations, ~coords.lon, ~coords.lat, popup=~htmlEscape(place)) %>%
  addLegend(pal=pal3, 
            values=~diff, 
            opacity=0.7, 
            title="Purple=Winter,Green=Summer", 
            position="bottomright") %>%
  setView(-93.1566,45.0061,13)

We can see that the stops closest to the university are used a lot during the school year, while others have a more even usage pattern. It is notable that the University of Northwestern Saint Paul’s closest bus stop is relatively far from the campus even though they are likely users of the transit system. This might be one of the few areas the MetroTransit needs to add a stop to.

Conclusions

In the course of our research, we kept in mind several factors that might be external influences to how people use public transit but some of the observations that we drew from our research are:

Higher income areas tend to use public transport less than lower income areas.
There is an increase in bus stop boardings surrounding colleges during the school period.