RVA Pets

PUBLISHED ON APR 23, 2020

I recently stumbled across the RVA Open Data Portal and, when browsing through the datasets available, noticed they had one on pet licenses issued by the city. Since I’m a huge dog fan & love our pitty Nala more than most people in my life, I figured I’d splash around in the data a little bit to see what I can learn about pets in RVA. You can get the data here, although note that the most recent data is from April 2019.

First, let’s load our packages and set our plot themes/colors

knitr::opts_chunk$set(echo = TRUE, error = FALSE, warning = FALSE, message = FALSE)

library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.0     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   0.8.5
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## -- Conflicts ------------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(osmdata)
## Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(extrafont)
## Registering fonts with R
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(hrbrthemes)
library(wesanderson)
library(tidytext)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(ggtext)

theme_set(theme_ipsum())

pal <- wes_palette("Zissou1")

colors <- c("Dog" = pal[1], "Cat" = pal[3])

Next, we’ll read in the data and clean it up a little bit. In this dataset, each row represents a licensed pet in Richmond, Virginia. The dataset includes animal type (dog, cat, puppy, kitten) and the address of the owners. Whoever set up the data was also nice enough to include longitude and latitude for each address in the dataset, which means I don’t need to go out and get it. For our purposes here, I’m going to lump puppies in with dogs and kittens in with cats. I’m also going to extract the “location” column into a few separate columns. Let’s take a look at the first few entries.

pets_raw <- read_csv("C:/Users/erice/Documents/Data/Visualizations/RVA-VA Data/Data/rva_pets_2019.csv")

pets_clean <- pets_raw %>%
  clean_names() %>%
  extract(col = location_1,
          into = c("address", "zip", "lat", "long"),
          regex = "(.*)\n.*(\\d{5})\n\\((.*), (.*)\\)") %>%
  mutate(animal_type = str_replace_all(animal_type, c("Puppy" = "Dog", "Kitten" = "Cat")))

head(pets_clean) %>%
  kable(format = "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))
animal_type animal_name address zip lat long load_date
Dog Abbey 3406 Gloucester Road 23227 37.579148 -77.456489 20180627
Dog Fiby 330 Lexington Road 23226 37.570357 -77.504806 20180627
Dog Lemmy 3130 Griffin Avenue 23222 37.574964 -77.437206 20180627
Dog Clementine 3503 Park Avenue APT 1/2 23221 37.563474 -77.482083 20180627
Dog Monte 3317A Park Avenue 23221 37.562448 -77.479608 20180627
Dog Kelsey 3112 Ellwood Avenue 23221 37.554311 -77.480057 20180627

Ok, now that our data is set up, let’s see if there are more cats or dogs in the city.

pets_clean %>%
  count(animal_type) %>%
  ggplot(aes(x = n, y = animal_type)) +
  geom_col(color = pal[1], fill = pal[1]) +
  geom_text(aes(x = n-50, label = n), hjust = 1, color = "white", fontface = "bold") +
  labs(
    title = "Number of Cats vs Dogs"
  )

Alright, so, lots more dogs. Like almost 4 to 1 dogs to cats. Which is something I can get behind. I’m a firm believer in the fact that dogs are wayyy better than cats.

I’m also interested in the most common names for pets in RVA.

pets_clean %>%
  group_by(animal_type) %>%
  count(animal_name, sort = TRUE) %>%
  slice(1:15) %>%
  ungroup() %>%
  ggplot(aes(x = n, y = reorder_within(animal_name, n, animal_type))) +
    geom_col(color = pal[1], fill = pal[1]) +
    geom_text(aes(x = if_else(animal_type == "Cat", n - .25, n - 1), label = n), hjust = 1, color = "white", fontface = "bold") +
    facet_wrap(~animal_type, scales = "free") +
    scale_y_reordered() +
    labs(
      title = "Top Pet Names",
      y = NULL
    )

These seem pretty standard to me, and unfortunately, nothing is screaming “RVA” here. No “Bagels,” no “Gwars,” etc.

I also pulled out zip codes into their own column earlier, so we can take a look at which zip codes have the most dogs and cats.

pets_clean %>%
  filter(!is.na(zip)) %>%
  group_by(zip) %>%
  count(animal_type, sort = TRUE)%>%
  ungroup() %>%
  group_by(animal_type) %>%
  top_n(n = 10) %>%
  ungroup() %>%
  ggplot(aes(x = n, y = reorder_within(zip, n, animal_type))) +
    geom_col(color = pal[1], fill = pal[1]) +
    geom_text(aes(x = if_else(animal_type == "Cat", n - 1, n - 4), label = n), hjust = 1, color = "white", fontface = "bold") +
    facet_wrap(~animal_type, scales = "free") +
    scale_y_reordered() +
    labs(
      title = "Number of Pets by Zipcode",
      y = NULL
    )

Alright, so most of the pets here live in Forest Hill/generally south of the river in 23225, and another big chunk live in 23220, which covers a few neighborhoods & includes The Fan, which is probably where most of the pet action is.

And finally, since we have the latitude and longitude, I can put together a streetmap of the city showing where all of these little critters live. To do this, I’m going to grab some shape files through the OpenStreetMaps API and plot the pet datapoints on top of those.

pets_map <- st_as_sf(pets_clean %>%
                       filter(!is.na(long)), coords = c("long", "lat"),
                     crs = 4326)

get_rva_maps <- function(key, value) {
  getbb("Richmond Virginia United States") %>%
    opq() %>%
    add_osm_feature(key = key,
                    value = value) %>%
    osmdata_sf()
}

rva_streets <- get_rva_maps(key = "highway", value = c("motorway", "primary", "secondary", "tertiary"))
small_streets <- get_rva_maps(key = "highway", value = c("residential", "living_street",
                                                         "unclassified",
                                                         "service", "footway", "cycleway"))
river <- get_rva_maps(key = "waterway", value = "river")

df <- tibble(
  type = c("big_streets", "small_streets", "river"),
  lines = map(
    .x = lst(rva_streets, small_streets, river),
    .f = ~pluck(., "osm_lines")
  )
)

coords <- pluck(rva_streets, "bbox")

annotations <- tibble(
  label = c("<span style='color:#FFFFFF'><span style='color:#EBCC2A'>**Cats**</span> and <span style='color:#3B9AB2'>**Dogs**</span> in RVA</span>"),
  x = c(-77.555),
  y = c(37.605),
  hjust = c(0)
)

rva_pets <- ggplot() +
  geom_sf(data = df$lines[[1]],
          inherit.aes = FALSE,
          size = .3,
          alpha = .8, 
          color = "white") +
  #geom_sf(data = df$lines[[2]],
  #        inherit.aes = FALSE,
  #        size = .1,
  #        alpha = .6) +
  geom_sf(data = pets_map, aes(color = animal_type), alpha = .6, size = .75) +
  geom_richtext(data = annotations, aes(x = x, y = y, label = label, hjust = hjust), fill = NA, label.color = NA, 
                label.padding = grid::unit(rep(0, 4), "pt"), size = 11, family = "Bahnschrift") + 
  coord_sf(
    xlim = c(-77.55, -77.4),
    ylim = c(37.5, 37.61),
    expand = TRUE
  ) +
  theme_void() +
  scale_color_manual(
    values = colors
  ) +
  theme(
    legend.position = "none",
    plot.background = element_rect(fill = "grey10"),
    panel.background = element_rect(fill = "grey10"),
    text = element_markdown(family = "Bahnschrift")
  )