This guide shows one way to use the datazoom.amazonia package. As an example, we’ll combine data from PPM (Pesquisa da Pecuária Municipal, published by IBGE) and from MapBiomas (Observatório do Clima) to analyze the number of cattle heads per hectare of pasture. To do this, we can use the package’s load_ppm() and load_mapbiomas() functions, which import the data directly from the source into our R session.

To get started, install the datazoom.amazonia package from GitHub, if you have not done so already, and load it. We’ll also use the tidyverse package for data manipulation.

# install.packages("devtools")
# devtools::install_github("datazoompuc/datazoom.amazonia")
library(datazoom.amazonia)

#install.packages("tidyverse")
library(tidyverse)

The package’s functions can be listed with the command help(package = "datazoom.amazonia"). In general, their names follow the pattern load_*, where * is the name of the dataset.
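
The loaders can also be listed programmatically by filtering the package's exported names. This is a minimal sketch that assumes the package is installed (it returns an empty vector otherwise); `load_functions` is a name chosen here for illustration and is not part of the package.

```r
# Collect the exported functions whose names start with "load_".
# `load_functions` is an illustrative name, not part of the package.
load_functions <- character(0)

if (requireNamespace("datazoom.amazonia", quietly = TRUE)) {
  exported <- getNamespaceExports("datazoom.amazonia")
  load_functions <- sort(grep("^load_", exported, value = TRUE))
}

print(load_functions)
```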

LOADING DATA

We’ll start by loading land cover data from MapBiomas with the load_mapbiomas function, keeping only the observations for 2019.

data_frame_mapbiomas <- datazoom.amazonia::load_mapbiomas(dataset = "mapbiomas_cover",
                                                         cover_level = "4",
                                                         geo_level = "municipality",
                                                         raw_data = FALSE)

data_frame_mapbiomas <- data_frame_mapbiomas %>%
  filter(year == 2019)

In general, every load_* function in this package follows this calling pattern, with arguments such as dataset and raw_data.

Next, we can load the livestock inventory dataset (ppm_livestock_inventory) from PPM.

data_frame_ppm <- load_ppm(dataset = "ppm_livestock_inventory",
                           time_period = 2019,
                           geo_level = "municipality",
                           language = "pt",
                           raw_data = FALSE)
## 1 in 27 states...
## 2 in 27 states...
## 3 in 27 states...
## 4 in 27 states...
## 5 in 27 states...
## 6 in 27 states...
## 7 in 27 states...
## 8 in 27 states...
## 9 in 27 states...
## 10 in 27 states...
## 11 in 27 states...
## 12 in 27 states...
## 13 in 27 states...
## 14 in 27 states...
## 15 in 27 states...
## 16 in 27 states...
## 17 in 27 states...
## 18 in 27 states...
## 19 in 27 states...
## 20 in 27 states...
## 21 in 27 states...
## 22 in 27 states...
## 23 in 27 states...
## 24 in 27 states...
## 25 in 27 states...
## 26 in 27 states...
## 27 in 27 states...
## Download Succesfully Completed!

Below are the first rows and first five columns of the resulting MapBiomas table, followed by the same preview of the PPM table:

year  state  municipality           municipality_code  forest_formation
2019  RO     Alta Floresta D’Oeste  1100015            362622.39
2019  RO     Ariquemes              1100023            135556.17
2019  RO     Cabixi                 1100031            35705.03
2019  RO     Cacoal                 1100049            136779.98
2019  RO     Cerejeiras             1100056            76447.99
2019  RO     Colorado do Oeste      1100064            28187.34

geo_id   ano   num_v2670  num_v2675  num_v2672
1100015  2019  428976     172        5456
1100023  2019  477665     539        7140
1100031  2019  121798     15         1690
1100049  2019  432640     111        5907
1100056  2019  89884      7          1009
1100064  2019  255696     123        4023
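
A preview like this one can be produced with select() and head(). The sketch below uses a toy stand-in for data_frame_mapbiomas built from three of the MapBiomas rows shown in the preview; `mapbiomas_toy`, `preview` and the placeholder column `other_cover` are illustrative names, not part of the package output.

```r
library(dplyr)

# Toy stand-in for data_frame_mapbiomas (values copied from the preview).
mapbiomas_toy <- data.frame(
  year = c(2019, 2019, 2019),
  state = c("RO", "RO", "RO"),
  municipality = c("Alta Floresta D'Oeste", "Ariquemes", "Cabixi"),
  municipality_code = c(1100015, 1100023, 1100031),
  forest_formation = c(362622.39, 135556.17, 35705.03),
  other_cover = NA_real_  # placeholder for the remaining cover columns
)

# Keep the first five columns and the first two rows.
preview <- mapbiomas_toy %>%
  select(1:5) %>%
  head(2)

print(preview)
```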

DATA WRANGLING

Next, we’ll merge the two datasets and then do some simple data manipulation to build the variable of interest: (cattle heads) / (hectares of pasture).

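# Make sure the join keys have the same (numeric) type in both tables
data_frame_mapbiomas$municipality_code <- as.numeric(data_frame_mapbiomas$municipality_code)
data_frame_mapbiomas$year <- as.numeric(data_frame_mapbiomas$year)

data_frame_ppm$geo_id <- as.numeric(data_frame_ppm$geo_id)
data_frame_ppm$ano <- as.numeric(data_frame_ppm$ano)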

merge <- data_frame_mapbiomas %>%
  full_join(data_frame_ppm, 
            by = c('municipality_code' = 'geo_id', 'year' = 'ano')) 
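
Because a full join keeps rows that exist in only one table, it can be worth checking which municipalities failed to match. The sketch below illustrates this on toy data: the first two codes come from the preview tables, while 9999999 is a made-up code added only to produce an unmatched row. The key columns are assumed to arrive as character in one table, which is why they are converted first (dplyr joins generally refuse keys of incompatible types).

```r
library(dplyr)

# Toy stand-ins; 9999999 is a made-up code, not a real municipality.
mapbiomas_toy <- data.frame(
  municipality_code = c("1100015", "1100023", "9999999"),
  year = c("2019", "2019", "2019"),
  stringsAsFactors = FALSE
)
ppm_toy <- data.frame(
  geo_id = c(1100015, 1100023),
  ano = c(2019, 2019),
  num_v2670 = c(428976, 477665)
)

# Align the key types before joining.
mapbiomas_toy <- mapbiomas_toy %>%
  mutate(across(c(municipality_code, year), as.numeric))

keys <- c("municipality_code" = "geo_id", "year" = "ano")

joined <- full_join(mapbiomas_toy, ppm_toy, by = keys)

# Rows of the MapBiomas table with no PPM counterpart.
unmatched <- anti_join(mapbiomas_toy, ppm_toy, by = keys)

print(unmatched)
```

If `unmatched` is not empty, those municipalities will carry NA in the PPM columns after the join.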

Then, the command below generates the variable of interest:

data <- merge %>%
  mutate(n_bovino_por_area_pastagem = num_v2670/pasture) 
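
One caveat with a plain division: a municipality with zero pasture area yields Inf, which mean(..., na.rm = TRUE) does not drop. A possible guard, sketched on made-up numbers (the data frame `toy` and all its values are hypothetical), is to set the ratio to NA whenever the pasture area is missing or zero:

```r
library(dplyr)

# Made-up values for illustration only.
toy <- data.frame(
  municipality_code = c(1, 2, 3),
  num_v2670 = c(1000, 500, NA),  # cattle heads
  pasture = c(2000, 0, 300)      # pasture area in hectares
)

toy <- toy %>%
  mutate(
    n_bovino_por_area_pastagem = if_else(
      !is.na(pasture) & pasture > 0,
      num_v2670 / pasture,
      NA_real_
    )
  )

print(toy)
```

This is an optional variation, not what the command above does.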

After that, we select the variables we want in the final dataset.

data <- data %>% 
  select(municipality_code, municipality, state, year, n_bovino_por_area_pastagem) %>%
  arrange(municipality_code, year) %>%
  relocate(municipality_code, year)

APPLICATIONS

  1. We’ll now select the states of the Midwest Region (Centro-Oeste, in Portuguese) for 2019. Then we aggregate by state to compute each state’s average ratio of cattle heads per hectare of pasture.
data <- data %>%
  filter(state %in% c("GO", "MT", "MS", "DF"))

data <- data %>%
  group_by(state) %>%
  summarise(average = mean(n_bovino_por_area_pastagem, na.rm = TRUE))
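
Note that this is an unweighted mean of municipal ratios, so a small municipality counts as much as a large one. An alternative is the state-level ratio of total cattle to total pasture. The sketch below compares the two on made-up numbers; it assumes the cattle and pasture columns are still available, which is not the case after the select() step in this guide.

```r
library(dplyr)

# Made-up values for illustration only.
toy <- data.frame(
  state = c("GO", "GO", "MT", "MT"),
  num_v2670 = c(100, 900, 400, 600),  # cattle heads
  pasture = c(100, 300, 400, 200)     # pasture area in hectares
)

by_state <- toy %>%
  group_by(state) %>%
  summarise(
    # Average of the municipal ratios (each municipality weighs the same)
    mean_of_ratios = mean(num_v2670 / pasture, na.rm = TRUE),
    # Total cattle over total pasture (larger municipalities weigh more)
    ratio_of_totals = sum(num_v2670) / sum(pasture)
  )

print(by_state)
```

Which of the two is appropriate depends on the question being asked; the guide uses the first.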

We can use ggplot2, which is part of the tidyverse, to visualize the data in a bar plot.

ggplot(data, aes(x = state, y = average)) +
  geom_col(fill = "#4fdf94", colour = "#00596d") +
  xlab("State") +
  ylab("Average cattle heads per hectare of pasture")
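
To make the comparison easier to read, the bars can be sorted from highest to lowest with reorder(). The sketch below uses made-up averages (they are not results from the data) and the illustrative names `plot_data` and `state_ordered`.

```r
library(ggplot2)

# Made-up averages for illustration only; not actual results.
plot_data <- data.frame(
  state = c("GO", "MS", "MT"),
  average = c(0.9, 1.5, 1.2)
)

# Order the states by decreasing average so the tallest bar comes first.
plot_data$state_ordered <- reorder(plot_data$state, -plot_data$average)

p <- ggplot(plot_data, aes(x = state_ordered, y = average)) +
  geom_col(fill = "#4fdf94", colour = "#00596d") +
  xlab("State") +
  ylab("Average cattle heads per hectare of pasture")

print(p)
```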

On the Data Zoom Amazônia website, you can find other visualizations built from Brazilian survey data, as well as the list of datasets covered by our package.

If you need help or find any problems with the package, please contact us through GitHub.