dplyr - Data manipulation in RStudio
- Fortune Fornax
- Jun 24, 2020
dplyr is an R package that provides a grammar of data manipulation. It makes it easier and faster to access the information in a dataset
through a small set of verbs (functions) for transforming data. Here, we discuss some of its key functions using a COVID-19 dataset:
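The dataset itself is not shown in this post. To follow along without the original file, a small stand-in can be built by hand. The column names (location, cases, deaths) are taken from the examples; the values are invented purely for illustration and are not real COVID-19 figures.

```r
library(dplyr)

# Hypothetical stand-in for the COVID_19 dataset used in this post.
# Column names follow the examples; the numbers are made up.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

COVID_19
```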
Examples
filter( )
# Keep only the rows for one country
library("dplyr")
COVID_19 %>%  # %>% is the pipe operator: it passes COVID_19 to filter()
  filter(location == "Pakistan")
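filter( ) also accepts more than one condition. The sketch below assumes a small made-up table with the same column names as the examples, so that it runs on its own; the threshold of 120 cases is arbitrary.

```r
library(dplyr)

# Hypothetical stand-in data; values are invented for illustration.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

# Conditions separated by commas are combined with AND
high_case_rows <- COVID_19 %>%
  filter(location == "Pakistan", cases > 120)

# %in% keeps rows whose value matches any entry of a vector
two_countries <- COVID_19 %>%
  filter(location %in% c("Pakistan", "India"))
```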

mutate( )
# To estimate the number of individuals still alive, subtract deaths from cases and store the result in a new column
Alive_data <- COVID_19 %>%
  mutate(Alive_individuals = cases - deaths)
Alive_data
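A single mutate( ) call can create several columns, and a later column may use one defined just before it. This is a minimal sketch on made-up data; the Alive_share column is a hypothetical addition that is not part of the original analysis.

```r
library(dplyr)

# Hypothetical stand-in data; values are invented for illustration.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

Alive_data <- COVID_19 %>%
  mutate(
    Alive_individuals = cases - deaths,
    # Uses the column created on the previous line
    Alive_share = Alive_individuals / cases
  )
```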

select( )
# Keep only three columns (location, cases and deaths) to form a new, smaller dataset
Data_location_cases_deaths <- COVID_19 %>%
  select(location, cases, deaths)
Data_location_cases_deaths
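select( ) can also drop columns and rename them while selecting. The snippet below is a sketch on made-up data; the new name country is chosen only for illustration.

```r
library(dplyr)

# Hypothetical stand-in data; values are invented for illustration.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

# A minus sign drops a column and keeps the rest
without_deaths <- COVID_19 %>%
  select(-deaths)

# new_name = old_name renames a column as it is selected
renamed <- COVID_19 %>%
  select(country = location, cases)
```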

summarize( )
# Collapse the whole dataset into a single row holding the total number of cases
Total_cases <- COVID_19 %>%
  summarize(Total_cases = sum(cases))
Total_cases
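Real-world case counts often contain missing values, and sum( ) returns NA as soon as one is present. The sketch below assumes a made-up table with one missing cases entry and shows the effect of na.rm = TRUE; whether the original dataset has missing values is not stated in this post.

```r
library(dplyr)

# Hypothetical stand-in data with one missing value; numbers are invented.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, NA, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

# Without na.rm the missing value propagates and the total is NA
total_with_na <- COVID_19 %>%
  summarize(Total_cases = sum(cases))

# na.rm = TRUE ignores missing values, giving 600 here
total_clean <- COVID_19 %>%
  summarize(Total_cases = sum(cases, na.rm = TRUE))
```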

group_by( )
# Split the data by country, then total the deaths within each group
Total_deaths <- COVID_19 %>%
  group_by(location) %>%
  summarize(Total_deaths = sum(deaths))
Total_deaths
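The verbs above can be chained into one pipeline. Here is a minimal sketch on made-up data that computes several summaries per country at once; n( ) counts the rows in each group, and the column names Rows and Total_alive are chosen only for illustration.

```r
library(dplyr)

# Hypothetical stand-in data; values are invented for illustration.
COVID_19 <- tibble(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 200, 300),
  deaths   = c(2, 3, 5, 8)
)

summary_by_location <- COVID_19 %>%
  group_by(location) %>%
  summarize(
    Total_cases  = sum(cases),
    Total_deaths = sum(deaths),
    Rows         = n()
  ) %>%
  # mutate() works on the summarized table, one row per country
  mutate(Total_alive = Total_cases - Total_deaths)

summary_by_location
```

The result has one row per country, sorted by the grouping column.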
