
dplyr - Data manipulation in RStudio

By Fortune Fornax

dplyr is an R package that provides a grammar of data manipulation. It makes accessing the information in a dataset easier and faster by offering a set of verbs (functions) for transforming data. Here, we discuss some of its key functions using a COVID-19 dataset:

  • filter( ) #keeps the rows that match a condition on their values

  • mutate( ) #adds new columns (variables) or modifies existing ones

  • select( ) #selects columns (variables) by name

  • summarize( ) #reduces many values to a single summary, e.g. a sum

  • group_by( ) #groups rows so that summarize( ) and other verbs work per group
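The post does not show the dataset itself, so to try the snippets that follow, a small stand-in can help. This is a minimal sketch: the data frame name COVID_19 and the columns location, cases and deaths are taken from the snippets in this post, but every number is made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the COVID_19 dataset (made-up numbers, not real data)
COVID_19 <- data.frame(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 300, 250),
  deaths   = c(2, 5, 10, 4)
)

# Inspect the column names and types before manipulating the data
str(COVID_19)
```

With a real dataset you would read the file instead, for example with read.csv( ), and the column names may differ.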


filter( )

library(dplyr)    #load the package to get the verbs and the %>% pipe

#Keep only the rows for one country
COVID_19 %>%        #%>% is the pipe operator: it passes the data on to the next function
  filter(location == "Pakistan")
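filter( ) also accepts several conditions at once, and a row is kept only if all of them are true. A minimal sketch, assuming a made-up stand-in for COVID_19 and an arbitrary threshold of 120 cases chosen only for illustration:

```r
library(dplyr)

# Hypothetical stand-in for the COVID_19 dataset (made-up numbers, not real data)
COVID_19 <- data.frame(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 300, 250),
  deaths   = c(2, 5, 10, 4)
)

# Conditions separated by commas are combined with AND:
# keep rows for Pakistan that also have more than 120 cases
Pakistan_high <- COVID_19 %>%
  filter(location == "Pakistan", cases > 120)
```

Note that filter( ) never drops columns; the result still has location, cases and deaths.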

mutate( )

#To estimate the number of individuals still alive, subtract deaths from cases and store the result in a new column
Alive_data <- COVID_19 %>%
  mutate(Alive_individuals = cases - deaths)

select( )

#Select only three columns (location, cases and deaths) to form a new dataset
Data_location_cases_deaths <- COVID_19 %>%
  select(location, cases, deaths)

summarize( )

#Add up the cases column to get a single total for the whole dataset
Total_cases <- summarize(COVID_19, Total_cases = sum(cases))

group_by( )

#Group the rows by country, then add up deaths within each group
Total_deaths <- COVID_19 %>%
  group_by(location) %>%
  summarize(Total_deaths = sum(deaths))
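The verbs are designed to be chained with the pipe, so one pipeline can select, group, summarize and then derive a new column. The sketch below is one possible combination, not the only way to do it; the data frame is a made-up stand-in and Totals_by_country is a name chosen here for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the COVID_19 dataset (made-up numbers, not real data)
COVID_19 <- data.frame(
  location = c("Pakistan", "Pakistan", "India", "India"),
  cases    = c(100, 150, 300, 250),
  deaths   = c(2, 5, 10, 4)
)

# One row per country: total cases, total deaths, and their difference
Totals_by_country <- COVID_19 %>%
  select(location, cases, deaths) %>%      # keep only the columns we need
  group_by(location) %>%                   # form one group per country
  summarize(Total_cases  = sum(cases),     # collapse each group to one row
            Total_deaths = sum(deaths)) %>%
  mutate(Alive_individuals = Total_cases - Total_deaths)
```

Because summarize( ) runs after group_by( ), the sums are computed per country rather than for the whole dataset.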
