STAT823 Homework 3: Data Cleaning and Management Solved


  1. (a) Clean up the workspace using the rm() function. Use the data() function to display the built-in datasets you can access. Use the R help to learn more about the ‘longley’ dataset: ?longley.
    • Print only the records in the ‘longley’ dataset that are from the years 1947-1950: longley[longley$Year %in% 1947:1950, ].
    • Attach the dataset with attach(longley), then plot Unemployed against Year: plot(Unemployed ~ Year).
    • Change the type of plot to a line: plot(Unemployed ~ Year, type = "l").
  1. You track your commute times for two weeks and record the following (in minutes): 17 16 20 24 22 15 21 15 17 22.
    • Enter these numbers into R and find the 5-number summary.
    • You find a data entry error: the number 24 should have been 18. Using R, replace the incorrect value without re-entering the entire set of data, and find the new 5-number summary.
    • Use R to count the number of times your commute was at least 20 minutes.
    • Use R to calculate the percent of your commutes that were less than 17 minutes.
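The commute exercise can be sketched in base R as follows. This is one possible approach, not the official solution: the variable names `commute`, `n_long`, and `pct_short` are illustrative, and `fivenum()` is assumed to be the intended 5-number summary (`summary()` uses a different quartile rule and can give slightly different hinges).

```r
# Commute times in minutes, as listed in the exercise
commute <- c(17, 16, 20, 24, 22, 15, 21, 15, 17, 22)

# Tukey five-number summary: min, lower hinge, median, upper hinge, max
fivenum(commute)

# Fix the data entry error in place: 24 should have been 18
commute[commute == 24] <- 18
fivenum(commute)

# Count commutes of at least 20 minutes (TRUE counts as 1)
n_long <- sum(commute >= 20)
n_long

# Percent of commutes shorter than 17 minutes
pct_short <- mean(commute < 17) * 100
pct_short
```

Logical indexing (`commute[commute == 24]`) replaces every matching value, so it is only safe here because 24 occurs once; an index such as `commute[4] <- 18` would target a single position instead.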
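For the ‘longley’ exercise, a minimal sketch of the steps might look like the block below. It uses `%in%` for the year filter, because comparing a column to a vector with `==` relies on recycling and only works when the rows happen to line up. It also passes `data = longley` to `plot()` rather than calling `attach()`; that substitution is a choice made for this sketch, not something the assignment requires, and the name `early` is illustrative.

```r
# Clear the workspace, list built-in datasets, and open the help page
rm(list = ls())
data()
help("longley")

# Keep only the rows for 1947 through 1950
early <- longley[longley$Year %in% 1947:1950, ]
print(early)

# Scatter plot of Unemployed against Year (formula interface)
plot(Unemployed ~ Year, data = longley)

# Same plot drawn as a line
plot(Unemployed ~ Year, data = longley, type = "l")
```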
  1. Using the maltreat.dta dataset, explore the variable ethnic using tab1(ethnic). There are spelling mistakes that need to be corrected. Correct the misspelt names and create a numeric, categorical variable ethnicity. The “Jola” cleaning code for part (i) has been provided. Finish the remaining part of the code and produce the final (clean) bar chart.
    • Replace ethnic = “Jola” if the ethnic value starts with a “J”.
    • Replace ethnic = “Mandinka” if the ethnic value starts with an “M”.
    • Replace ethnic = “Serahule” if the ethnic value starts with an “S”.
    • Replace ethnic = “Wollof” if the ethnic value starts with a “W”.
library("readstata13")
library("epiDisplay")  # provides tab1()

maltreat <- read.dta13("data/maltreat.dta")

# Original ethnic (string) variable
tab1(maltreat$ethnic, col = "grey")

# convert it to a new factor variable ethnicity
maltreat$ethnicity <- as.factor(maltreat$ethnic)

# explore the levels (unclean)
levels(maltreat$ethnicity)

# clean up for Jola
levels(maltreat$ethnicity)[startsWith(levels(maltreat$ethnicity), "J")] <- "Jola"
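The remaining replacements follow the same pattern as the “Jola” step, so they can be sketched as a loop over first letters. Because maltreat.dta is not available here, the vector `ethnic` below is a made-up stand-in with hypothetical misspellings, and `corrections` and `ethnicity_code` are names chosen for this sketch. The bar chart uses base `barplot()` instead of `tab1()` so the block runs without extra packages.

```r
# Hypothetical sample standing in for maltreat$ethnic (not the real data)
ethnic <- c("Jola", "Jolla", "Mandinka", "Madinka", "Serahule",
            "Serahuli", "Wollof", "Wolof", "Jola")
ethnicity <- as.factor(ethnic)

# First letter of each level mapped to its corrected spelling
corrections <- c(J = "Jola", M = "Mandinka", S = "Serahule", W = "Wollof")

# Relabel every level starting with a given letter; assigning the same
# label to several levels merges them into one level
for (initial in names(corrections)) {
  hits <- startsWith(levels(ethnicity), initial)
  levels(ethnicity)[hits] <- corrections[[initial]]
}

levels(ethnicity)

# Numeric codes behind the cleaned factor
ethnicity_code <- as.integer(ethnicity)

# Bar chart of the cleaned variable
barplot(table(ethnicity), col = "grey", ylab = "Frequency",
        main = "Distribution of ethnicity")
```

Note that `startsWith()` is case-sensitive, so a value such as "jola" would not be caught by the "J" rule; the real data may need an extra step if lowercase entries occur.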

[Figure: bar chart titled “Distribution of maltreat$ethnic”; axis label: Frequency]
