CSDS313 Assignment 1: Introduction to Data Analysis

Use the attached COVID dataset to answer the following questions. The dataset contains information regarding COVID cases and deaths across time and countries. Information such as the population of the country is also provided.

  • How many rows and columns are in the dataset?
  • How many different countries are in the dataset? What is the earliest date recorded? What is the latest date recorded?
  • Determine the mean and standard deviation of the population of countries. Plot the distribution of population size as a histogram. Describe the distribution.
  • Determine the median and interquartile range of the cases reported across all countries on May 4th, 2020. Plot this distribution as a histogram. Describe the distribution.
  • Which country had the greatest increase in the number of cases from June 1st to July 1st (inclusive)?
  • Which country had the greatest increase in average cases per day from June 1st to July 1st (inclusive)?

  • Which country had the greatest increase in average cases per 10,000 people per day from June 1st to July 1st (inclusive)?
  • Find which day the world had the greatest number of reported cases. Find which day the world had the greatest number of reported deaths.
  • Create a time series plot of the number of daily reported cases and number of deaths for the United States from July 1st to August 1st (inclusive). Use the date on the x-axis. What trend, if any, do you notice? Give a potential explanation for this trend or lack thereof.
  • Which country in South America had the greatest percentage of its population infected with the coronavirus in April?
  • Which continent had the most cases? Which continent had the most cases per 10,000 people?

Attached file: assignment_1-hres8v.zip
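
Several of the questions above reduce to a few pandas operations. The sketch below is only an illustration on a tiny synthetic table, not a solution for the attached data: the column names (`location`, `date`, `new_cases`, `population`) and all values are hypothetical, since the real dataset's schema is not shown here. It also assumes `new_cases` holds daily counts, so that summing over a date window gives the increase over that window.

```python
import pandas as pd

# Tiny synthetic sample. Column names and values are hypothetical stand-ins
# for the real dataset, whose schema is not given in the assignment text.
df = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2020-06-01", "2020-06-15", "2020-07-01"] * 2),
    "new_cases": [10, 20, 30, 5, 50, 100],
    "population": [1_000_000] * 3 + [500_000] * 3,
})

# Rows, columns, number of countries, and date range
n_rows, n_cols = df.shape
n_countries = df["location"].nunique()
earliest, latest = df["date"].min(), df["date"].max()

# Median and interquartile range of cases reported on a single day
one_day = df.loc[df["date"] == pd.Timestamp("2020-06-15"), "new_cases"]
median = one_day.median()
iqr = one_day.quantile(0.75) - one_day.quantile(0.25)

# Total cases per country in an inclusive date window
start, end = pd.Timestamp("2020-06-01"), pd.Timestamp("2020-07-01")
window = df[df["date"].between(start, end)]  # both endpoints included
totals = window.groupby("location")["new_cases"].sum()
top_total = totals.idxmax()

# Average cases per 10,000 people per day over the same window
n_days = (end - start).days + 1  # +1 because the window is inclusive
pop = window.groupby("location")["population"].first()
per_10k_per_day = totals / n_days / pop * 10_000
top_rate = per_10k_per_day.idxmax()

# Day with the most reported cases across all countries
world_daily = df.groupby("date")["new_cases"].sum()
peak_day = world_daily.idxmax()

print(n_rows, n_cols, n_countries, earliest.date(), latest.date())
print(median, iqr, top_total, top_rate, peak_day.date())
```

For the real file, the same steps would start from `pd.read_csv(...)` with `parse_dates` set for the date column, using whatever column names the dataset actually has. If the dataset stores cumulative totals rather than daily counts, the window increase would instead be the difference between the end and start values.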