STA130H1F – Week 10 Problem Set

Instructions

How do I hand in these problems for the 11:59 a.m. ET, November 27th deadline?

The complete .Rmd file you create for this problem set AND the resulting .pdf (i.e., the one you ‘Knit to PDF’ from your .Rmd file) must be uploaded to a Quercus assignment (link: https://q.utoronto.ca/courses/184002/assignments/465532) by 11:59 a.m. ET on November 27th. Late problem sets or problems submitted another way (e.g., by email) are not accepted.

Problem set grading

There are two parts to your problem set. One is largely R-based with short written answers and the other is more focused on writing. We recommend you use word-processing software such as Microsoft Word to check for grammar errors in your written work. Note: there can be issues copying from Word to R Markdown, so it may be easier to write in this file first and then copy the text to Word. Then you can make any changes flagged in Word directly in this file.

Part 1

Question 1

Using data from the Gallup World Poll (and the World Happiness Report), we are interested in predicting which factors influence life expectancy around the world. These data are in the file happinessdata_2017.csv.

library(tidyverse)  # provides read_csv()
happiness2017 <- read_csv("happinessdata_2017.csv")

(a) Begin by creating a new variable called life_exp_category which takes the value “Good” for countries with a life expectancy higher than 65 years, and “Poor” otherwise.
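As a minimal sketch of the kind of recoding described in (a), the example below builds a two-level category from a numeric column. The toy data and the column name `life_exp` are assumptions made for illustration; the actual column name in happinessdata_2017.csv is not stated here and may differ.

```r
library(dplyr)

# Hypothetical toy data; `life_exp` is an assumed column name.
toy <- tibble(
  country  = c("A", "B", "C", "D"),
  life_exp = c(72.1, 65.0, 58.3, 66.4)
)

# Strictly higher than 65 is "Good"; a value of exactly 65 falls into "Poor".
toy <- toy %>%
  mutate(life_exp_category = ifelse(life_exp > 65, "Good", "Poor"))

toy$life_exp_category
```

Note that `ifelse()` returns `NA` for rows where the life expectancy is missing, so such rows would need separate handling.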

(b) Divide the data into training (80%) and testing (20%) datasets. Build a classification tree using the training data to predict which countries have Good vs Poor life expectancy, using only the social_support variable as a predictor. Use the last 3 digits of your student ID number for the random seed.
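One possible way to carry out an 80/20 split and fit a single-predictor tree is sketched below. It assumes the `rpart` package is used for classification trees, which this problem set does not confirm, and it runs on simulated stand-in data rather than on happiness2017. The seed 123 is a placeholder for the last 3 digits of a student ID.

```r
library(rpart)

set.seed(123)  # placeholder seed; the problem asks for the last 3 digits of your student ID

# Hypothetical simulated data standing in for happiness2017.
n <- 100
social_support <- runif(n, 0.3, 1)
life_exp_category <- factor(
  ifelse(social_support + rnorm(n, sd = 0.1) > 0.7, "Good", "Poor")
)
toy <- data.frame(social_support, life_exp_category)

# Randomly pick 80% of the row indices for training; the remaining rows form the test set.
n_train   <- round(0.8 * n)
train_ids <- sample(1:n, size = n_train)
train <- toy[train_ids, ]
test  <- toy[-train_ids, ]

# Fit a classification tree with social_support as the only predictor.
tree1 <- rpart(life_exp_category ~ social_support, data = train, method = "class")

# Predicted classes for the held-out rows.
pred <- predict(tree1, newdata = test, type = "class")
```

Setting the seed before `sample()` makes the split reproducible, which is why the same `train` and `test` objects can be reused for a second tree.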

(c) Use the same training dataset created in (b) to build a second classification tree to predict which countries have good vs poor life expectancy, using logGDP, social_support, freedom, and generosity as potential predictors.

(d) Use the testing dataset you created in (b) to calculate the confusion matrix for the trees you built in (b) and (c). Report the sensitivity (true positive rate), specificity (true negative rate) and accuracy for each of the trees. Here you will treat “Good” life expectancy as a positive response/prediction.
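The three measures in (d) can be read off a confusion matrix as in the sketch below. The label vectors are made up for illustration and are not results from the happiness data; “Good” is treated as the positive class, as the question specifies.

```r
# Hypothetical true and predicted labels for 10 test-set countries.
actual    <- c("Good", "Good", "Good", "Good", "Good", "Good",
               "Poor", "Poor", "Poor", "Poor")
predicted <- c("Good", "Good", "Good", "Good", "Poor", "Poor",
               "Poor", "Poor", "Poor", "Good")

# Rows are predictions, columns are the truth.
conf_mat <- table(predicted, actual)

tp <- conf_mat["Good", "Good"]  # predicted Good, actually Good
fn <- conf_mat["Poor", "Good"]  # predicted Poor, actually Good
fp <- conf_mat["Good", "Poor"]  # predicted Good, actually Poor
tn <- conf_mat["Poor", "Poor"]  # predicted Poor, actually Poor

sensitivity <- tp / (tp + fn)             # true positive rate
specificity <- tn / (tn + fp)             # true negative rate
accuracy    <- (tp + tn) / sum(conf_mat)  # share of correct predictions
```

With these made-up labels, sensitivity is 4/6, specificity is 3/4, and accuracy is 7/10.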

Question 2

Two classification trees were built to predict which individuals have a disease using different sets of potential predictors. We use each of these trees to predict disease status for 100 new individuals. Below are confusion matrices corresponding to these two classification trees.

a) Calculate the accuracy, false-positive rate, and false-negative rate for each classification tree. Here, a “positive” result means we predict an individual has the disease and a “negative” result means we predict they do not.

Tree A

• Overall accuracy:

• False-positive rate:

• False-negative rate:

Tree B

• Overall accuracy:

• False-positive rate:

• False-negative rate:
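For reference, the three quantities in a) follow the standard definitions below, where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives in a confusion matrix. This notation is introduced here for illustration and does not appear in the problem set.

```latex
\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{False-positive rate} = \frac{FP}{FP + TN}, \qquad
\text{False-negative rate} = \frac{FN}{FN + TP}
\]
```

The false-positive rate is taken over individuals who truly do not have the disease, and the false-negative rate over individuals who truly have it.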

b) Suppose the disease is very serious if untreated. Explain which classifier you would prefer to use.

Question 3

Data was collected on 30 cancer patients to investigate the effectiveness (Yes/No) of a treatment. Two quantitative variables, x_i ∈ (0, 1), i = 1, 2, are considered to be important predictors of effectiveness. Suppose that the rectangles labelled as nodes in the scatterplot below represent nodes of a classification tree.

Part 2

You are working at a news station and your boss comes running into your office and asks whether you have seen the most recent report. It turns out the lost city of Atlantis has been found! The first thing they do is start to collect information on the city.

The country completed the Gallup World Poll and the following data was collected:

For some reason, in the process of collecting the data the life expectancy variable was missed. Now everyone is curious, is the life expectancy of the citizens good or poor? Given your recent work on the Gallup World Poll and the World Happiness Report, your boss is looking to you to provide some insights. Your boss has asked you to prepare a report with the following information:

• A little bit of background about the data you have used

• A description of the methods. Include at least 2 vocabulary words. Remember to explain in lay terms (i.e., explain it in a way that somebody not familiar with our field and specialized language can understand).

• The results

• Include at least one figure, and ensure that it is properly labeled (e.g. it should have informative titles and axes)

• A conclusion that provides a key take-home message

• Any limitations in your findings

Other things to consider:

• Try not to spend more than 20 minutes on the prompt.

• Aim for more than 200 but fewer than 500 words.

• Use full sentences.

• Grammar is not the main focus of this assessment, but it is important that you communicate in a clear and professional manner (i.e., no slang or emojis should appear).

Vocabulary

• Classification
• Prediction
• Predictor(s)
• Covariate(s)
• Independent variable(s)
• Dependent variable(s)
• Input(s)
• Output(s)
• Training set/sample
• Testing set/sample
• Fitting a model
• Confusion matrix
• Category
• Tree
• Terminal node
• Stopping rule
• Threshold
• True positive (sensitivity)
• True negative (specificity)
• False positive
• False negative
• Accuracy
• Classifier
• Node(s)
• Binary
• Split(ting)
