
Applied Statistics and Data Analysis (AMS 553.414/614)


Practice questions for final exam

1.

The data for this practice question is based on the cars dataset, which comes with R.

(a) Let dist be the response variable and speed be the explanatory variable. Do quintic polynomial regression (including the intercept). Which individual coefficient has the highest statistical significance?

(b) Using stepwise backwards elimination, continue to drop the least statistically significant regressors (but do not drop the intercept) until all (non-intercept) regressors have p-values of less than 0.05. Which regressors remain?

(c) Now treat the intercept as just another regressor. Using stepwise backwards elimination, continue to drop the least statistically significant regressors (drop the intercept if it is least significant) until all regressors have p-values of less than 0.05. Which regressors remain?

(d) First regress on the intercept only. Then using stepwise forward selection, continue to include the most statistically significant regressors (up to and including the quintic term) until no more additional regressors would have p-values of less than 0.05. Which regressors are selected for the model?

(e) Which two regressors (plus intercept) give the best fit? And which set of regressors gives the best BIC? (Hint: Use the leaps package.)
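The backward elimination described in part (b) can be sketched in base R as follows. This is only one possible approach, not a model answer: the helper name backward_by_p and the column names s1 to s5 (raw powers of speed) are made up for illustration, and the 0.05 cutoff is taken from the question.

```r
# Sketch: backward elimination by p-value, always keeping the intercept.
# `backward_by_p` and the columns s1..s5 are hypothetical names for this example.
d <- transform(cars,
               s1 = speed, s2 = speed^2, s3 = speed^3,
               s4 = speed^4, s5 = speed^5)

backward_by_p <- function(data, response, terms, alpha = 0.05) {
  while (length(terms) > 0) {
    fit <- lm(reformulate(terms, response = response), data = data)
    coefs <- summary(fit)$coefficients
    # p-values of the non-intercept regressors, named by regressor
    p <- setNames(coefs[-1, "Pr(>|t|)"], rownames(coefs)[-1])
    if (max(p) < alpha) break                     # all remaining terms are significant
    terms <- setdiff(terms, names(which.max(p)))  # drop the least significant term
  }
  terms
}

kept <- backward_by_p(d, "dist", paste0("s", 1:5))
print(kept)
```

Precomputing the powers as ordinary columns keeps the coefficient names identical to the term names, which makes dropping a term a simple setdiff. For part (c), the same loop could presumably be adapted by fitting without the automatic intercept and adding an explicit column of ones.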

2.

The data for this practice question is based on the cars dataset, which comes with R. The intercept will always be included.

(a) Regress dist on speed. What is the AIC?

(b) Do set.seed(0). Use bootstrapping to create 10,000 more AIC statistics. What is their standard deviation? (Hint: Use dplyr::sample_n to appropriately sample rows from a dataframe.)

(c) Plot a histogram of the bootstrapped AIC values. Does the distribution look skewed left, skewed right, or symmetric?
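A minimal sketch of the bootstrap in parts (b) and (c), assuming base R resampling with sample.int rather than the dplyr::sample_n call suggested in the hint. Because the row draws may be generated differently, the numbers from this sketch are not guaranteed to match those obtained with sample_n, even with the same seed.

```r
# Sketch: nonparametric bootstrap of the AIC of lm(dist ~ speed) on cars.
set.seed(0)
B <- 10000          # number of bootstrap replicates, as in the question
n <- nrow(cars)

boot_aic <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)   # resample rows with replacement
  AIC(lm(dist ~ speed, data = cars[idx, ]))
})

print(sd(boot_aic))   # bootstrap standard deviation of the AIC
hist(boot_aic, breaks = 50, main = "Bootstrap AIC", xlab = "AIC")
```

Each replicate refits the model on a resampled data frame of the same size as the original, so the spread of boot_aic reflects sampling variability in the AIC itself.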

3.

The data for this practice question is based on the Titanic_train.csv which is available in Blackboard. The intercept will always be included.

(a) Logistically regress Survived (the response variable) on the regressors Pclass (treat as a cardinal variable), Sex and Age. What is the least significant regressor?

(b) The difference between the null and residual deviance is distributed as chi-squared with how many degrees of freedom?

(c) Make a box plot of the Pearson residuals versus the Pclass variable. Hint: If you get a mismatched length error, make an adjustment to the appropriate parameter in your glm call. See the R glm() documentation for help.

(d) Find the mean Pearson residual for Pclass = 2. Hint: One method is to regress the Pearson residuals versus Pclass as a categorical variable. Another method is to use the aggregate() function.
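The length mismatch mentioned in the hint for part (c) typically arises because glm drops rows with missing values by default, so the residual vector is shorter than the original data frame. The sketch below illustrates one common remedy, na.action = na.exclude, on a small made-up data frame (the names toy, y, grp and x are invented for this example; it does not use the Titanic data).

```r
# Toy data, fabricated purely for illustration (not the Titanic data).
set.seed(1)
toy <- data.frame(
  y   = rbinom(60, 1, 0.5),
  grp = rep(1:3, each = 20),
  x   = rnorm(60)
)
toy$x[c(3, 25, 47)] <- NA   # introduce a few missing regressor values

# na.exclude pads residuals with NA for the dropped rows,
# so they line up with the rows of the original data frame.
fit <- glm(y ~ grp + x, family = binomial, data = toy, na.action = na.exclude)
r <- residuals(fit, type = "pearson")

boxplot(r ~ toy$grp, xlab = "grp", ylab = "Pearson residual")

# Mean Pearson residual per group, ignoring the padded NAs.
grp_means <- aggregate(r, by = list(grp = toy$grp), FUN = mean, na.rm = TRUE)
print(grp_means)
```

With the default na.action, r would have 57 elements here and could not be plotted against the 60-element grp column without subsetting first.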

4.

The data for this practice question is based on the Titanic_train.csv which is available in Blackboard. The intercept will always be included.

(a) What is the most common value of Embarked?

(b) Do multinomial logistic regression with Embarked as the response variable and Pclass (treat as a cardinal variable), Sex, Age and Survived as the regressors. Use the most common value of Embarked as the reference value. For predicting which passengers embarked from France, what is the most significant regressor? The least significant regressor?

(c) Is a survivor more or less likely to have embarked from France? How much do the log odds change?

5.

The data for this practice question is based on the Titanic_train.csv which is available in Blackboard. The intercept will always be included.

(a) How many values of Age are missing?

(b) Do set.seed(0). Using the mice package and the default method, create five imputed datasets. What are the five imputed ages for passenger number 6?

(c) Do set.seed(0) and repeat the above using the norm.boot method. What are the five imputed ages for passenger number 6?
