MATH5806 Applied Regression Analysis
Note:
• This assessment is due Thursday 30th June (Week 5), 2:30pm and must be uploaded to Moodle. You have 1.5 hours to complete this test.
• Please submit your file in PDF format, named as follows: MS-z1234567-FirstName-Surname.pdf.
• Assignments without a signed plagiarism declaration (below) will not be accepted, and late assignments will not be accepted unless accompanied by a medical certificate.
• This assignment counts for 15% of the final mark.
• There is a total of 1 exercise and 18 marks.
I declare that this assessment item is my own work, except where acknowledged, and has not been submitted for academic credit elsewhere. I acknowledge that the assessor of this item may, for the purpose of assessing this item reproduce this assessment item and provide a copy to another member of the University; and/or communicate a copy of this assessment item to a plagiarism checking service (which may then retain a copy of the assessment item on its database for the purpose of future plagiarism checking).
I certify that I have read and understood the University Rules in respect of Student Academic Misconduct.
Name:                Student No:                Signature:                Date:
Exercise 1 – Analysis of United Nations data
The UN11 data, available in the R package alr4, contains several statistics for 199 localities, mostly UN member countries but also other areas such as Hong Kong that are not independent countries.
The following variables are provided:
• region: the region of the world.
• group: a factor with levels oecd for countries that are members of the OECD, the Organization for Economic Co-operation and Development, as of May 2012, africa for countries on the African continent, and other for all other countries. No OECD countries are located in Africa.
• fertility: the average number of children per woman.
• ppgdp: the gross domestic product (GDP) per capita in U.S. dollars.
• lifeExpF: the female life expectancy (in years).
• pctUrban: the percentage of the population that is urban.
The dataset can be loaded by running the following commands:
library(alr4)
data(UN11)
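Once the data are loaded, a quick sanity check can confirm that the data frame matches the description above. The following is a minimal sketch, assuming the alr4 package has already been installed (for example with install.packages("alr4")); it only inspects the data and does not answer any of the questions below.

```r
library(alr4)   # assumption: alr4 is already installed
data(UN11)

dim(UN11)               # number of localities (rows) and variables (columns)
str(UN11)               # name and type of each variable
levels(UN11$group)      # levels of the group factor
summary(UN11$lifeExpF)  # five-number summary and mean of the response
```

Comparing the output of dim() and str() with the variable list is a cheap way to catch a mistyped variable name before fitting any model.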
In this test we are interested in modelling the female life expectancy using some of the variables listed above and in drawing appropriate conclusions.
In the following questions, always provide your R code/command lines whenever prompted; marks will be automatically deducted otherwise.
1. Let’s start by doing some exploratory data analysis. Display, in three separate graphs, the female life expectancy as a function of the GDP, log(GDP) and fertility, and briefly comment on what you observe. Provide your R code. [1.5 marks]
2. What percentage of the variations of the response do the variations of log(ppgdp) explain? Provide your R code. [0.5 mark]
3. What percentage of the variations of the response do the variations of fertility explain? Provide your R code. [0.5 mark]
4. Justify or argue against the following statement [1 mark]:
Both regressors explain some of the same variation in the response
6. For the model fitted in the previous question, what is the residual sum of squares? Provide your R code. [1 mark]
7. Write down the estimator of the true variance σ² and compute it. Provide your R code. [1 mark]
10. Compute a 99% confidence interval for β1. Provide your R code. [1 mark]
11. Test the hypothesis H0 : β1 = β2 = 0 versus H1 : at least one of β1, β2 is non-zero, at significance level α = 0.01. First explain how the test statistic is constructed and write your own code to calculate it. Then compute the critical value and draw your conclusion. Provide your R code. [2 marks]
12. What is the value of R², the coefficient of determination? Does the model seem to explain the variations in female life expectancy well? Provide your R code. [0.5 mark]
13. Plot the standardised Pearson residuals versus fitted values and interpret the plot. [1 mark]
14. We are now interested in checking whether there is a difference in the mean life expectancy of females in different groups (oecd, africa and other) while still considering log(ppgdp) and
(a) Let lifeExpF_{j,k} denote the female life expectancy in the j-th country belonging to group k. Write down the model of interest and specify the number of free parameters. [1 mark]
(b) Write down the null hypothesis and its corresponding model. [1 mark]
(c) Describe the test that you would use to test the hypotheses and specify the distribution of the test statistic. [1 mark]