CSCI 3022 Final Exam

Read the following:

• RIGHT NOW! Write your name, student ID and section number on the top of your exam. If you’re handwriting your exam, include this information at the top of the first page!

• You may use the textbook, your notes, lecture materials, and Piazza as resources. Piazza posts should not be about exact exam questions, but you may ask for technical clarifications and ask for help on review/past exam questions that might help you. You may not use external sources from the internet or collaborate with your peers.

• You may use a calculator or Python terminal to check numerical results.

• If you print a copy of the exam, clearly mark answers to multiple choice questions in the provided answer box. If you type or hand-write your exam answers, write each problem on its own line, clearly indicating both the problem number and answer letter.

• Mark only one answer for multiple choice questions. If you think two answers are correct, mark the answer that best answers the question. No justification is required for multiple choice questions. For handwriting multiple choice answers, clearly mark both the number of the problem and your answer for each and every problem.

• For free response questions you must clearly justify all conclusions to receive full credit. A correct answer with no supporting work will receive no credit.

• When submitting your exam to Gradescope, use their submission tool to mark on which pages you answered specific questions. Submitting your exam properly is worth 1/100 points. The other problems sum to 99.

Multiple choice problems

1. (3 points)

Which of the following tests are appropriate for the goal of establishing uncertainty or confidence intervals on the interquartile range of a probability process?

A. Because of the central limit theorem, we can use normals for any problem like this.

B. For small samples, we can create a confidence interval on the interquartile range via t-distributions.

C. We could simulate the probability process and bootstrap to find confidence intervals for an interquartile range.

D. None of the above will work: we’d have to use exact sums or integrals of the probability process to get theoretically consistent estimates.
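For reference, the resampling idea named in option C can be sketched as follows. This is a minimal percentile-bootstrap sketch using only the Python standard library; the function names `iqr` and `bootstrap_iqr_ci`, the simulated exponential sample, and the choice of 2,000 resamples are illustrative assumptions and are not part of the exam.

```python
import random
import statistics


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def bootstrap_iqr_ci(sample, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the IQR of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, recompute the IQR each time, then sort.
    boot_iqrs = sorted(iqr(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return boot_iqrs[lo_idx], boot_iqrs[hi_idx]


# Hypothetical data: 200 draws from a simulated exponential process.
data_rng = random.Random(1)
sample = [data_rng.expovariate(1.0) for _ in range(200)]
low, high = bootstrap_iqr_ci(sample)
print(f"95% bootstrap interval for the IQR: ({low:.3f}, {high:.3f})")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.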

2. (3 points)

You are sampling the weights of various puppies from a population with a known mean of 15 pounds and variance of 16 pounds². You obtain a measurement from an adorable Beagle of X = 19 pounds. What is the corresponding value of the standardized normal random variable, Z?

A. 0.25

B. 0.5

C. 1

D. 19/16

E. 2

F. 19/4

G. 15

A. A t-distribution is appropriate here, a standard normal is not.

B. A standard normal distribution is appropriate here, a t is not.

C. Neither a t nor a standard normal is appropriate.

D. Either a t or a standard normal is suitable for this problem.

4. (3 points)

A random variable U has a standard deviation equal to s_U, and a random variable V has a standard deviation equal to s_V. U and V are independent. Let W = U + V. What is the standard deviation of W?

Use the following information for Problems 5 – 8.

You’re performing a simple linear regression, and someone spills ink all over your beautiful regression table. Now you can only read the following, though you also do recall that the data set had 147 observations, and most of their x-values were close to x = 1:

Coefficient	Estimate	Std. error	t-value	Pval
(Intercept)	3.4	MISSING	3.1	MISSING
Slope	5.56	0.19	MISSING	< 2e-16

5. (3 points)

What is the correct (exact) MISSING value for the intercept’s p-value?

A. stats.t.ppf(3.1, df=147)

B. 2*(1-stats.t.cdf(3.1, df=145))

C. 3.4 · 3.1

D. 2*(1-stats.t.cdf(3.1, df=147))

E. (1-stats.norm.cdf(3.1))

6. (3 points)

From the same table as the prior question, what is the (exact) MISSING value for the t-value of the slope?

A. 5.56/0.19

B. stats.t.ppf(5.56) - stats.t.ppf(0.19)

C. stats.t.ppf(5.56, df=145) - stats.t.ppf(0.19, df=145)

D. stats.norm.ppf(2e-16)

E. 5.56 · 0.19

7. (3 points)

Suppose we were to take this linear regression and add in an additional predictor of x², so the model became y = β₀ + β₁x + β₂x² + ε. Which of the following best describes all of the results of this added parameter?

A. The R² of the model will increase.

B. The standard error associated with β₁ will increase.

C. Because the coefficient for β₁ was so significant, it’s unlikely that adding x² will help the model.

D. Only (A) and (B) are true.

E. Only (A) and (C) are true.

F. Only (B) and (C) are true.

G. All of (A), (B), and (C) are true.

8. (3 points)

Suppose we found ourselves a new data value at the (x, y) location of (12, 50). What would be the effects of this data point on the resulting line of best fit?

A. The estimate for β₁ will increase.

B. The standard error associated with β₁ will increase.

C. The total SSE of the model will decrease.

D. Only (A) and (B) are true.

E. Only (A) and (C) are true.

F. Only (B) and (C) are true.

G. All of (A), (B), and (C) are true.

9. (3 points)

Suppose you compute a sample mean for a population that is normally distributed with known variance σ². Which combination of significance level and sample size n produces the narrowest confidence interval for the mean?

A. α = 0.2 and n = 50

B. α = 0.2 and n = 12

C. α = 0.01 and n = 50

D. α = 0.01 and n = 12

E. α = 0.04 and n = 50

F. α = 0.04 and n = 12

10. (3 points)

Data Scientists are often involved in study planning. You are in charge of a study that examines the mean lifetime (in years) of different cars. You know that the standard deviation of the lifetime of cars is σ = 1.2 years. What value of n do you need for the maximum 95% confidence interval width to be at most 0.5 years?

A. 9

B. 10

C. 22

D. 23

E. 30

F. 88

G. 89

H. 100

I. 101

J. 1000
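As general background on this kind of planning calculation: with known σ, a two-sided confidence interval for a mean has width 2 · z_{α/2} · σ/√n, so the smallest adequate n is the ceiling of (2 · z_{α/2} · σ / width)². The sketch below works through that formula with deliberately different, hypothetical numbers (σ = 2.0, target width 1.0, 90% confidence); the helper names `ci_width` and `required_n` are assumptions made for illustration. In scipy.stats syntax the critical value would be `stats.norm.ppf(1 - alpha/2)`.

```python
import math
from statistics import NormalDist


def ci_width(sigma, n, level):
    """Full width of a known-sigma z-interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sigma / math.sqrt(n)


def required_n(sigma, max_width, level):
    """Smallest n whose z-interval is no wider than max_width."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sigma / max_width) ** 2)


# Hypothetical planning inputs (not the exam's values).
n = required_n(sigma=2.0, max_width=1.0, level=0.90)
print(n, ci_width(2.0, n, 0.90), ci_width(2.0, n - 1, 0.90))
```

Rounding up rather than to the nearest integer matters here, because rounding down would leave the interval slightly wider than the target.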

11. (3 points)

Which of the following statements is True?

A. You cannot make a Type I error when the null hypothesis is false.

B. You cannot make a Type II error when the null hypothesis is false.

C. The test that minimizes Type I error rate α will also be the one that minimizes the Type II error rate β.

D. The p-value is the probability that the null hypothesis is true.

E. The larger the p-value, the more we doubt the null hypothesis.

12. (3 points)

Consider performing a multiple linear regression on a data-set with full and reduced models of the form y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ and y = β₀ + β₁x₁ + β₄x₄, respectively. Suppose that you perform a partial F test and reject the null hypothesis. What is the strongest conclusion you can draw?

A. Nothing.

B. βₖ ≠ 0 for some k ∈ {2, 3}.

C. βₖ = 0 for all k ∈ {1, 2, 3, 4}.

D. βₖ ≠ 0 for all k ∈ {2, 3}.

E. β₁ = β₄ = 0.

F. β₂ = β₃ = 0.

G. The model with β₁ = β₄ ≠ 0 is not significantly better at capturing variance than the model with β₁ = β₄ = 0.

13. (3 points)

Suppose you generate 5,000 confidence intervals for the mean of a population, using fixed significance level α. You discover that 491 of them FAIL to cover the true mean. Which of the following is the most appropriate estimate of the significance level α?

A. 0.01

B. 0.025

C. 0.05

D. 0.1

E. 0.2

F. It’s 50-50.

G. 4509/5000
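The link between a significance level and the fraction of intervals that miss the true mean can be explored by simulation. The sketch below is a minimal example under stated assumptions: a normal population with known σ, z-intervals, and hypothetical settings (α = 0.05, n = 25, µ = 10, σ = 3) that are not taken from the exam. The function name `miss_fraction` is an illustrative choice.

```python
import math
import random
from statistics import NormalDist, fmean


def miss_fraction(alpha, n=25, n_intervals=5000, mu=10.0, sigma=3.0, seed=0):
    """Fraction of simulated z-intervals that fail to cover the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    misses = 0
    for _ in range(n_intervals):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval xbar ± half_width misses mu when |xbar - mu| is too large.
        if abs(xbar - mu) > half_width:
            misses += 1
    return misses / n_intervals


estimate = miss_fraction(alpha=0.05)
print(f"observed miss fraction: {estimate:.4f}")
```

Over many intervals the observed miss fraction should settle near the α used to build them, which is why a miss count can serve as an estimate of α.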

Free Response problems

14. (30 points)

Answer the following short answer prompts after each question.

A. (6 points)

What is the difference in how we interpret the pdf f(x) of a continuous random variable and the pmf f(x) of a discrete random variable? Do they have the same units? What does each measure?

B. (8 points)

Suppose that a sample X₁, X₂, . . . X₆₅ comes from a population with an unknown distribution. The population has a mean of 42 and a standard deviation of 12. Find the probability that the sample mean is between 40 and 47. Write your answer three ways: in critical value notation (using e.g. t_{α,ν}, z_{α/2}), with exactly how you would find those values using Python code (using scipy.stats syntax: .ppf, .cdf, .pdf, etc.), and the exact interval. Is this answer exact or an approximation? Why or why not?
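As a general illustration of this type of calculation, the sketch below treats the sample mean as approximately normal with standard error σ/√n and takes a difference of two CDF values. The numbers are hypothetical (n = 40, µ = 100, σ = 15, bounds 98 and 103) and the helper name `prob_sample_mean_between` is an assumption. It uses the standard library's `NormalDist`; the scipy.stats equivalent of each CDF call would be `stats.norm.cdf(x, loc=mu, scale=se)`.

```python
import math
from statistics import NormalDist


def prob_sample_mean_between(lower, upper, mu, sigma, n):
    """Normal-approximation probability that the sample mean lies in (lower, upper)."""
    se = sigma / math.sqrt(n)          # standard error of the sample mean
    sampling_dist = NormalDist(mu, se)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Hypothetical inputs, not the exam's values.
p = prob_sample_mean_between(98, 103, mu=100, sigma=15, n=40)
print(f"approximate probability: {p:.4f}")
```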

C. (8 points)

Suppose we’re constructing a linear model to test the reaction of COVID vaccines, and suspect there may be a difference based on the sex of the recipient. We decide on the model:

yᵢ = β₀ + β₁ · Wᵢ + β₂ · Mᵢ + εᵢ

where yᵢ is the strength of response of patient i, Wᵢ is that patient’s weight in pounds, and Mᵢ is an indicator or dummy variable that is true when the patient is male. We gather some data, and the first 6 patients are {(165, M),(125, W),(220, M),(145, M),(150, W),(185, W)}. What are the first 6 rows of the corresponding design matrix?
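For a model with an intercept, one numeric predictor, and one indicator, each design-matrix row conventionally holds a leading 1, the numeric value, and a 0/1 code for the indicator. The sketch below shows that layout on three hypothetical patients (not the six listed above); the function name `design_matrix` and the column order are illustrative assumptions.

```python
def design_matrix(records, indicator_level="M"):
    """Build rows [1, weight, indicator] from (weight, group) pairs."""
    return [
        [1, weight, 1 if group == indicator_level else 0]
        for weight, group in records
    ]


# Hypothetical patients: (weight in pounds, recorded sex).
patients = [(172, "W"), (198, "M"), (131, "W")]
X = design_matrix(patients)
for row in X:
    print(row)
```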

D. (8 points)

You have 12 observations drawn from a normal distribution with unknown parameters, and want to test the hypothesis H₀: µ = µ₀ vs. Hₐ: µ > µ₀. You will reject the null hypothesis if your test statistic is greater than 1.84. What is the probability of a Type I error for your test? Write your answer both in critical value notation (using e.g. t_{α,ν}, z_{α/2}) and with exactly how you would find those values using Python code (using scipy.stats syntax: .ppf, .cdf, .pdf, etc.) and the exact probability.

15. (14 points)

An e-commerce client claims that more than 20 percent of visitors to their site eventually become buyers (tracked by a cookie). Test this claim if a random check of the web server log indicates that 54 of 200 visitors made a purchase from the web site. In testing this claim, follow these four steps:

1. (5 points) Construct and interpret the 95% confidence interval for the true proportion of customers who become buyers.

2. (2 points) Write down the two hypotheses being tested, and define any parameter used in the hypotheses.

3. (5 points) Calculate the appropriate test statistic and its corresponding p-value.

4. (2 points) Using the p-value, decide whether or not to reject the null hypothesis at the 1% significance level and interpret your conclusion in terms of the original problem.
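The mechanics behind these four steps can be sketched on a different, hypothetical data set (45 successes in 150 trials, null proportion 0.25). This is a minimal sketch under stated assumptions: a Wald interval for the proportion, a large-sample z statistic that uses the null proportion in its standard error, and an upper-tailed alternative. The function name `one_proportion_summary` is made up for illustration.

```python
import math
from statistics import NormalDist


def one_proportion_summary(successes, n, p0, level=0.95):
    """Wald CI for a proportion plus an upper-tailed z test of H0: p = p0."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)      # SE for the interval
    ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    se_null = math.sqrt(p0 * (1 - p0) / n)           # SE under the null
    z = (p_hat - p0) / se_null
    p_value = 1 - NormalDist().cdf(z)                # upper tail only
    return ci, z, p_value


# Hypothetical counts, not the exam's values.
ci, z, p_value = one_proportion_summary(45, 150, p0=0.25)
print(ci, z, p_value)
```

The interval and the test use different standard errors on purpose: the interval estimates the spread from the observed proportion, while the test computes it assuming the null proportion is true.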

16. (16 points)

It’s line-drawing time! The next 4 questions refer to the following plot, with fitted least-squares line by model Yᵢ = β₀ + β₁Xᵢ + εᵢ shown:

Note that the fitted line goes through exactly (0,0), here. The above plot was created by running slr=sm.OLS(x,y).fit() on a pair of numpy arrays holding the x and y coordinates of the data. For questions (a), (b), and (c), if the statement is always true mark “True”; if it is possible for the statement to be false, mark “False.” You need to justify your answer with (at least) a full sentence:
