ISE 529 Predictive Analytics

Homework 

1.

Use the Boston dataset from sklearn.datasets to fit the following models to predict the price of houses in the Boston area. Read boston.data into a DataFrame using

X = pd.DataFrame(boston.data,columns = boston.feature_names).

Before modeling let us rename some features. Rename features RM and CHAS with

X.rename(columns = {'RM':'MR','CHAS':'HAS'},inplace=True). Then use X.columns.str[:1] to rename all 13 columns by their first letter. Whenever needed, use random_state = 0 and the default train and test sizes. Use sklearn to fit all models.
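The renaming step can be sketched as follows. Because load_boston was removed from scikit-learn in version 1.2, this sketch does not load the real data: it assumes a small random array as a stand-in for boston.data and only reuses the 13 feature names. It illustrates why RM and CHAS are renamed first: without that step, truncating to the first letter would produce duplicate column names.

```python
import numpy as np
import pandas as pd

# The 13 Boston feature names, in the order reported by boston.feature_names.
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# Assumption: random placeholder values standing in for boston.data.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((5, 13)), columns=feature_names)

# Without the rename, CRIM/CHAS would both become "C" and RM/RAD both "R".
naive = X.columns.str[:1]
print(naive.is_unique)  # False

# Rename the two clashing features, then keep only the first letter.
X.rename(columns={"RM": "MR", "CHAS": "HAS"}, inplace=True)
X.columns = X.columns.str[:1]
print(list(X.columns))
```

Note that X.columns.str[:1] only returns the shortened names; they take effect once assigned back to X.columns.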

a)

Use MinMaxScaler() to scale all features to the range (0, 1). Split the data into train and test sets. Fit a linear regression model. Report the test R² and the test MSPE.
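A minimal sketch of the workflow in part a), assuming synthetic data in place of the Boston features and prices (the names X_demo and y_demo are made up for illustration). It follows the order given in the text: scale first, then split with the default sizes and random_state = 0. MSPE is computed here as the mean squared error on the test set.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Assumption: synthetic stand-in for the 13 renamed features and the prices.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(200, 13)), columns=list("CZIHNMADRTPBL"))
y_demo = X_demo.to_numpy() @ rng.normal(size=13) + rng.normal(scale=0.5, size=200)

# Scale every feature to the range (0, 1), then split (default is 75% / 25%).
X_scaled = MinMaxScaler().fit_transform(X_demo)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_demo, random_state=0)

# Fit on the training set and evaluate on the held-out test set.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

test_r2 = r2_score(y_test, y_pred)
test_mspe = mean_squared_error(y_test, y_pred)  # mean squared prediction error
print(f"test R^2 = {test_r2:.3f}, test MSPE = {test_mspe:.3f}")
```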

b)

Use poly = PolynomialFeatures() to add 93 features in a new array X3.

  • Use poly.get_feature_names(X.columns) to review the names of the old and new columns.
  • Convert array X3 to a DataFrame using

X3 = pd.DataFrame(X3,columns=poly.get_feature_names(X.columns))

We will call X3 the extended Boston dataset.

  • Use MinMaxScaler() to scale all 104 features to the range (0, 1), and call the result X4.
  • Split X4 into train and test set. Use these sets for all of the following parts in this homework.

Fit a linear regression model. Report the test R² and the test MSPE.

c)

Use Ridge(alpha = 0.1,normalize = True).fit(X_train,y_train) to fit a RR model.

Notice that we are normalizing the already scaled data in X4. Report the test R² and the test MSPE.
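A hedged sketch of a ridge fit follows. The normalize argument shown in the text was removed from Ridge in scikit-learn 1.2, so this sketch omits it and will not reproduce the numbers obtained with normalize = True on older versions. The data are synthetic placeholders. For comparison, the sketch also fits ordinary least squares to show that the ridge penalty shrinks the coefficient vector.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the scaled extended data X4.
rng = np.random.default_rng(0)
X4_demo = rng.random((120, 30))
true_coef = np.array([4.0, -3.0, 2.0, -1.0, 0.5])
y_demo = X4_demo[:, :5] @ true_coef + rng.normal(scale=0.3, size=120)
X_train, X_test, y_train, y_test = train_test_split(X4_demo, y_demo, random_state=0)

# Ridge without the removed `normalize` argument (not equivalent to normalize=True).
ridge = Ridge(alpha=0.1).fit(X_train, y_train)
ols = LinearRegression().fit(X_train, y_train)

y_pred = ridge.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mspe = mean_squared_error(y_test, y_pred)
print(f"ridge: test R^2 = {test_r2:.3f}, test MSPE = {test_mspe:.3f}")

# The penalty pulls the coefficients toward zero relative to least squares.
ridge_norm = np.linalg.norm(ridge.coef_)
ols_norm = np.linalg.norm(ols.coef_)
print(f"coefficient norm: ridge = {ridge_norm:.3f}, OLS = {ols_norm:.3f}")
```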

d)

Search for the best alpha value, then fit the RR model again. Report the test R² and the test MSPE.
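One way to search for alpha is cross-validated grid search on the training set, sketched below. The candidate grid, the 5-fold setting, and the synthetic data are all assumptions made for illustration; the assignment does not say how the search should be done.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic stand-in for the scaled extended data X4.
rng = np.random.default_rng(0)
X4_demo = rng.random((150, 30))
y_demo = X4_demo[:, :4] @ np.array([3.0, -2.0, 1.5, 1.0]) + rng.normal(scale=0.3, size=150)
X_train, X_test, y_train, y_test = train_test_split(X4_demo, y_demo, random_state=0)

# Hypothetical grid of candidate penalties on a log scale.
alphas = np.logspace(-4, 2, 13)
search = GridSearchCV(Ridge(), {"alpha": alphas}, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X_train, y_train)  # only the training set is used for the search

best_alpha = search.best_params_["alpha"]
best_ridge = search.best_estimator_  # refitted on the full training set

y_pred = best_ridge.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mspe = mean_squared_error(y_test, y_pred)
print(f"best alpha = {best_alpha:g}, test R^2 = {test_r2:.3f}, test MSPE = {test_mspe:.3f}")
```

Keeping the test set out of the search means the reported test metrics are not used to pick alpha.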

e)

Fit a Random Forest model with 500 trees, max_features = 10, max_depth = 6, and random_state = 0. Report the test R² and the test MSPE.

f)

Find the most important features in the extended Boston dataset identified by the RF. Report the top seven. Which original features are most important?

g) 

Use GridSearchCV to find the best values for max_features and max_depth. Fit the RF with these values and report the test R² and the test MSPE.
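A sketch of the grid search in part g), under several assumptions: the candidate values in param_grid are hypothetical, the forest uses only 50 trees to keep the example fast, 3-fold cross-validation is an arbitrary choice, and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic stand-in for the extended data.
rng = np.random.default_rng(0)
X_demo = rng.random((200, 12))
y_demo = (5 * X_demo[:, 0] + 3 * X_demo[:, 1] * X_demo[:, 2]
          + rng.normal(scale=0.2, size=200))
X_train, X_test, y_train, y_test = train_test_split(X_demo, y_demo, random_state=0)

# Hypothetical candidate values; the assignment leaves the grid open.
param_grid = {"max_features": [4, 8, 12], "max_depth": [2, 4, 6]}
forest = RandomForestRegressor(n_estimators=50, random_state=0)
search = GridSearchCV(forest, param_grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_)

# best_estimator_ is already refitted on the whole training set.
best_rf = search.best_estimator_
y_pred = best_rf.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mspe = mean_squared_error(y_test, y_pred)
print(f"test R^2 = {test_r2:.3f}, test MSPE = {test_mspe:.3f}")
```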

A real estate appraiser is interested in predicting residential home prices as a function of various features. Regression models are therefore to be constructed to predict house prices. The homes.csv data set (available from Blackboard) is a sample of 522 residential houses. Reduce the dataset to houses with two to five bedrooms, style 1 to 7, and not close to a highway. Then remove the column highway. The reduced data set should have 485 rows. Remember that style should be treated as categorical for this homework.
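The reduction step could be sketched as follows. The real homes.csv is not reproduced here: the tiny table below is made up, the column name bedrooms is a guess, and coding highway as 1 for "close to a highway" is an assumption that should be checked against the actual file.

```python
import pandas as pd

# Assumption: a made-up miniature table with hypothetical column names.
homes = pd.DataFrame({
    "price":    [250, 310, 180, 420, 275, 390],
    "bedrooms": [1, 3, 4, 6, 2, 5],
    "style":    [1, 2, 7, 3, 9, 1],
    "highway":  [0, 0, 1, 0, 0, 0],   # assumed coding: 1 = close to a highway
})

# Keep houses with 2 to 5 bedrooms, style 1 to 7, and not close to a highway.
mask = (homes["bedrooms"].between(2, 5)
        & homes["style"].between(1, 7)
        & (homes["highway"] == 0))
reduced = homes.loc[mask].drop(columns="highway").copy()

# Treat style as a categorical variable rather than a number.
reduced["style"] = reduced["style"].astype("category")
print(reduced)
```

With the real data, checking that len(reduced) equals 485 is a quick way to confirm the filter matches the assignment.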

2.

Fit a full multiple regression model using only the numerical variables as predictors.

a) Find the largest outlier (in absolute value).

b) Plot y against ŷ, labeling the largest outlier.

c) Find the predicted price when all predictors are equal to their median values.
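Parts a) and c) can be illustrated with the sketch below. It assumes that "largest outlier" means the observation with the largest absolute raw residual; the assignment may instead intend standardized or studentized residuals. The predictors, their names, and the planted outlier are all synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumption: synthetic numerical predictors with hypothetical names.
rng = np.random.default_rng(0)
X_num = pd.DataFrame(rng.normal(size=(100, 3)), columns=["area", "lotsize", "year"])
price = (50 + X_num.to_numpy() @ np.array([3.0, 2.0, 1.0])
         + rng.normal(scale=0.5, size=100))
price[17] += 25.0  # plant one obvious outlier for the demonstration

ols = LinearRegression().fit(X_num, price)
fitted = ols.predict(X_num)

# a) Largest outlier: the observation with the largest absolute residual.
residuals = price - fitted
idx = int(np.argmax(np.abs(residuals)))
print(f"largest outlier: row {idx}, residual = {residuals[idx]:.2f}")

# c) Prediction when every predictor is set to its median value.
median_row = X_num.median().to_frame().T
median_pred = ols.predict(median_row)[0]
print(f"prediction at the medians = {median_pred:.2f}")
```

For part b), the pair (fitted[idx], price[idx]) gives the coordinates of the point to label in the plot of y against ŷ.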

Now consider all variables (categorical and numerical) as predictors from the reduced data set (code for the following questions may take around 10 minutes of CPU time).

3.

Report the predictions of the best AIC and best BIC models when predicting the price of a high quality, style 3 house with ac, garage for two cars, area of 2100 square feet, built in 1992, 24500 square feet lotsize, no pool, three bedrooms and three bathrooms.
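The sketch below shows one way to find the best AIC and best BIC models by exhaustive search over predictor subsets. It assumes Gaussian errors, for which AIC = n ln(RSS/n) + 2p and BIC = n ln(RSS/n) + p ln(n) up to a constant that does not depend on the subset, with p counting the intercept. The helper names fit_ols and best_subsets are made up, the data are synthetic, and categorical predictors would first need to be converted to dummy columns.

```python
from itertools import combinations

import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and RSS."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return beta, rss

def best_subsets(X, y):
    """Return the best (score, columns, coefficients) under AIC and BIC."""
    n, m = X.shape
    best = {"aic": (np.inf, None, None), "bic": (np.inf, None, None)}
    for k in range(1, m + 1):
        for cols in combinations(range(m), k):
            beta, rss = fit_ols(X[:, list(cols)], y)
            p = k + 1  # slopes plus intercept
            scores = {"aic": n * np.log(rss / n) + 2 * p,
                      "bic": n * np.log(rss / n) + p * np.log(n)}
            for crit, score in scores.items():
                if score < best[crit][0]:
                    best[crit] = (score, cols, beta)
    return best

# Assumption: synthetic data in which only the first two predictors matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)

best = best_subsets(X, y)
new_x = np.array([0.5, -1.0, 0.0, 2.0, 1.0])  # hypothetical new observation
predictions = {}
for crit, (score, cols, beta) in best.items():
    predictions[crit] = beta[0] + new_x[list(cols)] @ beta[1:]
    print(crit, cols, round(float(predictions[crit]), 3))
```

The search visits 2^m − 1 subsets for m predictor columns, which is consistent with the long running time mentioned in the text once dummy columns are included.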

Submit your report as a single PDF file (convert your ipynb file to PDF).
