Submission 1: Report
Submit a written report on your work, as a PDF document.
• This should be submitted through the link in the Canvas site.
• The report should have two distinct parts.
Report Part One:
Aimed at a general audience that is interested in the domain, and in how machine learning can affect the domain or the people interested in it. (For example, if your data set is about pulsars, assume the readers are like those of a popular science article on pulsars.)
In this part, you should focus on the insight about the domain that was gained from the predictive model, and on the possible impact of machine learning on the domain:
• describe the domain situation,
• identify one question that you are looking to explore, concerning the possible capacity to predict one feature of the domain based on some other features,
• explain what impact might arise from effective prediction of this feature,
• present what your model has revealed about the domain (in particular, what your predictive model tells you about the question you were exploring). For example, you might indicate that your model shows that one feature is especially influential in predicting the target feature, or that an increase of 1 unit in an input feature results in an increase of X units for the target feature, etc. It is also quite OK if the message here is that you were not able to make good predictions about the feature!
Report Part Two:
Aimed at people with an interest in IT techniques for machine learning (such as other students in data1002!).
In this part, you should explain how you produced one or more predictive models (what tools you used to generate the model, and any relevant settings, including what training data you provided) and how you evaluated these models.
• You should include the code (or at least all the key parts), both for producing the models, and for evaluating them.
• You should explain why you chose the particular type of predictive model, and the particular approach to training the model.
• You should explain why you chose the particular approach to evaluation. If your evaluation used specific scoring measures, you should state the scores obtained in evaluation, for each predictive model you have produced.
• You should discuss strengths and limitations of these predictive models. In particular, if you have produced two models, you should discuss which is better for the purpose, as evidenced by your experience (not simply general comments about the different approaches, as found in lectures etc).
Submission 2: Your Source Code
Submit a copy of the source code that you wrote to create the predictive model and to evaluate it. This should be submitted through the link in Canvas, as a single file.
In most cases, we expect you to submit a Python program (or Jupyter notebook with code). Make sure the code is easy for a reader to understand. If the work is done in several files (for example, a separate file for each type of predictive model), then you should put all the files in a directory, and then compress/archive the directory into a single file which you submit.
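One way to turn a directory of code files into a single file for submission is `shutil.make_archive` from the Python standard library. This is a sketch under stated assumptions: the directory name `project_code` and the two file names are hypothetical, and a temporary directory is used so the example can run anywhere without touching your real files.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Hypothetical layout: one directory holding a file per type of model.
workdir = Path(tempfile.mkdtemp())
project = workdir / "project_code"
project.mkdir()
(project / "tree_model.py").write_text("# decision tree code goes here\n")
(project / "linear_model.py").write_text("# linear regression code goes here\n")

# Writes <workdir>/project_code.zip containing the project_code directory.
archive_path = shutil.make_archive(
    base_name=str(workdir / "project_code"),
    format="zip",
    root_dir=str(workdir),
    base_dir="project_code",
)

# List the archive contents to confirm every file made it in.
with zipfile.ZipFile(archive_path) as archive:
    names = archive.namelist()
print(archive_path)
print(names)
```

The same call works for a set of data files. Checking the listing before you submit is a cheap way to catch a missing file.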
Submission 3: Your Clean Data
Submit a single file with the clean data that you used. This could be a CSV file, a JSON file, etc. If your data was spread among several files, you need to compress/archive them into a single file and submit that.
Here is the mark scheme for this assignment. The marker’s evaluation will be made principally on the basis of your report; the submitted code and data may be considered as evidence to check or clarify statements made in the report. Note that all members of the group receive the same mark.
Communication to a general reader: 2 Marks
The way you communicate ideas about predictive models, for general readers (as shown in Part 1 of the report).
• A pass score indicates that Part 1 of the report allows the intended audience to gain knowledge about some predictive model for an aspect of the domain, without excessive effort or confusion
• A distinction score indicates that Part 1 of the report is well targeted to make it easy for the intended audience to gain understanding about machine learning and its value in connection with the domain
o this includes clearly linking your writing to the audience’s background and aims
o and the structure needs to be logical and well-organised
o and the particular model(s) discussed need to be clearly connected to the general issue of ML in the domain
• Full marks is awarded where Part 1 of the report meets all the Distinction criteria, and it draws the reader in and engages their attention for issues that are important to them
o Also, it leaves the readers clear about further questions where predictive models might be useful in understanding the domain.
Predictive model: 1 Mark
The predictive model(s) you have produced (as shown in Part 2 of the report). Note: you will not be penalized in marks if you use a reasonable approach to build a predictive model, but it does not predict very well.
• A pass score indicates that you have correctly used Python (we strongly suggest doing it with the scikit-learn package) to build a reasonable predictive model for some feature in the dataset, based on values of other features.
o as part of this, you need to include clear descriptions of the approach you used.
• A distinction score indicates that you have correctly used two different techniques to build predictive models for the same feature (this could be using quite different techniques, or just different hyper-parameter or other settings in the training, or even just choosing different sets of input/independent features to use in making the prediction), and that each is explained clearly
• Full marks is awarded for a report that
o meets all the Distinction criteria, and also
o uses some machine learning technique (either a type of model, or a way to do training) that goes beyond those discussed in data1002 lectures and Grok resources.
Evaluation of Predictive model: 2 Marks
The evaluation of the predictive model(s) you have produced (as shown in Part 2 of the report).
Note: you will not be penalized in marks if you use a reasonable approach to build a predictive model, but you find that it does not predict very well.
• A pass score indicates that you have provided some reasonable evaluation of the success of each model you produced, showing how well (or not so well!) the model does in predicting the target feature, on data that was not used in training. If you use a numerical scoring measure in the evaluation, you need to describe what this measure says about the predictions.
• A distinction score indicates that you have correctly used two different techniques to build predictive models for the same feature (this could be using quite different techniques, or just different hyper-parameter or other settings, or even just choosing different sets of input/independent features to use in making the prediction), that each model is evaluated reasonably using the same evaluation method, and that you have given a reasonable comparison of how well the two models predict
• Full marks is awarded for a report that meets all the Distinction criteria, where the evaluation considers several different qualities of the models (eg it is not just measuring the accuracy of prediction), and where it also offers meaningful insights into the reasons why a user might choose one model rather than the other.
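The criteria above (two models for the same target feature, evaluated with the same method on data not used in training, and compared on more than one quality) could be approached along the following lines. This is only a sketch, not a required method. The data is synthetic (generated with scikit-learn's `make_regression` as a stand-in for your own clean data), and the choice of the two models, the 25% hold-out size, and the three measures (R^2, mean absolute error, training time) are illustrative assumptions.

```python
import time

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption); replace with your own clean data set.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Hold out 25% of the rows; the models never see them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two different techniques predicting the same target feature.
models = {
    "linear regression": LinearRegression(),
    "decision tree (max_depth=4)": DecisionTreeRegressor(max_depth=4, random_state=0),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    # Same held-out rows and same measures for every model.
    predictions = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, predictions),
        "mae": mean_absolute_error(y_test, predictions),
        "fit_seconds": fit_seconds,
    }

for name, scores in results.items():
    print(
        f"{name}: R^2={scores['r2']:.3f}, "
        f"MAE={scores['mae']:.2f}, fit time={scores['fit_seconds']:.4f}s"
    )
```

When reporting numbers like these, say what each one means for the predictions: mean absolute error is the typical size of a prediction error in the target's own units, while R^2 is the share of the variation in the target that the model accounts for. Qualities that are not numeric, such as how easily the model can be explained to a domain reader, can be part of the comparison too.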
During the project, you need to manage the work among the group members. We recommend that you do NOT allocate a different kind of work to each person. That is, don’t get one member to write code, another to produce graphs, another to write text, etc.
Instead, we recommend that every person do each activity (perhaps for exploring the relationships of a different group of attributes). This will be important for preparing each member for the final exam.
As announced in CUSP: Late work (without approved special consideration or arrangements) suffers a penalty of 5% of the available marks, on each calendar day after the due date.
No late work will be accepted more than 10 calendar days after the due date.
Draft submission: The “draft” is not worth marks; its purpose is to be the basis for feedback from your tutors during the week 11 lab session on the reasonableness of the approach you plan to take.
Submit (at the Canvas link for the draft) a short description of the predictive model(s) you intend to produce. In particular, indicate which features will be inputs and which feature will be predicted. Also, indicate how you intend to produce each model (eg which scikit-learn methods you will use).