Late assignments will not be accepted without a valid medical certificate or other documentation of an emergency.
This assignment is worth 33% (CSC 485) or 25% (CSC 2501) of your final grade.
Dependency grammars posit relationships between “head” words and their modifiers. These relationships constitute trees where each word depends on exactly one parent: either another word or, for the head of the sentence, a dummy symbol, “ROOT”. The first part of this assignment concerns a parser that builds a parse incrementally. At each step, the state of the parse is represented by:
Initially, the stack only contains ROOT, the dependencies list is empty, and the buffer contains all words of the sentence in order. At each step, the parser advances by applying a transition to the parse until its buffer is empty and the stack is of size 1. The following transitions can be applied:
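The handout's transition list appears as a figure and is not reproduced above. As a hedged sketch only, the mechanism matches the standard arc-standard system; the conventional names (SHIFT, LEFT-ARC, RIGHT-ARC) are used below, while parse.py defines the actual names and method signatures.

```python
# Hedged sketch only: parse.py defines the real transition names and API.

def apply_transition(stack, buffer, deps, transition):
    """Apply one arc-standard transition to a parse state, mutating it in place."""
    if transition == "SHIFT":
        # Move the next word from the front of the buffer onto the stack.
        stack.append(buffer.pop(0))
    elif transition == "LEFT-ARC":
        # The second-from-top stack item becomes a dependant of the top item.
        dependant = stack.pop(-2)
        deps.append((stack[-1], dependant))
    elif transition == "RIGHT-ARC":
        # The top stack item becomes a dependant of the item below it.
        dependant = stack.pop()
        deps.append((stack[-1], dependant))
    return stack, buffer, deps

# Example: "I ate", where "I" depends on "ate" and "ate" on ROOT.
stack, buffer, deps = ["ROOT"], ["I", "ate"], []
for t in ["SHIFT", "SHIFT", "LEFT-ARC", "RIGHT-ARC"]:
    apply_transition(stack, buffer, deps, t)
assert stack == ["ROOT"] and buffer == []
assert deps == [("ate", "I"), ("ROOT", "ate")]
```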
Complete the sequence of transitions needed for parsing the sentence “To ask those questions is to answer them” with the dependency tree shown below. At each step, indicate the state of the stack and the buffer, as well as which transition to apply at this step, including what, if any, new dependency to add. The first four steps are provided to get you started.
A sentence containing n words will be parsed in how many steps, in terms of n? (Exact, not asymptotic.) Briefly explain why.
Next, you will implement a transition-based dependency parser that uses a neural network as a classifier to decide which transition to apply at a given partial parse state. A partial parse state is collectively defined by a sentence buffer as above, a stack as above with any number of items on it, and a set of correct dependency arcs for the sentence.
Implement the complete and parse_step methods in the PartialParse class in parse.py. These implement the transition mechanism of the parser. Also implement get_n_rightmost_deps and get_n_leftmost_deps. You can run basic (non-exhaustive) tests by running python3 test_parse.py.
Our network will predict which transition should be applied next to a partial parse. In principle, we could use the network to parse a single sentence simply by applying predicted transitions until the parse is complete. However, neural networks run much more efficiently when making predictions about “minibatches” of data at a time; in this case, that means predicting the next transition for many different partial parses simultaneously. We can parse sentences in minibatches according to algorithm 1.
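Algorithm 1 itself appears as a figure in the handout; its shape is a simple batched loop. The sketch below is hypothetical: FakeParse and the always-shift predictor are stand-ins invented here for illustration, while the real partial parses and model live in parse.py and model.py.

```python
# FakeParse is a stand-in invented for this sketch; the real PartialParse
# (with parse_step and a completion check) is implemented in parse.py.

class FakeParse:
    """Completes after a fixed number of steps, recording each transition."""
    def __init__(self, n_steps):
        self.remaining = n_steps
        self.history = []

    @property
    def complete(self):
        return self.remaining == 0

    def parse_step(self, transition):
        self.history.append(transition)
        self.remaining -= 1

def minibatch_parse(parses, predict, batch_size):
    """Repeatedly predict one transition for a minibatch of unfinished parses."""
    unfinished = list(parses)
    while unfinished:
        batch = unfinished[:batch_size]              # take a minibatch
        for parse, t in zip(batch, predict(batch)):  # one transition per parse
            parse.parse_step(t)
        unfinished = [p for p in unfinished if not p.complete]

# A trivial "model" that always predicts the same transition.
always_shift = lambda batch: ["S"] * len(batch)
parses = [FakeParse(3), FakeParse(1), FakeParse(2)]
minibatch_parse(parses, always_shift, batch_size=2)
assert all(p.complete for p in parses)
assert [len(p.history) for p in parses] == [3, 1, 2]
```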
Implement this algorithm in the minibatch_parse function in parse.py. You can run basic (non-exhaustive) tests by running python3 test_parse.py.
Training your model to predict the right transitions will require you to have a notion of how well it performs on each training example. The parser’s ability to produce a good dependency tree for a sentence is measured using an attachment score. This is the percentage of words in the sentence that are assigned as a dependant of the correct head. The unlabelled attachment score (UAS) considers only this, while the labelled attachment score (LAS) considers the label on the dependency relation as well. While this is ultimately the score we want to maximize, it is difficult to use this score to improve our model on a continuing basis.
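As a worked illustration (not the grading code), attachment scores for a single sentence can be computed by comparing predicted (head, label) pairs against the gold standard:

```python
# Illustrative only: each parse is a list of (head_index, label) pairs,
# one per word. UAS counts correct heads; LAS counts correct head-label pairs.

def attachment_scores(gold, pred):
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, las

gold = [(2, "nsubj"), (0, "root"), (2, "dobj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]  # wrong label on word 3
uas, las = attachment_scores(gold, pred)
assert uas == 1.0                 # every head is correct
assert abs(las - 2 / 3) < 1e-9   # one label is wrong
```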
We will use the model's per-transition accuracy (per partial parse) as a proxy for the parser's attachment score, so we will need the correct transitions. But the data set provides only sentences and their corresponding dependency arcs. You must therefore implement an oracle that, given a partial parse and the set of correct, final target dependency arcs, provides the next transition to take to advance the partial parse towards the final solution set of dependencies. The classifier will later be trained to predict the correct transition by minimizing the error of its predictions against the transitions provided by the oracle.
Implement your oracle in the get_oracle method of parse.py. You can run basic tests by running python3 test_parse.py.
The last step is to construct and train a neural network to predict which transition should be applied next, given the state of the stack, buffer, and dependencies. First, the model extracts a feature vector representing the current state. The function that extracts the features that we will use
has been implemented for you in data.py. This feature vector consists of a list of tokens (e.g., the last word in the stack, first word in the buffer, dependant of the second-to-last word in the stack if there is one, etc.). They can be represented as a list of integers:
where m is the number of features and each 0 ≤ wi < |V| is the index of a token in the vocabulary (|V| is the vocabulary size). Using pre-trained word embeddings (i.e., word vectors), our network will first look up the embedding for each word and concatenate them into a single input vector:
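The lookup-and-concatenate step can be sketched as follows; the sizes and the random "embedding matrix" here are made up for illustration, while the real features and embeddings come from data.py:

```python
import numpy as np

# Illustrative sizes only; the real m, d, and |V| come from data.py and the
# pre-trained embeddings.
V, d = 10, 4                  # vocabulary size, embedding dimension
rng = np.random.default_rng(42)
E = rng.random((V, d))        # stand-in for the pre-trained embedding matrix
w = np.array([7, 0, 3])       # m = 3 feature token indices
x = E[w].reshape(-1)          # look up each embedding, then concatenate
assert x.shape == (3 * d,)
assert np.array_equal(x[:d], E[7])
```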
In model.py, implement the neural network classifier governing the dependency parser by filling in the appropriate sections; they are marked by BEGIN and END comments.
You will train and evaluate the model on a corpus of English Web text that has been annotated with Universal Dependencies. Run python3 train.py to train your model and compute predictions on the test data. With everything correctly implemented, you should be able to get an LAS of around 80% on both the validation and test sets (with the best-performing model out of all of the epochs at the end of training).
The transition-based mechanism above is limited to only being able to parse projective dependency trees. A projective dependency tree is one in which the edges can be drawn above the words without crossing other edges when the words, preceded by ROOT, are arranged in linear order. Equivalently, every word forms a contiguous substring of the sentence when taken together with its descendants. The tree in the above figure was projective. The tree below is not projective.
Why is the parsing mechanism described above insufficient to generate non-projective dependency trees?
Implement the is_projective function in graphalg.py. You can run some basic tests based on the trees in the handout by running python3 graphalg.py. Run the count_projective.py script and report the result.
A related concept to projectivity is gap degree. The gap degree of a word in a dependency tree is the least k for which the subsequence consisting of the word and its descendants (both direct and indirect) consists of k + 1 maximally contiguous substrings. Equivalently, the gap degree of a word is the number of gaps in the subsequence formed by the word and all of its descendants, regardless of the size of the gaps. The gap degree of a dependency tree is the greatest gap degree of any word in the tree.
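The definition can be turned into a short computation. The sketch below is illustrative (not part of the assignment code): heads[i-1] gives the head of word i, with 0 denoting ROOT.

```python
from collections import defaultdict

# Illustrative sketch: compute the gap degree of a tree from its head indices.

def gap_degree(heads):
    children = defaultdict(list)
    for i, h in enumerate(heads, start=1):
        children[h].append(i)

    def descendants(i):
        out = {i}
        for c in children[i]:
            out |= descendants(c)
        return out

    def gaps(indices):
        idx = sorted(indices)
        # each break in the sorted index sequence is one gap
        return sum(1 for a, b in zip(idx, idx[1:]) if b - a > 1)

    return max(gaps(descendants(i)) for i in range(1, len(heads) + 1))

assert gap_degree([0, 1, 2]) == 0      # a simple chain: fully contiguous
assert gap_degree([0, 3, 1, 2]) == 1   # word 2's descendants are {2, 4}
```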
What is the gap degree of each of the two trees above? Show your work.
Unlike the transition-based parser above, in this part of the assignment you will be implementing a graph-based parser that doesn't have the same limitation for non-projective trees. In particular, you will be implementing an edge-factored parser, which learns to score possible edges such that the correct edges should receive higher scores than incorrect edges. A sentence containing n words has a dependency graph with n + 1 vertices (with the one extra vertex being for ROOT); there are therefore (n + 1)² possible edges.
The vertices for the sentence words (including ROOT) will each be represented as vectors of size p. The vertices for the entire sentence can then be represented as a matrix S ∈ ℝ^((n+1)×p).¹ Assume that ROOT is placed before the first word in the sentence.
¹ Beware that the vectors are computed differently than in Part 1. There, we used word2vec; here, we use BERT, a contextual representation model that provides a vector representation for a word in the context of a sentence; in other words, the same word will have different vector representations depending on the sentence in which it appears.
The model you are to implement has two components: an arc scorer and a label scorer. The arc scorer produces a single score for each vertex pair; the label scorer produces |R| scores for each vertex pair, where R is the set of dependency relations (det, dobj, etc.). The arc scores allow selecting an edge, after which the label score can be used to select a dependency relation for that edge.
Let a_ij denote the arc score for the dependency edge from vertex j to vertex i. Note the ordering! It may be different from what you're used to: a_ij is a score corresponding to j being the head of i; in other words, a_ij corresponds to an edge from vertex j to vertex i. Then the matrix A with elements a_ij = [A]_ij is defined as:
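The defining equation appears as a figure in the original handout and is not reproduced here. As a hedged sketch only, an edge-factored arc scorer in the style of Dozat and Manning, consistent with the later discussion of a term containing H_A but not D_A, has the form:

```latex
% Sketch, not necessarily the handout's exact formulation.
% h_j and d_i are MLP-transformed head and dependant representations
% of the vertex vectors s_j and s_i.
h_j = \mathrm{MLP}_{H_A}(s_j), \qquad d_i = \mathrm{MLP}_{D_A}(s_i)
a_{ij} = d_i^{\top} W_A \, h_j + b_A^{\top} h_j
```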
Implement create_arc_layers and score_arcs in graphdep.py. You will need to determine the appropriate dimensions for WA and bA, as well as how to implement the arc score calculations for the batched sentence tensors. You should be able to do this entirely with PyTorch tensor operations; using a loop for this question will result in a penalty.
For effective, stable training, especially of deeper networks, it is important to initialize weight matrices carefully. A standard initialization strategy for ReLU layers is called Kaiming initialization². For a given layer's weight matrix W of dimension m × n, where m is the number of input units to the layer and n is the number of units in the layer, Kaiming initialization samples values W_ij from a Gaussian distribution with mean µ = 0 and variance σ² = 2/m.
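As a quick sanity check (illustrative only; the layer sizes here are arbitrary), sampling from this Gaussian does produce the stated variance:

```python
import numpy as np

# Empirically verify the Kaiming criterion: W_ij ~ N(0, 2/m).
m, n = 400, 300
rng = np.random.default_rng(0)
W = rng.normal(loc=0.0, scale=np.sqrt(2.0 / m), size=(m, n))
# With 120,000 samples the empirical variance is very close to 2/m = 0.005.
assert abs(W.var() - 2.0 / m) < 1e-3
```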
2Also referred to as He initialization.
However, it is common to instead sample from a uniform distribution U (a,b) with lower bound a and upper bound b; in fact, the create_* methods in graphdep.py direct you to initialize some parameters according to a uniform distribution. Derive the values of a and b that yield a uniform distribution U (a,b) with the mean and variance given above.
The label scorer proceeds in a similar manner, but is a bit more complex. Let l_ij ∈ ℝ^|R| denote a vector of scores, one for each possible dependency relation r ∈ R. We use MLPs similar to those for the arc scorer (but with different dimensionality), and then compute the score vector as:
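As with the arc scorer, the defining equation appears as a figure in the original handout. As a hedged sketch only, a label scorer in the same biaffine style, consistent with the later discussion in which both head-only and dependant-only terms appear, has the form:

```latex
% Sketch, not necessarily the handout's exact formulation.
h_j^{(L)} = \mathrm{MLP}_{H_L}(s_j), \qquad d_i^{(L)} = \mathrm{MLP}_{D_L}(s_i)
l_{ij} = {d_i^{(L)}}^{\top} U_L \, h_j^{(L)} + W_L^{(h)} h_j^{(L)} + W_L^{(d)} d_i^{(L)} + b_L
% where U_L is a stack of |R| bilinear terms, so that l_{ij} \in \mathbb{R}^{|R|}.
```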
Implement create_label_layers and score_labels in graphdep.py. As above, you will need to determine the appropriate dimensions for the trainable parameters and how to implement the label score calculations for the batched sentence tensors. You should be able to do this entirely with PyTorch tensor operations; using a loop for this question will result in a penalty.
In the arc scorer, the score function includes a term that has HA but not DA, but there isn’t a term that has the opposite inclusions (DA but not HA). For the label scorer, both are included. Why does it not make sense to include a term just for DA, but it does for DL?
In standard classification problems, we typically multiply the input by a weight matrix and add a per-class bias to produce a class score for the given input. Why do we have to multiply WA and WL by (transformed versions of) the input twice in this case?
There are some constraints on which arcs and arc-label combinations are possible. Implement these constraints in mask_possible in graphdep.py.
Finally, how do we make our final predictions given an input sentence? Since we know that our desired answer has a tree structure, and since the parsing model is trained to increase the scores assigned to the correct edges, we can use a maximum spanning tree algorithm to select the set of arcs. Then, for each arc from i to j, we can use argmax over l_ij to predict the dependency relation.
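To see why a tree-constrained decoder matters, consider this toy score matrix (invented for illustration): per-word argmax over heads can select a cycle, which is not a valid dependency tree.

```python
import numpy as np

# Toy scores invented for illustration: scores[i, j] is the score for vertex j
# being the head of word i (vertex 0 is ROOT).
scores = np.array([
    [0.0, 0.0, 0.0],   # row 0 (ROOT) is unused
    [0.1, 0.0, 0.9],   # word 1's best-scoring head is word 2
    [0.2, 0.8, 0.0],   # word 2's best-scoring head is word 1
])
heads = scores[1:].argmax(axis=1)  # greedy per-word head choice
# The result is 1 -> 2 and 2 -> 1: a cycle, with nothing attached to ROOT,
# so the greedy choice is not a valid dependency tree.
assert list(heads) == [2, 1]
```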
Why do we need to use a maximum spanning tree algorithm to make the final arc prediction? Why can’t we just use argmax, like we would in a typical classification scenario, and like we do for the dependency relations? What happens if we use argmax instead?
Finish the rest of the implementation: is_single_root_tree and single_root_mst in graphalg.py. You can refer to the pseudocode in the third edition of the Jurafsky & Martin textbook to guide your implementation.3
As in Part 1, you will train and evaluate the model on a corpus of English Web text that has been annotated with Universal Dependencies. Run python3 train.py to train your model and compute predictions on the test data. With everything correctly implemented, you should be able to get an LAS of around 82% on both the validation and test sets (with the best-performing model out of all of the epochs at the end of training).4
3The name of the relevant algorithm is Chu-Liu-Edmonds; the pseudocode is given in Fig. 14.13 in the dependency parsing chapter, which you can download here.
4Keep in mind that the two parsers use different word vectors, so the LAS values that the two parsers get aren’t directly comparable.
Find at least three example sentences where the transition-based parser gives a different parse than the graph-based parser, but both parses are wrong as judged by the annotation in the corpus. For each example, argue for which of the two parses you prefer.
$ python3 train.py --debug
This will cause the code to run over a small subset of the data so that each training epoch takes less time. Remember to change the setting back prior to submission, and before doing your final run.
(i.e., assuming that you run the command while your shell is in the same directory as train.py). Note that the GPU machines are shared resources, so there may be limits on how long your code is allowed to run.
The gpu-train.sh script should be sufficient; it takes care of submitting the job to the cluster-management software and running it. If you like, you are free to manage this yourself; see the documentation online at the relevant page on the Teaching Labs site. The partition you can use for this class is csc485.
Remote GPU access via slurm is currently unavailable on teach.cs as of the release of this assignment due to an ongoing system upgrade. We will make an announcement on Piazza if this changes.
If you are accessing the labs in person, the Teaching Labs FAQ page lists the labs that have GPU-equipped machines.
Submission is electronic and can be done from any Teaching Labs machine using the submit command:
$ submit -c <course> -a A1 <filename-1> ... <filename-n>
where <course> is csc485h or csc2501h depending on which course you’re registered in, and <filename-1> to <filename-n> are the n files you are submitting. The files you are to submit are as follows:
Again, except for is_single_root_tree and single_root_mst, ensure that there are no changes outside the BEGIN and END boundary comments provided other than filling in your details at the top of the file; you will lose marks if there are. Do not change or remove the boundary comments either.
You will be using PyTorch to implement components of your neural dependency parser. PyTorch is an open-source library for numerical computation that provides automatic differentiation, making the back-propagation aspect of neural networks easier. Computation is done in units of tensors; the storage of and operations on tensors is provided in both CPU and GPU implementations, making it simple to switch between the two.
We recommend trying some of the tutorials on the official PyTorch site to get up to speed on PyTorch’s mechanics. In particular, the Tensors and nn module sections of the Learning PyTorch with Examples tutorial will be most relevant for this assignment.
If you want to run this code on your own machine, you can install the torch package via pip3, or any of the other installation options available on the PyTorch site. Make sure to use PyTorch version 1.9. You will also need to install the conllu and transformers packages.
Make sure to consult the PyTorch documentation as needed. The API documentation for torch, torch.nn, torch.nn.functional, and torch.Tensor will be useful while implementing the code in this assignment.