A Method for Learning Large-Scale Computational Construction Grammars from Semantically Annotated Corpora
This interactive notebook accompanies the following paper:
Authors. (submitted). A Method for Learning Large-Scale Computational Construction Grammars from Semantically Annotated Corpora.
Abstract: We present a method for learning large-scale, broad-coverage construction grammars from corpora of language use. Starting from utterances annotated with constituency structure and semantic frames, the method facilitates the learning of human-interpretable computational construction grammars that capture the intricate relationship between syntactic structures and the semantic relations they express. The resulting grammars consist of networks of tens of thousands of constructions formalised within the Fluid Construction Grammar framework. Not only do these grammars support the frame-semantic analysis of open-domain text, they also house a trove of information about the syntactico-semantic usage patterns present in the data they were learnt from. The method and learnt grammars contribute to the scaling of usage-based, constructionist approaches to language, as they corroborate the scalability of a number of fundamental construction grammar conjectures while also providing a practical instrument for the constructionist study of English argument structure in broad-coverage corpora.
This notebook relies on the fcg-propbank subsystem for learning large-scale construction grammars from PropBank-annotated corpora. It demonstrates (i) how a pretrained grammar comprising tens of thousands of constructions can be loaded into an FCG agent and used to extract semantic frames from open-domain text, (ii) how the grammar network can be mined for linguistic insights, and (iii) how a new grammar can be learnt from annotated data.
[ ]:
# Run this cell if you have not yet installed these packages
! pip install pyfcg
! pip install nltk
[1]:
import pyfcg as fcg
fcg.init(launch=False)
Loading a pre-trained grammar
We start by creating an FCG agent. In this case, the agent is an instance of the fcg.PropBankAgent class, a subclass of the fcg.Agent class provided by the fcg-propbank subsystem.
[2]:
propbank_agent = fcg.PropBankAgent()
We release three pretrained, precompiled FCG grammars for English argument structure that were trained using the method described in the paper. Once you have selected the pretrained grammar of your choice, you can load it into our agent using its load_grammar_image method. The loading can take up to 30 seconds.
[3]:
# Grammar trained on full training corpus, all roleset instances
pretrained_grammar_image_full = '/Users/katrien/Desktop/propbank-learned.fcg'
# Grammar trained on full training corpus, only roleset instances that have at least one role apart from the V-role.
pretrained_grammar_image_filtered_single_role_rolesets = '/Users/katrien/Desktop/propbank-learned-no-single-role-frames.fcg'
# Grammar trained on full training corpus, hapaxes removed after training (constructions with frequency of 1)
pretrained_grammar_image_no_hapaxes = '/Users/katrien/Desktop/propbank-learned-no-hapaxes.fcg'
[5]:
propbank_agent.load_grammar_image(pretrained_grammar_image_filtered_single_role_rolesets)  # Change the argument to load another image
propbank_agent
[5]:
<Agent: agent (id: agent-25) ~ 39046 constructions>
Let us have a look at the distribution of constructions in the grammar:
[6]:
def get_all_cxns_of_type(grammar, cxn_type):
    """Return all constructions in `grammar` whose 'label' attribute equals `cxn_type`."""
    filtered_cxns = []
    for cxn in grammar.cxns.values():
        if cxn.attributes['label'] == cxn_type:
            filtered_cxns.append(cxn)
    return filtered_cxns
argst_cxns = get_all_cxns_of_type(propbank_agent.grammar, 'argument-structure-cxn')
fe_cxns = get_all_cxns_of_type(propbank_agent.grammar, 'lexical-cxn')
roleset_cxns = get_all_cxns_of_type(propbank_agent.grammar, 'word-sense-cxn')
print('Number of frame-evoking cxns:')
display(len(fe_cxns))
print('Number of argument structure cxns:')
display(len(argst_cxns))
print('Number of roleset cxns:')
display(len(roleset_cxns))
Number of frame-evoking cxns:
8626
Number of argument structure cxns:
22535
Number of roleset cxns:
7885
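The three calls above each scan the whole grammar. As a minimal sketch, assuming only that every construction exposes an `attributes` dictionary with a `'label'` entry (which the function above already relies on), the same tally can be obtained in a single pass with `collections.Counter`. The `toy_cxns` list is a hypothetical stand-in for `propbank_agent.grammar.cxns.values()`, so that the sketch runs without a loaded grammar.

```python
from collections import Counter
from types import SimpleNamespace


def count_cxns_by_label(cxns):
    """Count constructions per value of their 'label' attribute in one pass."""
    return Counter(cxn.attributes['label'] for cxn in cxns)


# Hypothetical stand-ins for construction objects; not real grammar content.
toy_cxns = [
    SimpleNamespace(attributes={'label': 'lexical-cxn'}),
    SimpleNamespace(attributes={'label': 'argument-structure-cxn'}),
    SimpleNamespace(attributes={'label': 'argument-structure-cxn'}),
    SimpleNamespace(attributes={'label': 'word-sense-cxn'}),
]

label_counts = count_cxns_by_label(toy_cxns)
print(label_counts)
```

With a loaded grammar, passing `propbank_agent.grammar.cxns.values()` should reproduce the three counts printed above, which together add up to the 39046 constructions reported for the agent.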
[7]:
print('Number of nodes in the grammar network:')
display(len(propbank_agent.grammar.categorial_network['nodes']))
print('Number of links in the grammar network:')
len(propbank_agent.grammar.categorial_network['edges'])
Number of nodes in the grammar network:
36014
Number of links in the grammar network:
[7]:
720844
When it comes to their frequency of occurrence, the constructions in the grammar network follow a Zipfian distribution, i.e. the same distribution as the one observed for the lexical items of a language. This means that the frequency of occurrence of a construction is approximately inversely proportional to the rank of the construction in a table in which all constructions are sorted by decreasing frequency. Interestingly, this observation also holds for each group of constructions in isolation, including the group of argument structure constructions, which are not tied to specific lexical items or other substantive material. You can run the following cell to inspect the Zipfian distribution of the constructions in the grammar network.
[8]:
import matplotlib.pyplot as plt
argst_cxns.sort(key=lambda cxn: cxn.attributes['score'], reverse=True)
fe_cxns.sort(key=lambda cxn: cxn.attributes['score'], reverse=True)
roleset_cxns.sort(key=lambda cxn: cxn.attributes['score'], reverse=True)
argst_freqs = [cxn.attributes['score'] for cxn in argst_cxns]
fe_freqs = [cxn.attributes['score'] for cxn in fe_cxns]
roleset_freqs = [cxn.attributes['score'] for cxn in roleset_cxns]
# plot:
fig, ax = plt.subplots(layout='constrained',figsize=(4,4))
ax.loglog(argst_freqs, 'xkcd:blue', label='Argument-structure cxns')
ax.loglog(fe_freqs, 'xkcd:sky blue', label='Frame-evoking cxns')
ax.loglog(roleset_freqs, 'xkcd:grass green', label='Roleset cxns')
fig.legend(loc='outside upper right')
plt.show()
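The plot gives a visual impression of the distribution. As a complementary and purely illustrative sketch, the slope of the log-log curve can be estimated with an ordinary least-squares fit of log frequency against log rank; a slope close to -1 corresponds to a frequency that is inversely proportional to rank. The helper `zipf_slope` and the synthetic frequencies below are assumptions made for this example and are not part of pyfcg.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Frequencies are sorted in decreasing order first; rank 1 is the most
    frequent item. A slope near -1 corresponds to frequency ~ 1 / rank.
    """
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic, idealised frequencies (1000 / rank); not taken from the grammar.
synthetic_freqs = [1000.0 / rank for rank in range(1, 501)]
slope = zipf_slope(synthetic_freqs)
print(round(slope, 3))
```

On the loaded grammar, `zipf_slope(argst_freqs)` would give the corresponding estimate for the argument structure constructions. A plain least-squares fit on log-log data is a rough diagnostic rather than a rigorous test of a power law.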
Grammar network analysis
The constructions of the grammar network are interconnected through categorial links, whose weights indicate how often each combination of constructions has been observed in the training corpus. The network thereby holds a trove of empirical information about the frequency of particular syntactico-semantic usage patterns. Consider, for instance, the ditransitive double object construction. We can straightforwardly query the grammar network for those rolesets that are most typically associated with this construction:
[9]:
# The ditransitive construction maps the ARG0 role to an NP, the V role to a Verb,
# the ARG2 role to an NP and the ARG1 role to another NP.
schema = [(':arg0', 'np'),
          ('v', 'v'),
          (':arg2', 'np'),
          (':arg1', 'np')]
# Now we retrieve all rolesets that were observed to fill the V slot of this schema, along with their frequency:
propbank_agent.retrieve_rolesets_for_schema(schema)
[9]:
{'give.01': 650.0,
'tell.01': 128.0,
'show.01': 62.0,
'send.01': 51.0,
'ask.01': 32.0,
'pay.01': 28.0,
'bring.01': 27.0,
'grant.01': 22.0,
'owe.01': 15.0,
'teach.01': 13.0,
'provide.01': 11.0,
'wish.01': 10.0,
'sell.01': 9.0,
'charge.01': 9.0,
'do.02': 8.0,
'feed.01': 7.0,
'lend.01': 6.0,
'write.01': 6.0,
'hand.01': 5.0,
'deny.01': 5.0,
'save.01': 4.0,
'award.01': 3.0,
'leave.12': 3.0,
'find.01': 3.0,
'take.10': 2.0,
'email.01': 2.0,
'fax.01': 2.0,
'deliver.01': 2.0,
'extend.02': 2.0,
'allow.02': 2.0,
'earn.01': 2.0,
'guarantee.01': 2.0,
'mail.01': 2.0,
'leave.02': 2.0,
'fine.01': 2.0,
'cc.01': 1.0,
'will.01': 1.0,
'deal.02': 1.0,
'serve.02': 1.0,
'name.01': 1.0,
'win.01': 1.0,
'sing.01': 1.0,
'call.01': 1.0,
'assign.01': 1.0}
While tell.01 (pass along information) integrates frequently with the double object construction, the tell.02 roleset (distinguish, determine) never does. The network reveals that this roleset is instead strongly associated with constructions that syntactically express its ARG1 role by means of a subclause (“couldn’t tell whether …”, “can tell that …”).
[10]:
propbank_agent.retrieve_cxn_schemata_for_roleset('tell.01')
[10]:
{'arg0(np)+v(v)+arg2(np)+arg1(sbar)': 323.0,
'arg0(np)+v(v)+arg2(np)+arg1(s vp)': 168.0,
'arg0(np)+v(v)+arg2(np)+arg1(sbar s)': 151.0,
'arg0(np)+v(v)+arg2(np)+arg1(np)': 128.0,
'arg0(np)+v(v)+arg2(np)': 98.0,
'arg0(np)+v(v)+arg2(np)+arg1(pp)': 87.0,
'arg0(np)+v(v)+arg1(np)': 69.0,
'arg1(wp)+arg0(np)+v(v)+arg2(np)': 56.0,
'arg0(np)+v(v)+arg2(np)+arg1(s)': 44.0,
'v(v)+arg2(np)+arg1(sbar)': 31.0,
'v(v)+arg2(np)+arg1(np)': 24.0,
'arg0(np)+v(v)+arg1(pp)': 21.0,
'arg0(np)+v(v)+arg2(np)+arg1(dt)': 20.0,
'v(v)+arg1(np)': 18.0,
'arg1(np)+arg0(np)+v(v)+arg2(np)': 18.0,
'v(v)+arg2(np)+arg1(pp)': 17.0,
'arg2(np)+v(v)+arg1(s vp)': 17.0,
'arg2(np)+v(v)+arg1(sbar s)': 16.0,
'arg2(np)+v(v)+arg1(sbar)': 16.0,
'arg1(np)+v(v)': 16.0,
'v(v)+arg2(np)+arg1(s vp)': 15.0,
'v(v)+arg2(np)+arg1(sbar s)': 13.0,
'v(v)+arg2(np)+arg1(s)': 13.0,
'v(v)+arg2(np)': 12.0,
'arg0(np)+v(v)+arg1(np)+arg2(pp)': 12.0,
'arg1(np)+r-arg1(wdt)+arg0(np)+v(v)+arg2(np)': 12.0,
'arg0(np)+v(v)+arg1(sbar)': 11.0,
'arg0(np)+v(v)': 11.0,
'arg1(np)+v(v)+arg2(np)': 9.0,
'arg1(s)+arg0(np)+v(v)+arg2(np)': 9.0,
'arg1(np)+v(v)+arg2(pp)': 8.0,
'arg1(np)+arg0(np)+v(v)+arg2(np)+c-arg1(in)': 7.0,
'arg0(dt)+r-arg0(wp)+v(v)+arg1(np)': 6.0,
'arg0(np)+v(v)+arg2(np)+arg1(v)': 5.0,
'arg2(np)+v(v)+arg1(np)': 5.0,
'arg2(np)+v(v)': 5.0,
'arg1(np)+arg0(np)+v(v)': 5.0,
'arg1(np)+v(v)+arg0(pp)': 5.0,
'arg0(np)+v(v)+arg1(nns)': 4.0,
'arg1(dt)+arg0(np)+v(v)+arg2(np)': 4.0,
'arg1(wp)+arg0(np)+v(v)+arg2(np)+c-arg1(pp)': 4.0,
'arg1(nn)+arg0(np)+v(v)+arg2(np)': 4.0,
'arg1(wp)+arg0(np)+v(v)': 4.0,
'arg2(np)+v(v)+arg0(pp)+arg1(sbar)': 4.0,
'arg0(np)+v(v)+arg2(prp)+arg1(dt)': 4.0,
'arg1(wdt)+arg0(np)+v(v)+arg2(np)': 4.0,
'arg0(np)+r-arg0(wp)+v(v)+arg2(np)+arg1(pp)': 4.0,
'arg0(np)+r-arg0(wp)+v(v)+arg1(np)': 4.0,
'arg0(dt)+v(v)+arg2(np)+arg1(np)': 3.0,
'arg0(nn)+r-arg0(wp)+v(v)+arg2(np)+arg1(np)': 3.0,
'arg0(prp)+v(v)+arg2(np)+arg1(sbar s)': 3.0,
'arg0(np)+v(v)+arg2(np)+arg1(wp)': 3.0,
'arg2(np)+v(v)+arg1(s)': 3.0,
'arg1(np)+r-arg1(wdt)+arg0(np)+v(v)+arg2(pp)': 3.0,
'arg0(prp)+v(v)+arg2(np)+arg1(sbar)': 3.0,
'arg0(np)+v(v)+arg2(np)+arg1(jj)': 3.0,
'v(v)+arg1(dt)': 3.0,
'arg0(np)+v(v)+arg2(np)+arg1(in)': 3.0,
'arg1(dt)+v(v)': 3.0,
'arg2(np)+arg0(np)+v(v)': 3.0,
'arg0(np)+v(v)+arg2(np)+arg1(rb)': 3.0,
'arg0(np)+v(v)+arg2(prp)+arg1(np)': 3.0,
'arg0(np)+r-arg0(wp)+v(v)+arg2(np)+arg1(sbar)': 3.0,
'arg1(np)+r-arg1(wdt)+arg0(np)+v(v)+arg2(np)+c-arg1(in)': 3.0,
'arg0(np)+r-arg0(wp)+v(v)+arg2(np)+arg1(sbar s)': 2.0,
'arg1(np)+r-arg1(in)+arg0(np)+v(v)': 2.0,
'v(v)+arg1(sbar)': 2.0,
'arg0(np)+v(v)+arg2(np)+arg1(wrb)': 2.0,
'arg0(wp)+v(v)+arg1(np)': 2.0,
'arg1(np)+r-arg1(wdt)+v(v)': 2.0,
'arg0(nn)+r-arg0(wp)+v(v)+arg2(np)+arg1(pp)': 2.0,
'v(v)+arg2(np)+arg1(sq)': 2.0,
'arg0(np)+r-arg0(wdt)+v(v)+arg2(np)+arg1(pp)': 2.0,
'arg1(wp)+arg2(np)+v(v)': 2.0,
'arg1(np)+arg2(np)+v(v)': 2.0,
'arg1(np)+r-arg1(wdt)+v(v)+arg2(pp)': 2.0,
'v(v)+arg1(np)+arg2(pp)': 2.0,
'arg1(s)+arg2(np)+v(v)': 2.0,
'v(nn)+arg1(dt)': 2.0,
'arg1(whnp)+v(v)+arg2(np)': 2.0,
'arg1(nn)+r-arg1(wdt)+arg0(np)+v(v)+arg2(np)': 2.0,
'arg0(dt)+r-arg0(wp)+v(v)+arg2(np)+arg1(sbar)': 2.0,
'arg0(np)+v(v)+arg1(dt)': 2.0,
'arg2(np)+r-arg2(wdt)+v(v)+arg1(pp)': 1.0,
'arg0(nnp)+v(v)+arg2(np)+arg1(sbar)': 1.0,
'arg1(jj)+v(v)+arg2(np)': 1.0,
'arg0(np)+v(v)+arg1(pp)+arg2(pp)': 1.0,
'arg0(dt)+r-arg0(wp)+v(v)+arg2(np)+arg1(np)': 1.0,
'arg1(sbar)+v(v)': 1.0,
'arg1(np)+v(v)+arg2(np)+c-arg1(in)': 1.0,
'arg0(dt)+v(v)+arg1(np)': 1.0,
'arg0(nn)+v(v)+arg2(np)+arg1(pp)': 1.0,
'arg0(prp)+v(v)+arg1(np)': 1.0,
'arg0(np)+v(v)+arg1(np)+c-arg1(pp)': 1.0,
'arg1(np)+v(np)': 1.0,
'arg0(nnp)+r-arg0(wp)+v(v)+arg1(np)': 1.0,
'arg0(np)+v(v)+arg2(np)+arg1(to)': 1.0,
'arg1(whnp)+arg0(np)+v(v)+arg2(np)': 1.0,
'arg0(wp)+v(v)+arg2(np)': 1.0,
'arg1(wp)+arg0(np)+v(v)+arg2(np)+c-arg1(np)': 1.0,
'arg0(np)+v(v)+arg1(np)+arg2(sbar)': 1.0,
'v(v)+arg1(sbar s)': 1.0,
'v(v)+arg2(fw)': 1.0,
'v(v)+arg1(s vp)+arg0(pp)': 1.0,
'arg0(np)+r-arg0(wp)+v(v)+arg1(nns)': 1.0,
'arg0(dt)+r-arg0(wp)+v(v)+arg1(pp)': 1.0,
'arg0(prp)+v(v)+arg2(np)': 1.0,
'arg0(np)+v(v)+arg2(np)+arg1(vp)': 1.0,
'arg0(nns)+r-arg0(wdt)+v(v)+arg1(np)': 1.0,
'v(v)+arg2(np)+arg1(dt)': 1.0,
'arg0(np)+v(v)+c-arg0(np)': 1.0,
'v(v)+arg1(pp)': 1.0,
'arg2(dt)+v(v)+arg1(np)': 1.0,
'v(v)+arg2(np)+arg1(sbarq)': 1.0,
'arg1(wp)+v(v)+arg2(np)+c-arg1(pp)': 1.0,
'arg1(wp)+arg0(dt)+v(v)+arg2(np)': 1.0,
'arg1(nns)+r-arg1(wdt)+arg0(np)+v(v)': 1.0,
'arg2(dt)+arg0(np)+v(v)': 1.0,
'arg0(np)+v(v)+arg2(np)+arg1(nn)': 1.0,
'arg0(np)+v(v)+arg2(np)+arg1(nns)': 1.0,
'arg0(prp)+v(v)+arg2(np)+arg1(pp)': 1.0,
'arg1(dt)+r-arg1(wdt)+arg0(np)+v(v)+arg2(np)': 1.0,
'arg0(np)+v(v)+arg2(pp)+arg1(s)': 1.0,
'arg0(np)+v(v)+arg1(np)+arg2(np)': 1.0,
'arg0(nns)+r-arg0(wp)+v(v)+arg1(np)': 1.0,
'arg1(sbar)+arg0(np)+v(v)+arg2(np)': 1.0,
'v(v)+arg2(np)+arg1(in)': 1.0,
'arg1-dsp(np)+arg0(nnp)+v(v)+c-arg1-dsp(vp)': 1.0,
'v(v)+arg1(dt)+arg2(pp)+c-arg1(s)': 1.0,
'arg0(wdt)+v(v)+arg2(np)+arg1(pp)': 1.0,
'arg1(np)+r-arg1(wdt)+arg2(np)+v(v)': 1.0,
'arg0(np)+r-arg0(wp)+v(v)+arg2(np)+arg1(dt)': 1.0,
'arg0(np)+arg1(nn)+v(v)+arg2(np)': 1.0,
'arg1(dt)+arg0(np)+v(v)': 1.0,
'arg0(wp)+v(v)+arg2(np)+arg1(sbar)': 1.0,
'arg0(np)+v(v)+arg2(np)+arg1(sbarq)': 1.0,
'arg2(np)+v(v)+arg0(pp)': 1.0,
'arg1(np)+r-arg1(wdt)+arg0(np)+v(v)': 1.0,
'arg1(nns)+arg2(np)+v(v)': 1.0,
'arg2(np)+c-arg2(np)+v(v)+arg1(sbar)': 1.0,
'arg1(np)+r-arg1(whpp)+arg0(np)+v(v)': 1.0,
'v(v)+arg2(np)+arg1(:)': 1.0,
'arg0(wp)+v(v)+arg2(np)+arg1(dt)': 1.0,
'arg2(pp)+arg0(np)+v(v)+arg1(np)': 1.0,
'arg1(np)+arg0(np)+v(v)+c-arg1(pp)': 1.0,
'arg0(np)+r-arg0(wdt)+v(v)+arg1(np)': 1.0,
'arg0(nn)+v(v)+arg2(np)+arg1(dt)': 1.0,
'v(v)+arg1(dt)+arg2(pp)': 1.0,
'arg2(np)+v(v)+arg0(pp)+arg1(s vp)': 1.0,
'arg1(whnp)+arg0(np)+v(v)+arg2(np)+c-arg1(pp)': 1.0}
[11]:
propbank_agent.retrieve_cxn_schemata_for_roleset('tell.02')
[11]:
{'arg0(np)+v(v)+arg1(sbar)': 13.0,
'v(v)+arg1(sbar)': 11.0,
'arg0(np)+v(v)': 5.0,
'arg0(np)+v(v)+arg1(np)': 5.0,
'arg0(np)+v(v)+arg1(np)+arg2(pp)': 2.0,
'arg0(np)+v(v)+arg1(sbar s)': 1.0,
'arg1(wp)+arg0(np)+v(v)+arg2(pp)': 1.0,
'arg1(wp)+arg0(dt)+v(v)+arg2(np)': 1.0,
'arg1(np)+v(v)': 1.0,
'arg0(np)+v(v)+arg2(np)': 1.0,
'v(jj)+arg0(nn)': 1.0,
'arg0(np)+v(v)+arg2(pp)+arg1(sbar)': 1.0,
'v(v)+arg2(pp)': 1.0,
'arg1(sbar)+arg0(np)+v(v)+arg2(pp)': 1.0}
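The association between tell.02 and subclausal ARG1 realisations can be quantified directly from the dictionary above. The following sketch assumes that schema keys always have the form `role(category)+role(category)+…`, as they do in the output shown here; the helpers `parse_schema` and `share_with_realisation` are hypothetical names introduced for illustration and are not part of pyfcg. The frequencies are copied from the tell.02 output.

```python
import re

# Frequencies copied from the tell.02 output above.
tell_02_schemata = {
    'arg0(np)+v(v)+arg1(sbar)': 13.0,
    'v(v)+arg1(sbar)': 11.0,
    'arg0(np)+v(v)': 5.0,
    'arg0(np)+v(v)+arg1(np)': 5.0,
    'arg0(np)+v(v)+arg1(np)+arg2(pp)': 2.0,
    'arg0(np)+v(v)+arg1(sbar s)': 1.0,
    'arg1(wp)+arg0(np)+v(v)+arg2(pp)': 1.0,
    'arg1(wp)+arg0(dt)+v(v)+arg2(np)': 1.0,
    'arg1(np)+v(v)': 1.0,
    'arg0(np)+v(v)+arg2(np)': 1.0,
    'v(jj)+arg0(nn)': 1.0,
    'arg0(np)+v(v)+arg2(pp)+arg1(sbar)': 1.0,
    'v(v)+arg2(pp)': 1.0,
    'arg1(sbar)+arg0(np)+v(v)+arg2(pp)': 1.0,
}

# One slot looks like 'arg1(sbar s)': a role name followed by a category in parentheses.
SLOT_PATTERN = re.compile(r'^(?P<role>[^()]+)\((?P<category>.*)\)$')


def parse_schema(schema):
    """Split a schema string such as 'arg0(np)+v(v)' into (role, category) pairs."""
    slots = []
    for slot in schema.split('+'):
        match = SLOT_PATTERN.match(slot)
        if match is None:
            raise ValueError(f"unexpected slot format: {slot!r}")
        slots.append((match.group('role'), match.group('category')))
    return slots


def share_with_realisation(schemata, role, categories):
    """Fraction of the total frequency in which `role` is filled by one of `categories`."""
    total = sum(schemata.values())
    matching = sum(
        freq for schema, freq in schemata.items()
        if any(r == role and c in categories for r, c in parse_schema(schema))
    )
    return matching / total


sbar_share = share_with_realisation(tell_02_schemata, 'arg1', {'sbar', 'sbar s'})
print(f"{sbar_share:.2f}")
```

For tell.02, 27 of the 45 observed instances (a share of 0.60) express ARG1 as `sbar` or `sbar s`. The same function can be applied to the dictionary returned for tell.01 to compare the two rolesets.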
A different question to which the grammar network can provide a straightforward answer concerns the similarity of constructions in terms of their frequency of combination with other constructions. For example, the frame-evoking constructions in the network that are closest to the tell(verb)-cxn in terms of their co-occurrence with the argument structure constructions of the grammar are the ask(verb)-cxn, the remind(verb)-cxn and the teach(verb)-cxn, while the swim(verb)-cxn and the consist(verb)-cxn are among the most distant ones. The similarity of constructions is computed in terms of their weighted cosine similarity, where two nodes with the exact same weighted links to all other nodes would be perfectly similar.
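To make the similarity measure concrete, here is a minimal sketch of the textbook cosine similarity over sparse vectors of link weights. It assumes that each node is represented as a dictionary mapping neighbouring nodes to link weights; the node names and weights are hypothetical, and the exact weighting used inside `get_closest_categories` may differ from this illustration.

```python
import math


def weighted_cosine_similarity(links_a, links_b):
    """Cosine similarity between two sparse vectors of link weights.

    Each argument maps a neighbouring node to the weight of the link to it;
    missing neighbours count as weight 0.
    """
    shared = set(links_a) & set(links_b)
    dot = sum(links_a[n] * links_b[n] for n in shared)
    norm_a = math.sqrt(sum(w * w for w in links_a.values()))
    norm_b = math.sqrt(sum(w * w for w in links_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a node without links is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical link weights between verbs and argument structure schemata.
verb_a = {'schema-1': 3.0, 'schema-2': 4.0}
verb_b = {'schema-1': 3.0, 'schema-2': 4.0}
verb_c = {'schema-3': 5.0}

print(weighted_cosine_similarity(verb_a, verb_b))  # identical links
print(weighted_cosine_similarity(verb_a, verb_c))  # no shared links
```

Two nodes with identical weighted links obtain the maximal similarity, while nodes that share no neighbours obtain a similarity of zero.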
[12]:
propbank_agent.get_closest_categories('tell(v)', link_type="lex-gram")
[12]:
{'tell(v)': 1.0,
'ask(v)': 0.6255121,
'remind(v)': 0.38473272,
'teach(v)': 0.37172282,
'enable(v)': 0.3286541,
'require(v)': 0.25875446,
'permit(v)': 0.25699338,
'give(v)': 0.24487078,
'warn(v)': 0.22339688,
'trust(v)': 0.22114395,
'show(v)': 0.2209076,
'bid(v)': 0.1947823,
'serve(v)': 0.19148244,
'pay(v)': 0.19026785,
'question(v)': 0.1826872,
'feed(v)': 0.18126795,
'help(v)': 0.17693255,
'dispute(v)': 0.17144959,
'forgive(v)': 0.16765452,
'provide(v)': 0.16644527,
'write(v)': 0.16620483,
'win(v)': 0.16525294,
'guarantee(v)': 0.16460607,
'grant(v)': 0.15787788,
'send(v)': 0.1553325,
'discount(v)': 0.15291488,
'address(v)': 0.15039072,
'request(v)': 0.15023443,
'slap(v)': 0.14972283,
'owe(v)': 0.14402353,
'hear(v)': 0.14325829,
'caution(v)': 0.14183729,
'curse(v)': 0.1408344,
'play(v)': 0.1401356,
'militarize(v)': 0.14002373,
'present(v)': 0.13855645,
'declare(v)': 0.1381155,
'read(v)': 0.13752789,
'lobby(v)': 0.13741035,
'see(v)': 0.1373447,
'sell(v)': 0.13645628,
'blast(v)': 0.13592457,
'remember(v)': 0.13475849,
'provoke(v)': 0.13457223,
'lose(v)': 0.13411066,
'beat(v)': 0.13400452,
'obey(v)': 0.13357936,
'understand(v)': 0.13328536,
'follow(v)': 0.13324358,
'forbid(v)': 0.13317077,
'sign(v)': 0.13308543,
'leave(v)': 0.13301257,
'join(v)': 0.13290775,
'deny(v)': 0.13265772,
'find(v)': 0.13251019,
'take(v)': 0.13184984,
'worship(v)': 0.13166143,
'receive(v)': 0.13113153,
'trim(v)': 0.13101916,
'blame(v)': 0.13084582,
'enter(v)': 0.13050969,
'introduce(v)': 0.1301215,
'meet(v)': 0.12974042,
'respect(v)': 0.12972683,
'rinse(v)': 0.12952194,
'do(v)': 0.12941861,
'recognize(v)': 0.12841635,
'accept(v)': 0.12779175,
'demand(v)': 0.12775509,
'buy(v)': 0.12762971,
'produce(v)': 0.12735734,
'share(v)': 0.12711399,
'recommend(v)': 0.12690234,
'reach(v)': 0.12688449,
'harm(v)': 0.12684177,
'ignore(v)': 0.12661307,
'fax(v)': 0.12621567,
'visit(v)': 0.12614185,
'watch(v)': 0.12592833,
'raise(v)': 0.12573499,
'praise(v)': 0.12566065,
'acquire(v)': 0.1256542,
'attack(v)': 0.12553007,
'maintain(v)': 0.12550837,
'enjoy(v)': 0.12481521,
'get(v)': 0.12475878,
'bring(v)': 0.12445205,
'examine(v)': 0.124423206,
'adopt(v)': 0.124318846,
'learn(v)': 0.124175765,
'support(v)': 0.123789914,
'create(v)': 0.1237012,
'wear(v)': 0.1236859,
'draw(v)': 0.123618305,
'save(v)': 0.12331879,
'touch(v)': 0.123285175,
'hold(v)': 0.12292933,
'wash(v)': 0.12291669,
'form(v)': 0.1228429,
'call(v)': 0.12277459,
'abandon(v)': 0.12269186,
'conquer(v)': 0.12225867,
'break(v)': 0.12207036,
'achieve(v)': 0.12200624,
'express(v)': 0.121937,
'grace(v)': 0.12128936,
'photocopy(v)': 0.12128936,
'hamstring(v)': 0.12128936,
'propagandize(v)': 0.12128936,
'dent(v)': 0.12128936,
'ram(v)': 0.12128936,
'punish(v)': 0.12117506,
'catch(v)': 0.121171884,
'sing(v)': 0.12095919,
'commit(v)': 0.12080016,
'launch(v)': 0.12079612,
'face(v)': 0.12077648,
'embrace(v)': 0.120770455,
'email(v)': 0.120759875,
'crush(v)': 0.120691695,
'fool(v)': 0.1204755,
'fight(v)': 0.12028226,
'carry(v)': 0.12027377,
'negotiate(v)': 0.12026602,
'ease(v)': 0.120245144,
'report(v)': 0.12011913,
'retain(v)': 0.12007913,
'drink(v)': 0.11996957,
'judge(v)': 0.11990381,
'kill(v)': 0.119862914,
'earn(v)': 0.11969809,
'arrange(v)': 0.1196521,
'exceed(v)': 0.11959341,
'cut(v)': 0.119565286,
'answer(v)': 0.11947831,
'test(v)': 0.11943953,
'fire(v)': 0.11902211,
'ink(v)': 0.11902016,
'demilitarize(v)': 0.11902016,
'forget(v)': 0.11900471,
'overtake(v)': 0.11897896,
'blockade(v)': 0.11897896,
'miss(v)': 0.11888397,
'approve(v)': 0.11881238,
'disregard(v)': 0.11861621,
'gain(v)': 0.118614175,
'appreciate(v)': 0.11845133,
'offer(v)': 0.118398845,
'establish(v)': 0.118243404,
'dwarf(v)': 0.11819621,
'kiss(v)': 0.118117206,
'encounter(v)': 0.11800321,
'discuss(v)': 0.11795633,
'capture(v)': 0.117900446,
'experience(v)': 0.117856644,
'authorize(v)': 0.11770744,
'disrupt(v)': 0.11763384,
'perform(v)': 0.1174231,
'jeopardize(v)': 0.117413454,
'hurt(v)': 0.11720627,
'voice(v)': 0.11698146,
'lend(v)': 0.116832286,
'hit(v)': 0.116638504,
'arrest(v)': 0.11662185,
'admire(v)': 0.116519734,
'skip(v)': 0.116314136,
'formulate(v)': 0.11588326,
'exhaust(v)': 0.115719944,
'dismiss(v)': 0.115553394,
'identify(v)': 0.11547131,
'sue(v)': 0.115242765,
'eat(v)': 0.11516738,
'baptize(v)': 0.115158715,
'notice(v)': 0.11511737,
'handle(v)': 0.11509491,
'challenge(v)': 0.11504606,
'announce(v)': 0.11503767,
'defend(v)': 0.1150032,
'investigate(v)': 0.114916705,
'forecast(v)': 0.11485958,
'condemn(v)': 0.11478644,
'grab(v)': 0.11468903,
'destroy(v)': 0.11465007,
'nurse(v)': 0.114644416,
'exercise(v)': 0.11461156,
'reject(v)': 0.11443595,
'oppose(v)': 0.11440322,
'outstrip(v)': 0.114139885,
'seize(v)': 0.11406061,
'change(v)': 0.11392137,
'alter(v)': 0.113821924,
'issue(v)': 0.11367013,
'interview(v)': 0.11361641,
'mobilize(v)': 0.113607966,
'contradict(v)': 0.11359411,
'regain(v)': 0.11356676,
'criticize(v)': 0.113502875,
'attract(v)': 0.11341938,
'block(v)': 0.113331586,
'love(v)': 0.113249026,
'explore(v)': 0.11322978,
'suffer(v)': 0.112960555,
'generate(v)': 0.112933956,
'market(v)': 0.11284555,
'lower(v)': 0.11274787,
'reverse(v)': 0.1127204,
'evacuate(v)': 0.11271909,
'build(v)': 0.112654515,
'reap(v)': 0.11245757,
'pass(v)': 0.11242736,
'outpace(v)': 0.11240831,
'surpass(v)': 0.112212755,
'betray(v)': 0.112077296,
'dump(v)': 0.112051435,
'confess(v)': 0.11201897,
'insult(v)': 0.11198807,
'stone(v)': 0.111879654,
'stage(v)': 0.11170269,
'trigger(v)': 0.11169759,
'organize(v)': 0.1116803,
'terminate(v)': 0.11165232,
'release(v)': 0.11155165,
'deliver(v)': 0.11155092,
'bless(v)': 0.11150154,
'attend(v)': 0.11149091,
'purchase(v)': 0.111472294,
'check(v)': 0.111409344,
'reiterate(v)': 0.11131384,
'signal(v)': 0.11130763,
'celebrate(v)': 0.111232475,
'debate(v)': 0.11121669,
'spawn(v)': 0.11120651,
'squander(v)': 0.11115875,
'resist(v)': 0.11110452,
'invade(v)': 0.11095561,
'tighten(v)': 0.11091532,
'have(v)': 0.11089933,
'consume(v)': 0.11082247,
'anticipate(v)': 0.11071335,
'supply(v)': 0.110660434,
'lift(v)': 0.11041043,
'heal(v)': 0.110269606,
'dangle(v)': 0.11015054,
'eliminate(v)': 0.11010447,
'hate(v)': 0.110098645,
'repay(v)': 0.10991934,
'conduct(v)': 0.10991062,
'obtain(v)': 0.10989465,
'assemble(v)': 0.109781996,
'post(v)': 0.109632224,
'occupy(v)': 0.10951081,
'undermine(v)': 0.109407954,
'outweigh(v)': 0.109365754,
'air(v)': 0.109365635,
'uphold(v)': 0.10936201,
'clear(v)': 0.10917757,
'train(v)': 0.10913862,
'resume(v)': 0.10908456,
'treasure(v)': 0.10891289,
'tolerate(v)': 0.10875869,
'shun(v)': 0.108713255,
'rent(v)': 0.1086359,
'tap(v)': 0.1085708,
'dominate(v)': 0.10851838,
'swing(v)': 0.10851838,
'review(v)': 0.10847208,
'hug(v)': 0.1084619,
'suspend(v)': 0.10844558,
'pick(v)': 0.1083817,
'witness(v)': 0.108328156,
'mention(v)': 0.108281136,
'demonstrate(v)': 0.108214595,
'choose(v)': 0.108171426,
'obscure(v)': 0.1080878,
'reinvigorate(v)': 0.10802038,
'keep(v)': 0.1079195,
'escape(v)': 0.10785155,
'rub(v)': 0.107825354,
'embarrass(v)': 0.10779031,
'unveil(v)': 0.10771204,
'swallow(v)': 0.107690886,
'fetch(v)': 0.10767525,
'toss(v)': 0.10735152,
'observe(v)': 0.10731941,
'violate(v)': 0.107303955,
'brave(v)': 0.10717108,
'ruin(v)': 0.10703386,
'control(v)': 0.106917225,
'applaud(v)': 0.10676809,
'dislike(v)': 0.10676809,
'seek(v)': 0.106708944,
'trail(v)': 0.10662195,
'assess(v)': 0.10647424,
'describe(v)': 0.1064375,
'cancel(v)': 0.106339574,
'use(v)': 0.106049694,
'constitute(v)': 0.105993785,
'clinch(v)': 0.105978705,
'benefit(v)': 0.10586132,
'mark(v)': 0.10567023,
'defeat(v)': 0.105657876,
'repeat(v)': 0.10552439,
'recall(v)': 0.10542381,
'discover(v)': 0.1053045,
'fulfill(v)': 0.105231896,
'collect(v)': 0.105115295,
'delay(v)': 0.1050282,
'leak(v)': 0.10501779,
'analyze(v)': 0.10501385,
'pursue(v)': 0.10488942,
'detect(v)': 0.10488935,
'cement(v)': 0.10488935,
'hire(v)': 0.10484321,
'undertake(v)': 0.10476205,
'represent(v)': 0.10451811,
'startle(v)': 0.1043251,
'deploy(v)': 0.10423987,
'develop(v)': 0.104113884,
'run(v)': 0.10407388,
'track(v)': 0.10406135,
'spark(v)': 0.10401418,
'waive(v)': 0.10396231,
'comfort(v)': 0.10370506,
'open(v)': 0.103690326,
'employ(v)': 0.103618086,
'lag(v)': 0.10358891,
'discipline(v)': 0.103580266,
'affect(v)': 0.103383094,
'study(v)': 0.103370115,
'flee(v)': 0.10320357,
'greet(v)': 0.10309046,
'best(v)': 0.10307448,
'confiscate(v)': 0.10303024,
'light(v)': 0.10299687,
'crack(v)': 0.102904245,
'justify(v)': 0.10290169,
'charge(v)': 0.10277614,
'hide(v)': 0.10266754,
'regret(v)': 0.1024964,
'congratulate(v)': 0.10240552,
'intercept(v)': 0.10224706,
'cite(v)': 0.102227,
'complicate(v)': 0.10213496,
'tour(v)': 0.10210583,
'risk(v)': 0.10208539,
'drive(v)': 0.10206548,
'exaggerate(v)': 0.10206395,
'tease(v)': 0.10206395,
'submit(v)': 0.10186192,
'target(v)': 0.10183784,
'possess(v)': 0.10175833,
'welcome(v)': 0.10175774,
'cross(v)': 0.1015426,
'impose(v)': 0.10146494,
'diagnose(v)': 0.10140683,
'reward(v)': 0.10132812,
'prefer(v)': 0.101313465,
'predict(v)': 0.10124724,
'bite(v)': 0.100928895,
'shoot(v)': 0.10086819,
'annoy(v)': 0.10086819,
'ban(v)': 0.100775615,
'boost(v)': 0.100715235,
'wave(v)': 0.100555584,
'lack(v)': 0.100555,
'ride(v)': 0.10018418,
'shake(v)': 0.10012631,
'dismantle(v)': 0.09988906,
'secure(v)': 0.09988818,
'unload(v)': 0.09977674,
'determine(v)': 0.099701345,
'update(v)': 0.09963054,
'miscalculate(v)': 0.09962862,
'cast(v)': 0.09932812,
'despise(v)': 0.09923248,
'relieve(v)': 0.0992162,
'storm(v)': 0.09906327,
'hitch(v)': 0.09903234,
'monopolize(v)': 0.09903234,
'befriend(v)': 0.09903234,
'restructure(v)': 0.099032335,
'upset(v)': 0.09901172,
'succeed(v)': 0.09895107,
'carve(v)': 0.09884804,
'absorb(v)': 0.098702304,
'initiate(v)': 0.098652154,
'reopen(v)': 0.0986273,
'invent(v)': 0.09858164,
'cause(v)': 0.0984169,
'evaluate(v)': 0.09830324,
'jolt(v)': 0.098235145,
'advocate(v)': 0.09813229,
'dig(v)': 0.098016605,
'activate(v)': 0.098016605,
'spell(v)': 0.09790938,
'operate(v)': 0.09789381,
'shed(v)': 0.09778503,
'finish(v)': 0.09739929,
'cheer(v)': 0.09725847,
'abuse(v)': 0.09708898,
'breed(v)': 0.09703324,
'complete(v)': 0.09702037,
'burn(v)': 0.09680734,
'enforce(v)': 0.09679971,
'restrict(v)': 0.09663863,
'squirm(v)': 0.09653643,
'ratify(v)': 0.09646499,
'rattle(v)': 0.096307665,
'reshape(v)': 0.09626631,
'print(v)': 0.095917605,
'encourage(v)': 0.09578677,
'emphasize(v)': 0.095725134,
'depress(v)': 0.095711336,
'book(v)': 0.095705055,
'uproot(v)': 0.095496275,
'cherish(v)': 0.095496275,
'institute(v)': 0.095496275,
'blow(v)': 0.09547208,
'favor(v)': 0.095427476,
'overwhelm(v)': 0.09529878,
'precipitate(v)': 0.09526318,
'practice(v)': 0.09524932,
'select(v)': 0.09516064,
'invoke(v)': 0.09509008,
'hinder(v)': 0.09509008,
'nail(v)': 0.09505889,
'amplify(v)': 0.09499216,
'stimulate(v)': 0.09490639,
'tear_down(v)': 0.09474014,
'interpret(v)': 0.094726145,
'push(v)': 0.09472243,
'reduce(v)': 0.094707884,
'resolve(v)': 0.09469555,
'deprive(v)': 0.09467996,
'order(v)': 0.09465839,
'chortle(v)': 0.09451601,
'tenure(v)': 0.09451601,
'misread(v)': 0.09451601,
'laminate(v)': 0.09451601,
'tickle(v)': 0.09451601,
'blunt(v)': 0.09451601,
'impugn(v)': 0.09451601,
'fondle(v)': 0.09451601,
'hose(v)': 0.09451601,
'preview(v)': 0.09451601,
'massacre(v)': 0.09451601,
'ingest(v)': 0.09451601,
'rework(v)': 0.09451601,
'rehire(v)': 0.09451601,
'enrage(v)': 0.09451601,
'overrun(v)': 0.09451601,
'wet(v)': 0.09451601,
'chuck(v)': 0.09451601,
'overplay(v)': 0.09451601,
'hoe(v)': 0.09451601,
'hemorrhage(v)': 0.09451601,
'console(v)': 0.09451601,
'rebuke(v)': 0.09451601,
'buff(v)': 0.09451601,
'evict(v)': 0.09451601,
'squeegee(v)': 0.09451601,
'segregate(v)': 0.09451601,
'comb(v)': 0.09451601,
'repudiate(v)': 0.09451601,
'muddy(v)': 0.09451601,
'subcontract(v)': 0.09451601,
'beguile(v)': 0.09451601,
'romance(v)': 0.09451601,
'reuse(v)': 0.09451601,
'dung(v)': 0.09451601,
'effectuate(v)': 0.09451601,
'jingle(v)': 0.09451601,
'libel(v)': 0.09451601,
'reshuffle(v)': 0.09451601,
'feign(v)': 0.09451601,
'stomach(v)': 0.09451601,
'traipse(v)': 0.09451601,
'relive(v)': 0.09451601,
'encompass(v)': 0.09451601,
'incise(v)': 0.09451601,
'repackage(v)': 0.09451601,
'liquidize(v)': 0.09451601,
'brook(v)': 0.09451601,
'spearhead(v)': 0.09451601,
'elude(v)': 0.09451601,
'blindfold(v)': 0.09451601,
'junk(v)': 0.09451601,
'obviate(v)': 0.09451601,
'redouble(v)': 0.09451601,
'hem(v)': 0.09451601,
'disadvantage(v)': 0.09451601,
'mince(v)': 0.09451601,
'recant(v)': 0.09451601,
'demean(v)': 0.09451601,
'butter(v)': 0.09451601,
'shadow(v)': 0.09451601,
'pet(v)': 0.09451601,
'skew(v)': 0.09451601,
'copy(v)': 0.094516,
'court(v)': 0.09449285,
'derail(v)': 0.09406113,
'captivate(v)': 0.09406113,
'master(v)': 0.09405362,
'press(v)': 0.09398268,
'neglect(v)': 0.09396165,
'parallel(v)': 0.09394009,
'quit(v)': 0.093805485,
'set(v)': 0.093591005,
'rape(v)': 0.093515836,
'broadcast(v)': 0.093450435,
'direct(v)': 0.09344945,
'ascend(v)': 0.093349144,
'incorporate(v)': 0.093024686,
'flush(v)': 0.09296914,
'denounce(v)': 0.09289818,
'comprehend(v)': 0.092892215,
'rediscover(v)': 0.092892215,
'recount(v)': 0.092892215,
'recoup(v)': 0.092881225,
'telephone(v)': 0.09282348,
'prescribe(v)': 0.09262171,
'disappoint(v)': 0.09261698,
'reestablish(v)': 0.09261698,
'survive(v)': 0.09245232,
'impact(v)': 0.092430935,
'cultivate(v)': 0.09239243,
'cede(v)': 0.09236525,
'exacerbate(v)': 0.09218228,
'explain(v)': 0.092169106,
'confront(v)': 0.09210479,
'overcome(v)': 0.0920563,
'regulate(v)': 0.09204936,
'guide(v)': 0.092005044,
'hang(v)': 0.091633305,
'implement(v)': 0.09147218,
'pitch(v)': 0.09129389,
'confirm(v)': 0.09126704,
'treat(v)': 0.09125096,
'gather(v)': 0.09125073,
'foresee(v)': 0.09124051,
'improve(v)': 0.09104833,
'mutilate(v)': 0.09101541,
'usurp(v)': 0.090948075,
'convey(v)': 0.090836846,
'revolutionize(v)': 0.090799734,
'relinquish(v)': 0.09077275,
'drain(v)': 0.09075474,
'shut(v)': 0.0906835,
'exclude(v)': 0.090626456,
'count(v)': 0.09041187,
'surprise(v)': 0.09040375,
'sacrifice(v)': 0.09034819,
'defy(v)': 0.09029275,
'liberalize(v)': 0.09029275,
'surrender(v)': 0.09021925,
'contest(v)': 0.08993448,
'solicit(v)': 0.08981499,
'reveal(v)': 0.08977896,
'assume(v)': 0.08974168,
'offend(v)': 0.08972937,
'reclaim(v)': 0.08971482,
'seal(v)': 0.089703135,
'attain(v)': 0.08968435,
'define(v)': 0.089505196,
'turn(v)': 0.08946821,
'deserve(v)': 0.08937833,
'cover(v)': 0.08934723,
'distort(v)': 0.08930924,
'solidify(v)': 0.089265116,
'unlock(v)': 0.089265116,
'pronounce(v)': 0.089234226,
'reinforce(v)': 0.089212954,
'scrap(v)': 0.08915578,
'feature(v)': 0.08910607,
'threaten(v)': 0.08899585,
'repeal(v)': 0.088927,
'blackmail(v)': 0.088927,
'execute(v)': 0.08890428,
'extend(v)': 0.088882506,
'own(v)': 0.08885567,
'display(v)': 0.08882754,
'shout(v)': 0.08861147,
'refine(v)': 0.08860488,
'scan(v)': 0.08860488,
'strip(v)': 0.08858513,
'scrutinize(v)': 0.08855878,
'exhibit(v)': 0.08838997,
'mourn(v)': 0.088297926,
'vote(v)': 0.08826369,
'affirm(v)': 0.08819827,
'tape(v)': 0.08800533,
'solve(v)': 0.08784311,
'publish(v)': 0.087680236,
'broaden(v)': 0.08767337,
'butcher(v)': 0.08766871,
'police(v)': 0.08766871,
'halve(v)': 0.08766871,
'boo(v)': 0.08766871,
'stifle(v)': 0.08760388,
'highlight(v)': 0.087472126,
'isolate(v)': 0.0874112,
'remedy(v)': 0.08732458,
'endorse(v)': 0.08732367,
'bully(v)': 0.08731014,
'endanger(v)': 0.08718839,
'redo(v)': 0.08717577,
'research(v)': 0.08714063,
'shave(v)': 0.08690594,
'strengthen(v)': 0.08676723,
'export(v)': 0.08671516,
'protest(v)': 0.08663967,
'forsake(v)': 0.08654841,
'discourage(v)': 0.086424515,
'uncover(v)': 0.08637872,
'kick(v)': 0.08620641,
'fear(v)': 0.08615478,
'reaffirm(v)': 0.086001486,
'ridicule(v)': 0.086001486,
'milk(v)': 0.086001486,
'acknowledge(v)': 0.0857829,
'pioneer(v)': 0.08576453,
'knit(v)': 0.08576453,
'adore(v)': 0.08576453,
'exert(v)': 0.08576316,
'aggravate(v)': 0.08574666,
'limit(v)': 0.08562346,
'process(v)': 0.08557161,
'pump_out(v)': 0.08547847,
'kidnap(v)': 0.08528532,
'cheat(v)': 0.085281685,
'twist(v)': 0.08520597,
'weaken(v)': 0.085175015,
'boycott(v)': 0.084889375,
'conclude(v)': 0.08481308,
'strike(v)': 0.08480219,
'realize(v)': 0.08463189,
'downplay(v)': 0.084537685,
'scatter(v)': 0.084455066,
'promulgate(v)': 0.08431755,
'monitor(v)': 0.08421774,
'lick(v)': 0.08415996,
'excise(v)': 0.08415996,
'fracture(v)': 0.08415996,
'bust(v)': 0.08415996,
'popularize(v)': 0.08415996,
'excite(v)': 0.08415996,
'lavish(v)': 0.08415996,
'transcribe(v)': 0.08415996,
'drop(v)': 0.08415112,
'undo(v)': 0.08413084,
'settle(v)': 0.08406136,
'fold(v)': 0.08405797,
'compensate(v)': 0.08404493,
'visualize(v)': 0.08401423,
'donate(v)': 0.08367635,
'spot(v)': 0.0834482,
'found(v)': 0.08344216,
'recite(v)': 0.083355285,
'project(v)': 0.08326598,
'illustrate(v)': 0.083136596,
'demolish(v)': 0.08297218,
'sidestep(v)': 0.08288844,
'stuff(v)': 0.08288844,
'mortgage(v)': 0.08288844,
'tackle(v)': 0.082875,
'frame(v)': 0.082847364,
'freeze(v)': 0.082701504,
'untie(v)': 0.08269373,
'reconsider(v)': 0.08261399,
'rehearse(v)': 0.0823548,
'shock(v)': 0.0823548,
'refurbish(v)': 0.08226393,
'echo(v)': 0.082250275,
'starve(v)': 0.08210329,
'google(v)': 0.08209618,
'allay(v)': 0.08203219,
'free(v)': 0.08196514,
'restore(v)': 0.08191895,
'incur(v)': 0.081905514,
'broach(v)': 0.08168467,
'brush(v)': 0.08168467,
'energize(v)': 0.08168467,
'down(v)': 0.08168467,
'discard(v)': 0.08164716,
'resurrect(v)': 0.08140666,
'tempt(v)': 0.081254646,
'sustain(v)': 0.08098296,
'wrap(v)': 0.080951214,
'exterminate(v)': 0.08084273,
'remove(v)': 0.080823064,
'underestimate(v)': 0.08074477,
'frighten(v)': 0.08058385,
'rewrite(v)': 0.08045926,
'haul(v)': 0.08045926,
'auction(v)': 0.08044647,
'sense(v)': 0.08043851,
'manage(v)': 0.08042365,
'contain(v)': 0.08019453,
'prove(v)': 0.08010403,
'eye(v)': 0.08003022,
'skirt(v)': 0.08003022,
'reassess(v)': 0.08003022,
'contact(v)': 0.07981486,
'expand(v)': 0.07942401,
'divest(v)': 0.07938599,
'imagine(v)': 0.07936946,
'scratch(v)': 0.079346776,
'shape(v)': 0.07932693,
'nominate(v)': 0.07928889,
'pack(v)': 0.07916013,
'smash(v)': 0.07916013,
'halt(v)': 0.07915121,
'enact(v)': 0.07907777,
'certify(v)': 0.079058394,
'steal(v)': 0.07901219,
'protect(v)': 0.078946196,
'bury(v)': 0.07890884,
'murder(v)': 0.078876466,
'evoke(v)': 0.0788318,
'divorce(v)': 0.07882167,
'row(v)': 0.07882166,
'modulate(v)': 0.07882166,
'detonate(v)': 0.07882166,
'club(v)': 0.07882166,
'know(v)': 0.07880116,
'pedal(v)': 0.07876334,
'cushion(v)': 0.07864207,
'command(v)': 0.07863488,
'belittle(v)': 0.07860111,
'distinguish(v)': 0.07860111,
'shell(v)': 0.07860111,
'admit(v)': 0.07842651,
'commission(v)': 0.07838428,
'befall(v)': 0.078275636,
'waste(v)': 0.07826325,
'thwart(v)': 0.07814798,
'portray(v)': 0.07810187,
'advise(v)': 0.07810187,
'imply(v)': 0.07804383,
'underscore(v)': 0.07793126,
'notify(v)': 0.07792033,
'haunt(v)': 0.07781113,
'promote(v)': 0.07765449,
'wield(v)': 0.07744473,
'penetrate(v)': 0.077352904,
'integrate(v)': 0.07723048,
'downgrade(v)': 0.077171996,
'don(v)': 0.077171996,
'chill(v)': 0.077171996,
'modify(v)': 0.07709674,
'file(v)': 0.07708267,
'split(v)': 0.077052504,
'phone(v)': 0.077049196,
'underperform(v)': 0.077013046,
'coextrude(v)': 0.077013046,
'reapportion(v)': 0.077013046,
'fling(v)': 0.077013046,
'sequester(v)': 0.077013046,
'emboss(v)': 0.077013046,
'scotch(v)': 0.077013046,
'convoke(v)': 0.077013046,
'unnerve(v)': 0.077013046,
'destigmatize(v)': 0.077013046,
'berate(v)': 0.077013046,
'outleap(v)': 0.077013046,
'mash(v)': 0.077013046,
'dishonor(v)': 0.077013046,
'bruise(v)': 0.077013046,
'refute(v)': 0.077013046,
'parry(v)': 0.077013046,
'secrete(v)': 0.077013046,
'illuminate(v)': 0.077013046,
'shag(v)': 0.077013046,
'skin(v)': 0.077013046,
'pry(v)': 0.077013046,
'deactivate(v)': 0.077013046,
'sugarcoat(v)': 0.077013046,
'disavow(v)': 0.077013046,
'shampoo(v)': 0.077013046,
'beget(v)': 0.077013046,
'coo(v)': 0.077013046,
'shutter(v)': 0.077013046,
'decry(v)': 0.077013046,
'bumble(v)': 0.077013046,
'swipe(v)': 0.077013046,
'outshine(v)': 0.077013046,
'disassemble(v)': 0.077013046,
'slit(v)': 0.077013046,
'ensnare(v)': 0.077013046,
'engender(v)': 0.077013046,
'desecrate(v)': 0.077013046,
'devour(v)': 0.077013046,
'abhor(v)': 0.077013046,
'decapitate(v)': 0.077013046,
'deplore(v)': 0.077013046,
'verbalize(v)': 0.077013046,
'astound(v)': 0.077013046,
'typify(v)': 0.077013046,
'option(v)': 0.077013046,
'floor(v)': 0.077013046,
'spice(v)': 0.077013046,
'nickel(v)': 0.077013046,
'concoct(v)': 0.077013046,
'recruit(v)': 0.07679446,
'devise(v)': 0.07678125,
'revive(v)': 0.076773174,
'tout(v)': 0.07673409,
'cart(v)': 0.07673408,
'supersede(v)': 0.07671012,
'spurn(v)': 0.07671012,
'record(v)': 0.07670176,
'disparage(v)': 0.07670029,
'back(v)': 0.076469936,
'repair(v)': 0.07616385,
'anger(v)': 0.07613789,
'top(v)': 0.07611526,
'deter(v)': 0.07611526,
'scuttle(v)': 0.0757294,
'revamp(v)': 0.0757294,
'chase(v)': 0.07571136,
'claim(v)': 0.075532265,
'transcend(v)': 0.075496435,
'whip(v)': 0.075496435,
'relay(v)': 0.075416684,
'warrant(v)': 0.075416684,
'recover(v)': 0.07538148,
'depict(v)': 0.07531361,
'intimidate(v)': 0.07527496,
'construct(v)': 0.07494726,
'avoid(v)': 0.07493459,
'reassign(v)': 0.074779526,
'envy(v)': 0.07471363,
'lay(v)': 0.07437402,
'postpone(v)': 0.07434472,
'hijack(v)': 0.074313775,
'paste(v)': 0.07425879,
'cease(v)': 0.074105844,
'erase(v)': 0.07409359,
'foil(v)': 0.07409359,
'utilize(v)': 0.07396221,
'roam(v)': 0.07391025,
'elevate(v)': 0.073882796,
'tie(v)': 0.073881276,
'compound(v)': 0.07378762,
'disturb(v)': 0.07376899,
'fix_up(v)': 0.07373368,
'persecute(v)': 0.07368273,
'rebuild(v)': 0.0735921,
'curb(v)': 0.073579096,
'interrogate(v)': 0.07351245,
'withdraw(v)': 0.07351245,
'recreate(v)': 0.07351245,
'foster(v)': 0.07346084,
'register(v)': 0.073456064,
'measure(v)': 0.07338776,
'host(v)': 0.07324231,
'expose(v)': 0.07321055,
'mishandle(v)': 0.07306099,
'crowd(v)': 0.07299241,
'stress(v)': 0.07298527,
'leverage(v)': 0.07288466,
'exempt(v)': 0.07288466,
'mimic(v)': 0.072770484,
'specify(v)': 0.07269106,
'stop(v)': 0.07250169,
'clean(v)': 0.0723939,
'exchange(v)': 0.07220252,
'melt(v)': 0.07218839,
'hamper(v)': 0.07201359,
'believe(v)': 0.07200386,
'eclipse(v)': 0.0717835,
'overstay(v)': 0.0717835,
'slam(v)': 0.0717835,
'requisition(v)': 0.0717835,
'inhibit(v)': 0.0717835,
'scout(v)': 0.0717835,
'slay(v)': 0.0717835,
'install(v)': 0.07174191,
'dry(v)': 0.07153256,
'wreck(v)': 0.07144739,
'rejoin(v)': 0.07144739,
'reintroduce(v)': 0.07144739,
'condition(v)': 0.07144739,
'chant(v)': 0.07144739,
'roil(v)': 0.07144739,
'buttress(v)': 0.07144739,
'double(v)': 0.07131755,
'enlist(v)': 0.07117872,
'harass(v)': 0.07117872,
'photograph(v)': 0.07117872,
'honor(v)': 0.071125806,
'shorten(v)': 0.071103536,
'widen(v)': 0.071073555,
'battle(v)': 0.070901245,
'chair(v)': 0.070854776,
'utter(v)': 0.07084702,
'slash(v)': 0.07078575,
'rest(v)': 0.0707404,
'pose(v)': 0.07059809,
'commercialize(v)': 0.07054585,
'plant(v)': 0.07052846,
'disseminate(v)': 0.07044807,
'spend(v)': 0.07040865,
'separate(v)': 0.070171654,
'start(v)': 0.07013512,
'make(v)': 0.07011076,
'repatriate(v)': 0.07001186,
'curtail(v)': 0.07001186,
'shackle(v)': 0.07001186,
'navigate(v)': 0.07001186,
'influence(v)': 0.07000109,
'dispatch(v)': 0.069932446,
'scare(v)': 0.06980326,
'add(v)': 0.069798805,
'unify(v)': 0.069740035,
'reinstate(v)': 0.06966092,
'consult(v)': 0.069634766,
'view(v)': 0.06953756,
'inherit(v)': 0.06946274,
'prevent(v)': 0.06938901,
'revise(v)': 0.0693388,
'crucify(v)': 0.06930821,
'outrank(v)': 0.06930821,
'degrade(v)': 0.06930821,
'net(v)': 0.06930821,
'outsell(v)': 0.06930821,
'loathe(v)': 0.06930821,
'mismanage(v)': 0.06930821,
'hoard(v)': 0.06930821,
'exalt(v)': 0.06930821,
'render(v)': 0.06913671,
'disclose(v)': 0.069121056,
'increase(v)': 0.069077194,
'correct(v)': 0.0690737,
'promise(v)': 0.06898892,
'craft(v)': 0.068933174,
'revoke(v)': 0.068933174,
'wound(v)': 0.06891861,
'invigorate(v)': 0.06888256,
'bombard(v)': 0.06888256,
'vent(v)': 0.06888256,
'combine(v)': 0.06883316,
'revalue(v)': 0.06871632,
'reassert(v)': 0.06871632,
'string(v)': 0.06871632,
'renounce(v)': 0.06871632,
'remodel(v)': 0.06871632,
'herald(v)': 0.06871632,
'renew(v)': 0.068622396,
'calculate(v)': 0.068487495,
'like(v)': 0.06843005,
'purge(v)': 0.068261564,
'prioritize(v)': 0.06807055,
'prohibit(v)': 0.06789021,
'lure(v)': 0.06788277,
'concede(v)': 0.06780281,
'frequent(v)': 0.06767813,
'need(v)': 0.06747856,
'fuck(v)': 0.06747809,
'author(v)': 0.06747809,
'replace(v)': 0.067448504,
'fly(v)': 0.06740709,
'deposit(v)': 0.06736124,
'weigh(v)': 0.06721934,
'draft(v)': 0.067193,
'prompt(v)': 0.06715888,
'coordinate(v)': 0.066890426,
'tail(v)': 0.06683291,
'finesse(v)': 0.06683291,
'bemoan(v)': 0.06683291,
'whipsaw(v)': 0.06683291,
'flex(v)': 0.06683291,
'lash(v)': 0.06683291,
'remake(v)': 0.06683291,
'redraw(v)': 0.06683291,
'enshrine(v)': 0.06683291,
'source(v)': 0.06683291,
'overhang(v)': 0.06683291,
'barricade(v)': 0.06683291,
'ravage(v)': 0.06683291,
'sour(v)': 0.06683291,
'shelve(v)': 0.06683291,
'ogle(v)': 0.06683291,
'torpedo(v)': 0.06683291,
'decree(v)': 0.06683291,
'mint(v)': 0.06683291,
'repress(v)': 0.06683291,
'reimpose(v)': 0.06683291,
'rig(v)': 0.06683291,
'pain(v)': 0.06683291,
'shirk(v)': 0.06683291,
'stalk(v)': 0.06683291,
'hike(v)': 0.06683291,
'enumerate(v)': 0.06683291,
'stock(v)': 0.06683291,
'rake(v)': 0.06683291,
'distribute(v)': 0.06678329,
'encode(v)': 0.06669525,
'mistreat(v)': 0.06669525,
'mute(v)': 0.06669525,
...}
Using the grammar to comprehend new utterances
Our agent can now use its pretrained grammar to comprehend new utterances. Below, we instruct our agent to comprehend the passive utterance “Margaret Thatcher was elected Prime Minister of Britain.” The resulting meaning representation reveals that, alongside a be.01 frame for the auxiliary “was”, the agent identified a semantic frame that instantiates the elect.01 PropBank roleset (elect someone to an office or position). The agent also understood that the roles of candidate (arg1) and office or position (arg2) in this instance of elect.01 are taken up by “Margaret Thatcher” and “Prime Minister of Britain” respectively.
[13]:
propbank_agent.comprehend("Margaret Thatcher was elected Prime Minister of Britain.")
[13]:
[{'roleset': 'be.01',
'roles': [('v', 'was', [2]),
('arg1', 'Margaret Thatcher', [0, 1]),
('arg2', 'elected Prime Minister of Britain', [3, 4, 5, 6, 7])]},
{'roleset': 'elect.01',
'roles': [('v', 'elected', [3]),
('arg1', 'Margaret Thatcher', [0, 1]),
('arg2', 'Prime Minister of Britain', [4, 5, 6, 7])]}]
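Judging from the output above, comprehend returns a list of dictionaries, each with a 'roleset' key and a 'roles' list of (label, filler, token indices) tuples. The following sketch shows one way such a result could be post-processed. The frames are copied verbatim from the output above so that the snippet runs on its own, and the helper names find_frame and roles_as_dict are hypothetical; they are not part of pyfcg.

```python
# Frames copied from the output above; in the notebook they would be the
# return value of propbank_agent.comprehend(...).
frames = [
    {'roleset': 'be.01',
     'roles': [('v', 'was', [2]),
               ('arg1', 'Margaret Thatcher', [0, 1]),
               ('arg2', 'elected Prime Minister of Britain', [3, 4, 5, 6, 7])]},
    {'roleset': 'elect.01',
     'roles': [('v', 'elected', [3]),
               ('arg1', 'Margaret Thatcher', [0, 1]),
               ('arg2', 'Prime Minister of Britain', [4, 5, 6, 7])]},
]


def find_frame(frames, roleset):
    """Return the first frame with the given roleset, or None (hypothetical helper)."""
    return next((frame for frame in frames if frame['roleset'] == roleset), None)


def roles_as_dict(frame):
    """Map each role label to its filler string, dropping the token indices."""
    return {label: text for label, text, _indices in frame['roles']}


elect = roles_as_dict(find_frame(frames, 'elect.01'))
print(elect['arg1'], '->', elect['arg2'])
```

Looking frames up by roleset rather than by position avoids relying on the order in which frames are returned.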
To enhance human readability, we can choose to activate an FCG monitor to trace the comprehension process in the web interface:
[14]:
fcg.start_web_interface()
fcg.activate_monitor('trace-fcg')
propbank_agent.comprehend("She especially enjoyed visiting the old historic churches.")
[14]:
[{'roleset': 'enjoy.01',
'roles': [('v', 'enjoyed', [2]),
('arg0', 'She', [0]),
('arg1', 'visiting the old historic churches', [3, 4, 5, 6, 7])]},
{'roleset': 'visit.01',
'roles': [('v', 'visiting', [3]),
('arg0', 'She', [0]),
('arg1', 'the old historic churches', [4, 5, 6, 7])]}]
In order to better understand the PropBank rolesets that are retrieved by our agent, we define a new function describe_roleset. The function makes use of nltk’s propbank module to look up all roles of a given roleset, together with their descriptions:
[15]:
import nltk
nltk.download('propbank')
from nltk.corpus import propbank
[nltk_data] Downloading package propbank to
[nltk_data] /Users/katrien/nltk_data...
[nltk_data] Package propbank is already up-to-date!
[16]:
def describe_roleset(roleset):
    """Print the id of a PropBank roleset, followed by its roles and their descriptions."""
    nltk_roleset = propbank.roleset(roleset)
    print(nltk_roleset.attrib['id'])
    for role in nltk_roleset.findall("roles/role"):
        print(' arg' + role.attrib['n'] + ':', role.attrib['descr'])

describe_roleset('elect.01')
elect.01
arg0: voters
arg1: candidate
arg2: office or position
[17]:
display(propbank_agent.comprehend("Attention passengers, the taxi is arriving at Gate 1."))
describe_roleset('arrive.01')
[{'roleset': 'arrive.01',
'roles': [('v', 'arriving', [6]),
('arg1', 'the taxi', [3, 4]),
('arg4', 'at Gate 1', [7, 8, 9])]},
{'roleset': 'be.01',
'roles': [('v', 'is', [5]),
('arg1', 'the taxi', [3, 4]),
('arg2', 'arriving at Gate 1', [6, 7, 8, 9])]}]
arrive.01
arg1: entity in motion / 'comer'
arg2: extent -- rare)
arg3: start point -- also rare)
arg4: end point, destination
[18]:
display(propbank_agent.comprehend("Explain this to me again."))
describe_roleset('explain.01')
[{'roleset': 'explain.01',
'roles': [('v', 'Explain', [0]),
('arg1', 'this', [1]),
('arg2', 'to me', [2, 3])]}]
explain.01
arg0: explainer
arg1: thing explained
arg2: explained to
[19]:
display(propbank_agent.comprehend("They enjoy visiting New York."))
describe_roleset('visit.01')
describe_roleset('enjoy.01')
[{'roleset': 'enjoy.01',
'roles': [('v', 'enjoy', [1]),
('arg0', 'They', [0]),
('arg1', 'visiting New York', [2, 3, 4])]},
{'roleset': 'visit.01',
'roles': [('v', 'visiting', [2]),
('arg0', 'They', [0]),
('arg1', 'New York', [3, 4])]}]
visit.01
arg0: one party
arg1: other party
enjoy.01
arg0: enjoyer
arg1: thing enjoyed
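In the output above, the two frames are nested: the verb of visit.01 (token 2) falls inside the arg1 filler of enjoy.01 (tokens 2 to 4). As a minimal sketch, such embeddings can be recovered from the token indices alone. The frames are copied from the output above, and verb_index and embedded_frames are hypothetical helpers written for this illustration, not pyfcg functions.

```python
# Frames copied from the output of cell [19].
frames = [
    {'roleset': 'enjoy.01',
     'roles': [('v', 'enjoy', [1]),
               ('arg0', 'They', [0]),
               ('arg1', 'visiting New York', [2, 3, 4])]},
    {'roleset': 'visit.01',
     'roles': [('v', 'visiting', [2]),
               ('arg0', 'They', [0]),
               ('arg1', 'New York', [3, 4])]},
]


def verb_index(frame):
    """Token index of the frame's predicate (first index of its 'v' role)."""
    return next(indices[0] for label, _text, indices in frame['roles'] if label == 'v')


def embedded_frames(frames):
    """Return (outer roleset, role label, inner roleset) triples for every
    frame whose verb lies inside a non-verb role filler of another frame."""
    links = []
    for outer in frames:
        for label, _text, indices in outer['roles']:
            if label == 'v':
                continue
            for inner in frames:
                if inner is not outer and verb_index(inner) in indices:
                    links.append((outer['roleset'], label, inner['roleset']))
    return links


print(embedded_frames(frames))
```

The check only looks at the position of the embedded verb, so it treats a frame as nested even when one of its arguments, here “They”, lies outside the embedding role filler.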
Training a new grammar
Let us now create a second agent, again as an instance of the fcg.PropBankAgent class, but let it learn a new grammar from corpus data instead of loading a pretrained one. After downloading an example CoNLL file, in which a number of English sentences are annotated with PropBank rolesets, we can inspect the first sentence of the file:
[20]:
propbank_agent_learner = fcg.PropBankAgent()
conll_annotations = fcg.load_resource('pb-annotations.conll')
with open(conll_annotations, 'r') as f:
    sentence_end = False
    while not sentence_end:
        line = f.readline()
        if '/' not in line:
            sentence_end = True
        else:
            print(line)
/ 0 0 I / / - - (ARG0*)
/ 0 1 gave / / give give.01 (V*)
/ 0 2 flowers / / - - (ARG1*)
/ 0 3 to / / - - (ARG2*
/ 0 4 my / / - - *
/ 0 5 mother / / - - *)
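The last column of each line encodes role spans in bracket notation: "(ARG2*" opens a span, "*" continues it, and "*)" closes it. The sketch below reads the six lines printed above into the same frame shape that comprehend returns. It assumes the whitespace-separated column layout exactly as displayed (token index in the third column, word in the fourth, roleset in the eighth, span tag in the ninth) and a single predicate per sentence. read_roles is a hypothetical helper for illustration only; it is not how the fcg-propbank subsystem reads the file.

```python
# The first sentence of the example file, as printed above.
lines = [
    "/ 0 0 I / / - - (ARG0*)",
    "/ 0 1 gave / / give give.01 (V*)",
    "/ 0 2 flowers / / - - (ARG1*)",
    "/ 0 3 to / / - - (ARG2*",
    "/ 0 4 my / / - - *",
    "/ 0 5 mother / / - - *)",
]


def read_roles(lines):
    """Collect the roleset and the bracketed role spans of one sentence."""
    roleset, roles, current = None, [], None
    for line in lines:
        cols = line.split()
        index, word, sense, tag = int(cols[2]), cols[3], cols[7], cols[8]
        if sense != '-':
            roleset = sense                      # the predicate's roleset
        if tag.startswith('('):                  # a new span opens on this token
            current = (tag[1:].rstrip('*)').lower(), [], [])
        if current is not None:                  # the token belongs to an open span
            current[1].append(word)
            current[2].append(index)
            if tag.endswith(')'):                # the span closes on this token
                label, words, indices = current
                roles.append((label, ' '.join(words), indices))
                current = None
    return {'roleset': roleset, 'roles': roles}


frame = read_roles(lines)
print(frame)
```

Roles are listed in sentence order here, whereas the output of comprehend shown earlier lists the verb first.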
We now call the agent’s learn_grammar_from_conll_file method. This call initiates the learning process implemented by the fcg-propbank subsystem and equips the agent with the resulting grammar. In this case, the agent has learnt two lexical constructions (for verbs with the lemmas give and send), two word sense constructions (for the rolesets give.01 and send.01), and two argument structure constructions (a double object construction and a prepositional dative construction).
[21]:
propbank_agent_learner.learn_grammar_from_conll_file(conll_annotations)
propbank_agent_learner
[21]:
<Agent: agent (id: agent-26) ~ 6 constructions>
We can inspect the agent’s grammar in the web interface.
[22]:
propbank_agent_learner.grammar.show_in_web_interface()
We can now instruct our agent to comprehend a previously unseen utterance, using the grammar it just learnt, by calling its comprehend method. While comprehending “The King of the Belgians sent a box of chocolates to Forrest Gump.”, the agent identifies an instance of the send.01 (give) roleset, with “The King of the Belgians” as the sender (arg0), “a box of chocolates” as the thing sent (arg1) and “to Forrest Gump” as the sent-to entity (arg2).
[23]:
display(propbank_agent_learner.comprehend('The King of the Belgians sent a box of chocolates to Forrest Gump.'))
describe_roleset('send.01')
[{'roleset': 'send.01',
'roles': [('v', 'sent', [5]),
('arg0', 'The King of the Belgians', [0, 1, 2, 3, 4]),
('arg1', 'a box of chocolates', [6, 7, 8, 9]),
('arg2', 'to Forrest Gump', [10, 11, 12])]}]
send.01
arg0: sender
arg1: sent
arg2: sent-to
Once a grammar has been learnt, it can be saved by calling the save_grammar_image method of the fcg.Agent class. This method writes the grammar to a file in a compiled, binary format that can later be loaded efficiently using an agent’s load_grammar_image method.
[24]:
propbank_agent_learner.save_grammar_image('usage-based-grammar.store')
[24]:
'/Users/katrien/Projects/pyfcg/docs/source/walkthrough_tutorials/usage-based-grammar.store'
[25]:
new_agent = fcg.PropBankAgent()
new_agent.load_grammar_image('usage-based-grammar.store')
new_agent
[25]:
<Agent: agent (id: agent-27) ~ 6 constructions>