Abstract
Within the field of Computational Lexical Semantics, and based on the assumption that computational processes of meaning determination benefit greatly from structured and extensive lexica that provide different types of information, this dissertation presents an analysis of Portuguese verbs of movement. The aim is to determine the semantic and syntactic properties of these lexical items and how this information can be related to the computation and prediction of the structures in which they occur.
The restriction to a specific semantic domain allowed a more accurate determination of the meaning of each verb, through the establishment of lexical-conceptual relations within a relational model of the Lexicon. The lexical semantic analysis of these verbs is based on the meaning specificities that differentiate hyponym verbs from their hyperonyms and sister nodes. The identification of the meaning components shared and those not shared by verbs of the same semantic domain motivates the determination of the relevant semantic information to be stated at the lexical entry level, as well as the structure of this information.
This work puts forth a proposal for a Portuguese wordnet of verbs of movement, referring to the different levels of analysis that are relevant for a coherent encoding of the verbs of this class: the way lexical items are grouped in concept-denoting sets and the relations established between these sets take into account the conceptual and semantic properties of the lexical items, and the resulting organization of the lexicon allows for determining the information that is shared.
The development of a wordnet for Portuguese verbs of movement required the definition of the top nodes of the net as well as of some other coding options, which made it possible to test conceptual inheritance from the higher to the lower nodes in the hierarchy. The resulting network revealed the semantic and syntactic diversity of directly related verbs: semantic properties such as argument structure or Aktionsart properties are directly related to the meaning specificities of the concepts denoted, but are not straightforwardly inherited or conditioned by the semantic domain to which a given verb belongs.
Based on the developed wordnet, a decompositional analysis of the meaning of the Portuguese verbs of movement is presented, focusing on the meaning specificities that differentiate each hyponym concept with regard to its hyperonym. This analysis revealed semantic incorporation patterns different from those considered to work for Romance languages and resulted in the proposal of a new set of semantic components, comprising the elements lexicalized by the verbs under study, and extendable to the analysis of verbs from other semantic domains. The semantic content specific to each hyponym differentiates co-hyponym verbs and explains co-hyponym compatibility: co-hyponyms lexicalizing opposite or otherwise incompatible values for the same semantic element are incompatible (i.e., do not co-occur). The lexicalization of the semantic components considered affects the inheritance of the hyperonym properties to different degrees, namely with respect to argument structure (argument number, subcategorization properties and semantic restrictions on the type of the arguments selected) and Aktionsart properties.
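The co-hyponym compatibility condition stated above can be sketched in a few lines of Python. This is only an illustrative reading, not the dissertation's implementation: the `VerbEntry` class, the `are_compatible` function, the component name `DIRECTION`, and the toy entries for subir, descer and correr are all assumptions introduced here, and "incompatible values" is simplified to "different values for the same component".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VerbEntry:
    """Hypothetical lexical entry: a verb, its hyperonym, and the values it lexicalizes."""
    lemma: str
    hyperonym: str
    # Maps a semantic component (e.g. "MANNER") to the value the verb lexicalizes.
    lexicalized: Dict[str, str] = field(default_factory=dict)


def are_compatible(a: VerbEntry, b: VerbEntry) -> bool:
    """Return True unless the two co-hyponyms lexicalize different values
    for a semantic component they both lexicalize (simplifying assumption)."""
    if a.hyperonym != b.hyperonym:
        raise ValueError("entries are not co-hyponyms")
    shared = a.lexicalized.keys() & b.lexicalized.keys()
    return all(a.lexicalized[c] == b.lexicalized[c] for c in shared)


# Toy entries for illustration only; they are not data from the dissertation.
subir = VerbEntry("subir", "deslocar-se", {"DIRECTION": "up"})
descer = VerbEntry("descer", "deslocar-se", {"DIRECTION": "down"})
correr = VerbEntry("correr", "deslocar-se", {"MANNER": "fast"})
```

Under these assumptions, `are_compatible(subir, descer)` is false because both entries lexicalize a value for the same component and the values differ, while `are_compatible(subir, correr)` is true because the two entries lexicalize different components.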
The following salient patterns of lexicalization were observed: the incorporation of restrictions on the semantic components SOURCE (initial location or position) and GOAL (final location or position) results in an increase in the number of overt arguments of the hyponyms, whereas the lexicalization of these components results in a decrease in the number of overt arguments of the hyponyms, with respect to the hyperonym argument structure. The lexicalization of PATH (intermediate locations between the SOURCE and the GOAL) adds one overt argument to the argument structure of the hyperonym verb, usually corresponding to a GROUND (external object with respect to which the event is put in perspective) argument realized in object position; the incorporation of restrictions on this semantic component also increases the number of overt arguments, through the selection of an overt argument that refers to the PATH of the movement event and is introduced by the preposition por (through).
Aktionsart shifts within the wordnet of Portuguese verbs of movement, i.e., hyponyms that display Aktionsart values different from those of their hyperonyms, occur with the lexicalization of GOAL and SOURCE. The lexicalization of the elements SOURCE and GOAL results in accomplishment or achievement type events, since the determination of a specific final location or position (GOAL) or initial location or position (SOURCE) establishes a limit to the event, shifting an activity type event to an accomplishment or achievement type event.
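As a rough illustration of the shift just described, the following Python sketch derives the Aktionsart values available to a hyponym from the value of its hyperonym and the components the hyponym lexicalizes. The function name `shifted_aktionsart` and the string labels are assumptions made for this example; the sketch covers only the activity case mentioned in the text and leaves every other event type unchanged.

```python
from typing import FrozenSet, Iterable

# Components whose lexicalization sets a limit to the event, per the text.
BOUNDING_COMPONENTS = frozenset({"SOURCE", "GOAL"})


def shifted_aktionsart(hyperonym_type: str, lexicalized: Iterable[str]) -> FrozenSet[str]:
    """Return the event types a hyponym may display (hypothetical helper).

    An activity hyperonym shifts to accomplishment or achievement when the
    hyponym lexicalizes SOURCE or GOAL; otherwise the type is inherited as is.
    """
    if hyperonym_type == "activity" and BOUNDING_COMPONENTS & set(lexicalized):
        return frozenset({"accomplishment", "achievement"})
    return frozenset({hyperonym_type})
```

The function returns a set because the text does not say which of the two bounded types a given verb ends up with; choosing between accomplishment and achievement would need further information that this sketch does not model.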
The representation of lexical items is done within the Generative Lexicon (GL) framework and comprises three distinct levels: argument structure, event structure and qualia structure. Lexical items are integrated in a lexical inheritance structure.
In order to better characterize the Portuguese verbs of movement, specifically with respect to subcategorization properties, the modeling of prepositions in WordNet.PT (WN.PT) and their semantic representation at the lexical entry level in the GL framework are proposed. The integration of prepositions in WN.PT follows previous research on ontological models for the representation of prepositions, namely with respect to the concepts denoted by prepositions that are consensually adopted in traditional grammars and state-of-the-art models. This results in a coherent and unified treatment not only of the semantically full prepositions that introduce verbal arguments but also of argument-marking prepositions.
Using these levels and elements of representation, a complete representation of Portuguese verbs of movement is proposed, accounting for the percolation of information within the lexicon, for the impact of semantic lexicalization in the semantic and syntactic properties of verbs and for verbal co-hyponym compatibility.
The recursive use of available lexical structures allows the percolation of information through the hyponymy trees and enables a coherent and economic codification of the information, including significant subcategorization properties. The resulting lexical structures demonstrate that hyponymy can replace a semantic type lattice in what concerns establishing and defining semantic properties by subtyping strategies. In addition, the permeability granted by the GL model principles, in particular underspecification and co-composition, assures the necessary context flexibility to explain the diversity of syntactic behaviors directly related to lexical semantics properties.
For the definition of a computational lexicon that models the semantic and syntactic properties of lexical items, the integration of informational structures in wordnets is proposed: GL lexical structures provide the structured lexical entries, and WordNet, by its nature, provides the necessary lexical hierarchy that gives access to other structures in the lexicon. The integration of GL representation levels in a wordnet, namely argument structure, qualia structure and event structure, demonstrates how wordnets can support a finer-grained lexical description that provides the basis for accounting for several lexical semantic phenomena, without compromising the architecture of the model.
The integration of argument structure information in WN.PT is achieved through the establishment of three new relations: SELECTS/IS SELECTED BY, INCORPORATES/IS INCORPORATED IN, and SELECTS BY DEFAULT/IS SELECTED BY DEFAULT BY. The integration of qualia roles in wordnets is attained by associating lexical-conceptual relations with qualia roles, without any loss of information, in a simple and low-cost process.
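One way to picture the three relation pairs is as labeled, bidirectional links between synsets. The sketch below is an assumption-laden illustration rather than the WN.PT data model: the `RelationStore` class, its methods, the brace notation for synset identifiers, and the passar/por example are all hypothetical; only the relation names come from the text.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Relation names as given in the text, each paired with its inverse.
INVERSE: Dict[str, str] = {
    "SELECTS": "IS SELECTED BY",
    "INCORPORATES": "IS INCORPORATED IN",
    "SELECTS BY DEFAULT": "IS SELECTED BY DEFAULT BY",
}


class RelationStore:
    """Hypothetical in-memory store that keeps both directions of a relation in sync."""

    def __init__(self) -> None:
        # (source synset, relation name) -> set of target synsets
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, source: str, relation: str, target: str) -> None:
        """Record `source relation target` together with the inverse link."""
        if relation not in INVERSE:
            raise KeyError(f"unknown relation: {relation}")
        self._edges[(source, relation)].add(target)
        self._edges[(target, INVERSE[relation])].add(source)

    def targets(self, source: str, relation: str) -> Set[str]:
        """Return a copy of the synsets linked to `source` by `relation`."""
        return set(self._edges.get((source, relation), set()))


# Illustrative link only; not an entry taken from WN.PT.
store = RelationStore()
store.add("{passar}", "SELECTS", "{por}")
```

Storing the inverse alongside each link means a lookup from the preposition side, such as `store.targets("{por}", "IS SELECTED BY")`, needs no separate bookkeeping.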
The expression of event structure in wordnets is accomplished through a new set of features (Event type, Arguments, Subevents, Restrictions and Head) that encode the internal properties of the events. The systematic representation of event structure information, besides providing the grounds for argument order description, enriches the descriptive power of these resources.
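The five features named above could be held in a simple record. The following Python sketch is one possible encoding under stated assumptions: the `EventStructure` class, the field types, the string notation for subevents and restrictions, and the example values are hypothetical and do not reproduce the encoding used in the dissertation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EventStructure:
    """Hypothetical container for the five event structure features."""
    event_type: str                                         # Event type
    arguments: List[str] = field(default_factory=list)      # Arguments
    subevents: List[str] = field(default_factory=list)      # Subevents
    restrictions: List[str] = field(default_factory=list)   # Restrictions
    head: Optional[str] = None                              # Head

    def is_complex(self) -> bool:
        """An event is treated as complex here if it has more than one subevent."""
        return len(self.subevents) > 1

    def validate(self) -> None:
        """Check that the head, when given, is one of the declared subevents."""
        if self.head is not None and self.head not in self.subevents:
            raise ValueError("head must be one of the subevents")


# Illustrative values only: a bounded movement event with a process and a final state.
arrival = EventStructure(
    event_type="transition",
    arguments=["x", "y"],
    subevents=["e1:process", "e2:state"],
    restrictions=["e1 precedes e2"],
    head="e2:state",
)
arrival.validate()
```

Keeping the arguments as an ordered list is a deliberate choice in this sketch, since the text notes that event structure information provides the grounds for describing argument order.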
The integration of GL representation in wordnets results in richer and more structured repositories of lexical semantic information that include qualia information and allow the extraction of argument structure and event structure information, i.e., generative lexica over which devices such as co-composition, selective binding and coercion can operate. The semantic and syntactic properties considered in the lexical entries of the Portuguese verbs of movement also provided insights into the occurrence restrictions displayed by these verbs in some constructions.
Focusing on the selection of arguments denoting location and GROUND occurring in object position, the expression of directed motion in Portuguese, the occurrence of verbs of movement in middle and non-causative constructions and the distribution of -se in these constructions, this work also presents an analysis of the different behaviors of Portuguese verbs of movement in these contexts and their relation to the lexical semantic properties of the verbs.
Although not accounting exhaustively for all the different behaviors observed, the lexical semantic characterization proposed constitutes a necessary step towards the treatment of the observed phenomena and provides some explanations of different behaviors. Verbs that lexicalize SOURCE & GOAL or PATH select defined GROUND objects, i.e., true arguments denoting concrete and bounded entities, syntactically expressed by NPs. The possibility of occurring in directed motion structures, i.e., with PPs that express the SOURCE and GOAL of the movement, is directly related to the semantic and syntactic properties of the verb in question: verbs of change of location license and/or restrict their co-occurrence with these constituents, according to the semantic elements lexicalized by the verbs and to their subcategorization properties.
Also regarding the expression of directed motion in Portuguese, the data show that the distribution of Portuguese verbs of movement with GOAL-denoting PPs introduced by the preposition a (roughly corresponding to the English preposition to in some contexts) is conditioned not only by the type of movement event denoted by the verb (manner of motion vs. directed motion), but also by Aktionsart properties, since PPs introduced by a induce a punctual aspect interpretation of the final state of the event. These data refute analyses of verbs of movement in Romance languages based solely on co-occurrence restrictions with the preposition a. The correlation between the prominence of an external cause or agent and the impossibility of occurring in non-causative constructions accounts for the distribution of verbs of movement in these constructions: verbs that lexicalize INTENTION or a strong MANNER component implying the action of an external cause or agent do not enter non-causative constructions.
The analysis of the distribution of -se in middle, non-causative and passive constructions led to the hypothesis that -se induces the interpretation that an external actor is involved in the denoted event: passives with -se necessarily entail an external cause and thus require the presence of the clitic; in middle constructions, the clitic marks the case where the involvement of an external actor in the denoted event is entailed; and, in non-causative constructions, the clitic marks the correlation between the agent and theme/patient participants of the event, forcing the non-causative reading with [-animated] syntactic subjects.
From this work, it is apparent that the modeling of lexical items of a given POS is not independent from that of items of other POS with which they may occur, which necessarily extended the scope of the analysis presented here. Moreover, it is demonstrated that modeling lexical items in the WordNet model, establishing a motivated lexical-conceptual inheritance structure, allows for an economical and adequate description of lexical items and enables the construction of large-scale lexical resources suitable for computational purposes.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 23 Apr 2010 |
Place of Publication | Lisboa |
Publication status | Published - 2010 |