Abstract (english) | This thesis investigates whether modern computer models can confirm how people encounter words, and then applies these findings to didactics. In recent years, computers have been used in psycholinguistics as well as in other areas of linguistics. The computer models used in psycholinguistics build on the distributional hypothesis that Harris proposed in 1954. For this reason, such computer models are called distributional semantic models (DSMs). According to Harris’s hypothesis, words that have similar meanings occur in similar contexts. Harris illustrated this with the words oculist and eye-doctor: they almost always occur in the same contexts and are interchangeable, i.e. they are synonyms. On the other hand, the word lawyer cannot be used in all contexts where the word oculist or eye-doctor is used, but only in some. Harris concludes that this is due to frequency. In other words, whether it is possible to substitute the word oculist for the word lawyer in a certain context depends on how often both words occur in the same or a similar context. While computer models used in computer science are being developed at a rapid pace, psycholinguistic research mainly uses models developed in the 1990s. Nevertheless, more modern computer models have recently been used in psycholinguistics. We describe the commonly used models LSA, HAL, BEAGLE and word2vec. Landauer, Foltz and Laham (1998) defined the LSA model as an automatic mathematical and statistical technique used to make inferences and establish relationships between words in context. The context can be a sentence, a paragraph, or the entire text. LSA does not use dictionaries, knowledge bases, semantic networks, grammars, syntactic parsers or anything else created by humans. The model takes raw text as input. HAL is the second prominent distributional semantic model; Kevin Lund and Curt Burgess have worked on it since 1992. 
HAL has many similarities with the earlier LSA model but differs from it in two major ways. First, the two models have a different conception of context. In LSA, each element of the co-occurrence matrix pairs a word with a multi-word unit (a passage, part of speech, whole text, or anything else that a researcher finds useful for his or her work). In HAL, by contrast, each element records the co-occurrence of two words. Second, while in LSA all words within the same context are equally important, in HAL each word in the context has a different weighting factor, so the words are not equally important (Burgess, Livesay and Lund, 1998). In contrast to these two models, Jones, Kintsch and Mewhort (2006) propose a different model that combines LSA and HAL. Their model stores the collected information in a composite holographic lexicon. The context includes discrete sentences, i.e. complete syntactic structures. They called their model BEAGLE, which is an abbreviation for Bound Encoding of the Aggregate Language Environment. Word2vec is a second-generation model. It is not a single monolithic algorithm but consists of two separate models: the continuous skip-gram model and the continuous bag-of-words model (CBOW) (Mikolov et al., 2013a and 2013b). The earlier models (LSA, HAL and BEAGLE) are considered count models, while word2vec is considered a predictive model (Baroni, Dinu and Kruszewski, 2014). However, a common feature of word2vec and HAL is that both models move a window of a certain size, centred on a word, through the corpus (Mandera, Keuleers and Brysbaert, 2017). CBOW aims to predict the central word from its context words. The skip-gram model, on the other hand, predicts the words that occur before or after the central word within a certain distance of it. Corpora are the basis for any DSM. They are the primary source of information on word distributions (Lenci, 2008). Any inferences that have been reached are entirely corpus-based. 
Therefore, in this study we have given a brief overview of what should be considered when using a corpus in DSM research. The mental lexicon is a “storehouse” of words with all their important phonetic, phonological, morphological, syntactic and semantic features. Individuals actively use it every day, so the data it contains are always up to date. It grows constantly as the individual learns and adopts new words (Erdeljac, 2009). Researchers have addressed two questions in studies of the bilingual mental lexicon. The first question was: Are the lexicons of the two languages that bilingual speakers speak separate or shared? In other words, are the units of the lexicon stored in two separate lexicons or in one shared lexicon? Many experiments have been conducted to understand and describe the organisation of the mental lexicon. Researchers now believe that the mental lexicon of a bilingual speaker is shared between the two languages (de Groot, 2011). The second question was: Is access to the mental lexicon of bilingual speakers selective or non-selective, i.e. are both lexicons always active? Many experiments have led to the conclusion that access to the mental lexicon is non-selective (de Groot, 2011). We want to assess whether DSMs (more precisely, word2vec) reflect how the mental lexicon of a bilingual speaker functions. Hence, we have summarized different theories of the mental lexicon. One major model of the bilingual mental lexicon is the Revised Hierarchical Model (RHM). It is based on the work of Kroll and Stewart (1994). They assume that in the early stages of L2 learning, a bilingual relies largely on L1 translations, i.e. on accessing the L2 meanings indirectly through lexical connections. As a bilingual becomes more proficient in L2, he or she increasingly relies on conceptual relations, even though the lexical ones are never completely deactivated. 
Kroll and Stewart further emphasise that the lexical links between lexical units and languages and the conceptual links between separate languages and conceptual memory go both ways but differ in strength. Another important model of the bilingual mental lexicon is the Bilingual Interactive Activation Plus Model (BIA+) (van Heuven and Dijkstra, 2002). BIA+ extends the assumptions of the BIA model about orthographic representations to phonological and semantic representations. Orthographic representations excite phonological representations and these, in turn, excite representations at the next higher level and so on. However, this activation depends on subjective frequency, which suggests that recognition of phonological and semantic representations will occur later in L2 than in L1. Van Heuven and Dijkstra call this phenomenon the Temporal Delay Assumption. It has two consequences: (i) the interlingual effects will be greater in the direction from L1 to L2 than vice versa; and (ii) interlingual phonological and semantic effects may be absent when the task allows a faster response to orthographic and other similar codes. Many models of the mental lexicon have drawn conclusions from experiments with a primed lexical decision task. The key problem in previous research with this task was that the analysis of the results, based on prime and target lists generated by the researchers themselves, was always done post hoc with DSMs. Thus, only those studies that showed statistically significant effects were considered. We want to determine the predictive power of DSMs as one of the variables that can be manipulated during the experiment. Therefore, this study went in the opposite direction: we first generated a list of prime words with word2vec. The prime-target pairs were then tested in a primed lexical decision task focusing on German L2 speakers. 
We have chosen this type of task because it shows stronger effects than, for instance, the naming task (Lucas, 2000; Hutchison, 2003; Brysbaert et al., 2014). This study was based on the following hypotheses: H1: The reaction times for the targets that word2vec offers as most similar to the concepts should reflect the reaction times for the words collected through discrete free associations. This is consistent with the study by Günther, Rinaldi and Marelli (2019), who argue that a DSM should not outperform humans in solving behavioural tasks. H2: In the primed lexical decision task, participants will respond fastest to the words that are most similar to the prime. This is consistent with the BIA+ model. Similar words are less distant from each other and are therefore activated more quickly. H3: The reaction time for German targets is shorter when they are preceded by a Croatian prime. This is consistent with the RHM. Since bilingual speakers have a richer semantic representation of words in L1 than in L2, words in L1 are activated faster than in L2. It is common for students to take part in language research, as they are the most accessible to researchers. Henry (2008) points out that this population is the most convenient for researchers in terms of money and time. Moreover, it is difficult to find enough participants who will spend time on such research. Nevertheless, because of the situation caused by the global COVID-19 pandemic, we had to look for participants outside the usual places where participants are recruited, such as schools and universities. We conducted the experiments using OpenSesame (Mathôt, Schreij, and Theeuwes, 2012) with PsychoPy (Peirce et al., 2019) as the backend. The software records the speed and accuracy of responses. We created two word lists for both experiments. Each list consisted of 80 pairs of prime and target words. The targets were pseudowords in half of the pairs. 
The pseudowords conformed to the phonotactic and orthographic rules of Croatian and German, respectively. They were created with the programme Wuggy (Keuleers and Brysbaert, 2010). The other half consisted of pairs in which each target word was assigned one prime that was semantically associated with it and another prime that was not (Gulan, 2016; McNamara, 2005). The first step in creating associative-semantically connected word pairs was to identify a concept. Each concept was a concrete noun. Subsequently, we identified the word that best corresponded to this concept in Croatian and German. In the first experiment, the associated primes were selected from a list of discrete associations collected by Vezmar (2017) and De Deyne et al. (2018), and in the second experiment, the associated primes were determined using word2vec. Fifty-four participants took part in the first experiment. This experiment confirmed that the participants recognized semantically associated word pairs faster than pairs that were not associated. This is consistent with previous findings (Perea, Duñabeitia, and Carreiras, 2008; Guasch et al., 2011; Bogunović and Ćoso, 2018). BIA+ can explain this phenomenon. Since this model assumes that word meanings are associated with nodes in a shared semantic or lexical network, the recognition of a particular word activates a corresponding node in the network. The activation then automatically spreads further in the network by activating nearby nodes. The closer the nodes are, the faster they are recognized. Regarding language direction, target words seemed to be recognized more quickly when the prime was in German. Huynh and Witzel (2018) claim that the current literature on semantically associated priming is inconclusive. Unbalanced bilinguals are assumed to have a richer semantic representation in L1 than in L2 (Duyck and Brysbaert, 2004; Schoonbaert et al., 2009). 
This means that a word in the participant's L1 activates more nearby conceptual nodes in the semantic network than a word in L2. Therefore, target words in L2 are recognized faster if an L1 prime precedes them. In contrast, balanced bilingual participants activate an equal number of close conceptual nodes in the semantic network for both languages (Guasch et al., 2011). That target words in Croatian were recognized faster when they were preceded by a German prime can be explained by the ghosting effect (Finkbeiner, 2005) or the iconic persistence effect (Coltheart, 1980). This effect was caused by a blank screen of 250 ms occurring between the prime and the target word. Because of it, the prime appears to be displayed longer than it actually is. Bilingual participants use this extra time to process the prime, which then affects the recognition of the target word (Huynh and Witzel, 2018). Hence, our participants had sufficient time to process the primes and therefore responded more quickly to L1 target words. Forty-four participants took part in the second experiment. We used the same target words as in the first experiment. The results confirmed those of the first experiment. Participants recognized target words faster if an associated prime preceded them. The results of the first experiment in terms of the direction of priming were also replicated. Participants recognized target words faster when the targets were in their L1 and the primes in L2. The cosine similarity between words proved to be relevant in determining the priming effect in the second experiment. There was no statistically significant difference between the first and second experiment. This is consistent with the idea proposed by Günther, Rinaldi, and Marelli (2019). They argue that DSMs must be able to complete tasks performed by humans but should not outperform them, although DSMs sometimes do achieve better results than humans (Mandera, Keuleers, and Brysbaert, 2017). 
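The cosine similarity mentioned above can be illustrated with a minimal Python sketch. The three-dimensional vectors and the helper rank_primes are hypothetical and serve only as an illustration; they are not the word2vec embeddings or the selection procedure used in this study.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors (assumption): real word2vec vectors typically
# have a few hundred dimensions and are learned from a corpus.
toy_vectors = {
    "dog": [0.9, 0.1, 0.3],
    "cat": [0.8, 0.2, 0.4],
    "table": [0.1, 0.9, 0.2],
}


def rank_primes(target, candidates, vectors):
    """Sort candidate primes by decreasing cosine similarity to the target."""
    return sorted(
        candidates,
        key=lambda word: cosine_similarity(vectors[target], vectors[word]),
        reverse=True,
    )


# Candidates nearer the front of the list are closer to the target
# in the toy vector space.
ranking = rank_primes("dog", ["cat", "table"], toy_vectors)
```

In this toy space, a candidate whose vector points in nearly the same direction as the target's receives a similarity close to 1, while a candidate pointing in an unrelated direction receives a value close to 0.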
Finally, we have outlined how the insights gained from primed lexical decision tasks can be applied to foreign language methodology and didactics. We addressed how the findings and results of previous studies relate to our findings. We have also summarized the results of this research into recommendations that can help teachers in their everyday work. |