Adversarial Training for Unsupervised Bilingual Lexicon Induction

ACL, pp. 1959-1970, 2017.

Keywords: IJCNLP, bilingual word embeddings, low resource, NAACL, NIPS

Abstract:

Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form ...

Introduction
  • As the word is the basic unit of a language, improving its representation has a significant impact on various natural language processing tasks.
  • Following their success on monolingual tasks, the potential of word embeddings for cross-lingual natural language processing has attracted much attention.
  • In their pioneering work, Mikolov et al. (2013a) observe that word embeddings trained separately on monolingual corpora exhibit isomorphic structure across languages, as illustrated in Figure 1; a minimal sketch of the seed-based linear mapping built on this observation follows this list.
  • This has far-reaching implications for low-resource scenarios (Daume III and Jagarlamudi, 2011; Irvine and Callison-Burch, 2013), because word embeddings require only plain text to train, the most abundant form of linguistic resource.
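
The translation matrix (TM) baseline reported in Tables 2, 4, and 5 instantiates this seed-based mapping. Below is a minimal sketch, not the authors' code: the plain least-squares solver, the function names, and the cosine retrieval step are illustrative assumptions.

```python
# Minimal sketch of the seed-supervised "translation matrix" idea of
# Mikolov et al. (2013a): given seed pairs of source/target embeddings,
# fit a linear map W that sends each source vector close to its translation.
import numpy as np

def fit_translation_matrix(X_seed, Y_seed):
    """X_seed, Y_seed: (n_seeds, d) arrays of aligned source/target embeddings."""
    # Solve min_W ||X_seed W - Y_seed||_F^2 in closed form.
    W, *_ = np.linalg.lstsq(X_seed, Y_seed, rcond=None)
    return W  # (d, d)

def rank_translations(x, W, Y_vocab):
    """Map one source embedding and rank all target words by cosine similarity."""
    mapped = x @ W
    sims = (Y_vocab @ mapped) / (
        np.linalg.norm(Y_vocab, axis=1) * np.linalg.norm(mapped) + 1e-8)
    return np.argsort(-sims)  # indices of candidate translations, best first
```

The contribution of the paper is to learn such a map without any seed pairs, by matching the two embedding distributions adversarially instead.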
Highlights
  • As the word is the basic unit of a language, improving its representation has a significant impact on various natural language processing tasks
  • Model 3: adversarial autoencoder. As another way to relax the orthogonal constraint, we introduce the adversarial autoencoder (Makhzani et al., 2015), depicted in Figure 2(c); a minimal sketch of the underlying adversarial matching follows this list
  • We evaluate the quality of the cross-lingual embedding transformation on the bilingual lexicon induction task
  • For this set of experiments, the data for training word embeddings comes from Wikipedia comparable corpora
  • We demonstrate the feasibility of connecting word embeddings of different languages without any cross-lingual signal
  • Our work is likely to benefit from advances in techniques that further stabilize adversarial training
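
The models share a common adversarial core: a generator maps source embeddings into the target embedding space, and a discriminator is trained to tell mapped source vectors from real target vectors, which pushes the two distributions together. The sketch below illustrates only this shared setup in PyTorch; the layer sizes, optimizers, learning rates, and loss form are assumptions rather than the paper's settings, and Model 3 would additionally attach a decoder with a reconstruction loss.

```python
# Hedged sketch of adversarial distribution matching between embedding spaces.
# G (a linear map) plays the generator; D plays the discriminator.
import torch
import torch.nn as nn

d = 50                                            # embedding dimensionality (illustrative)
G = nn.Linear(d, d, bias=False)                   # generator: source space -> target space
D = nn.Sequential(nn.Linear(d, 500), nn.ReLU(),
                  nn.Linear(500, 1), nn.Sigmoid())  # discriminator: real target vs. mapped source
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCELoss()

def train_step(x_batch, y_batch):
    """x_batch: (B, d) source embeddings; y_batch: (B, d) target embeddings."""
    # Discriminator step: label target embeddings as real, mapped source as fake.
    opt_D.zero_grad()
    fake = G(x_batch).detach()
    loss_D = bce(D(y_batch), torch.ones(len(y_batch), 1)) + \
             bce(D(fake), torch.zeros(len(fake), 1))
    loss_D.backward()
    opt_D.step()
    # Generator step: update G so the discriminator mistakes mapped source for real.
    opt_G.zero_grad()
    loss_G = bce(D(G(x_batch)), torch.ones(len(x_batch), 1))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```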
Methods
  • The authors evaluate the quality of the cross-lingual embedding transformation on the bilingual lexicon induction task.
  • The authors run the MonoGiza system as recommended by the toolkit.
  • MonoGiza can also utilize monolingual embeddings (Dou et al., 2015); in this case, the authors use the same embeddings as the input to their approach.
  • For this set of experiments, the data for training word embeddings comes from Wikipedia comparable corpora.
  • For Turkish-English, the authors build a set of ground-truth translation pairs in the same way as the seed word translation pairs are obtained from Google Translate, as described above; a minimal sketch of the accuracy computation against such ground truth follows this list
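
The top-1 accuracies reported in Tables 2, 4, and 5 check whether the nearest target word of each mapped source word is a correct translation according to the ground truth. The sketch below is a hypothetical illustration of that evaluation; the cosine retrieval and the representation of the gold lexicon as sets of correct target indices are assumptions.

```python
# Hypothetical sketch of top-k bilingual lexicon induction accuracy:
# map each test source word, retrieve its k nearest target words by cosine
# similarity, and count a hit if any gold translation is among them.
import numpy as np

def topk_accuracy(X_src, Y_tgt, gold, W, k=1):
    """X_src: (n_test, d) source embeddings; Y_tgt: (V, d) target vocabulary
    embeddings; gold: list of sets of correct target indices; W: (d, d) map."""
    Y_norm = Y_tgt / (np.linalg.norm(Y_tgt, axis=1, keepdims=True) + 1e-8)
    hits = 0
    for x, answers in zip(X_src, gold):
        mapped = x @ W
        mapped = mapped / (np.linalg.norm(mapped) + 1e-8)
        top = np.argsort(-(Y_norm @ mapped))[:k]
        hits += bool(answers & set(top.tolist()))
    return hits / len(gold)
```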
Results
  • As shown in Table 4, the MonoGiza baseline still does not work well on these language pairs, while the approach achieves much better performance.
  • The accuracies are high for Spanish-English and Italian-English, likely because they are closely related languages, and their embedding spaces may exhibit stronger isomorphism.
  • Performance on Japanese-Chinese is lower, on a comparable level with Chinese-English, as these languages are relatively distantly related.
  • Turkish-English represents a low-resource scenario, and the lexical semantic structure may be insufficiently captured by the embeddings.
  • The agglutinative nature of Turkish can add to the challenge
Conclusion
  • The authors demonstrate the feasibility of connecting word embeddings of different languages without any cross-lingual signal.
  • This is achieved by matching the distributions of the transformed source language embeddings and target ones via adversarial training.
  • Future work includes investigating other divergences that adversarial training can minimize (Nowozin et al, 2016), and broader mathematical tools that match distributions (Mohamed and Lakshminarayanan, 2016)
Tables
  • Table1: Statistics of the non-parallel corpora. Language codes: zh = Chinese, en = English, es = Spanish, it = Italian, ja = Japanese, tr = Turkish
  • Table2: Chinese-English top-1 accuracies of the MonoGiza baseline and our models, along with the translation matrix (TM) and isometric alignment (IA) methods that utilize 50 and 100 seeds
  • Table3: Top-5 English translation candidates proposed by our approach for some Chinese words. The ground truth is marked in bold
  • Table4: Top-1 accuracies (%) of the MonoGiza baseline and our approach on Spanish-English, Italian-English, Japanese-Chinese, and Turkish-English. The results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds are also listed
  • Table5: Top-1 accuracies (%) of our approach to inducing bilingual lexica for Chinese-English from Wikipedia and Gigaword. Also listed are results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds
  • Table6: Top-5 accuracies (%) of 5k and 10k most frequent words in the French-English setting. The figures for the baselines are taken from (Cao et al., 2016)
Related work
  • 5.1 Cross-Lingual Word Embeddings for Bilingual Lexicon Induction

    Inducing bilingual lexica from non-parallel data is a long-standing cross-lingual task. Except for the decipherment approach, traditional statistical methods all require cross-lingual signals (Rapp, 1999; Koehn and Knight, 2002; Fung and Cheung, 2004; Gaussier et al., 2004; Haghighi et al., 2008; Vulić et al., 2011; Vulić and Moens, 2013).

    Recent advances in cross-lingual word embeddings (Vulić and Korhonen, 2016; Upadhyay et al., 2016) have rekindled interest in bilingual lexicon induction. Like their traditional counterparts, these embedding-based methods require cross-lingual signals encoded in parallel data, aligned at the document level (Vulić and Moens, 2015), sentence level (Zou et al., 2013; Chandar A P et al., 2014; Hermann and Blunsom, 2014; Kocisky et al., 2014; Gouws et al., 2015; Luong et al., 2015; Coulmance et al., 2015; Oshikiri et al., 2016), or word level, i.e. a seed lexicon (Gouws and Søgaard, 2015; Wick et al., 2016; Duong et al., 2016; Shi et al., 2015; Mikolov et al., 2013a; Dinu et al., 2015; Lazaridou et al., 2015; Faruqui and Dyer, 2014; Lu et al., 2015; Ammar et al., 2016; Zhang et al., 2016a, 2017; Smith et al., 2017). In contrast, our work completely removes the need for cross-lingual signals to connect monolingual word embeddings trained on non-parallel text corpora. [Footnote 12: As a confirmation, we ran MonoGiza in this setting and obtained comparable performance as reported.]

    As one of our baselines, the method by Cao et al. (2016) also does not require cross-lingual signals to train bilingual word embeddings. It modifies the objective for training embeddings, whereas our approach uses monolingual embeddings trained beforehand and held fixed. More importantly, its learning mechanism is substantially different from ours: it encourages word embeddings from different languages to lie in a shared semantic space by matching the mean and variance of the hidden states, which are assumed to follow a Gaussian distribution, an assumption that is hard to justify. Our approach makes no such distributional assumption and directly matches the mapped source embedding distribution with the target distribution by adversarial training.
Funding
  • This work is supported by the National Natural Science Foundation of China (No. 61522204), the 973 Program (2014CB340501), and the National Natural Science Foundation of China (No. 61331013)
  • This research is also supported by the Singapore National Research Foundation under its International Research Centre@Singapore Funding Initiative and administered by the IDM Programme
References
  • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, and Noah A. Smith. 2016. Massively Multilingual Word Embeddings. arXiv:1602.01925 [cs]. http://arxiv.org/abs/1602.01925.
  • Martin Arjovsky and Leon Bottou. 2017. Towards Principled Methods for Training Generative Adversarial Networks. In ICLR. http://arxiv.org/abs/1701.04862.
  • Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2016. Learning principled bilingual mappings of word embeddings while preserving monolingual invariance. In EMNLP. http://aclanthology.info/papers/learning-principled-bilingual-mappings-ofword-embeddings-while-preserving-monolingualinvariance.
  • Antonio Valerio Miceli Barone. 2016. Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders. In Proceedings of the 1st Workshop on Representation Learning for NLP. https://doi.org/10.18653/v1/W16-1614.
  • Chris M. Bishop. 1995. Training with Noise is Equivalent to Tikhonov Regularization. Neural Computation. https://doi.org/10.1162/neco.1995.7.1.108.
  • Hailong Cao, Tiejun Zhao, Shu Zhang, and Yao Meng. 2016. A Distribution-based Model to Learn Bilingual Word Embeddings. In COLING. http://aclanthology.info/papers/a-distributionbased-model-to-learn-bilingual-word-embeddings.
  • Sarath Chandar A P, Stanislas Lauly, Hugo Larochelle, Mitesh Khapra, Balaraman Ravindran, Vikas C Raykar, and Amrita Saha. 2014. An Autoencoder Approach to Learning Bilingual Word Representations. In NIPS. http://papers.nips.cc/paper/5270-an-autoencoder-approach-to-learningbilingual-word-representations.pdf.
  • Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, and Kilian Weinberger. 2016. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification. arXiv:1606.01614 [cs]. http://arxiv.org/abs/1606.01614.
  • Jocelyn Coulmance, Jean-Marc Marty, Guillaume Wenzek, and Amine Benhalloum. 2015. Trans-gram, Fast Cross-lingual Word-embeddings. In EMNLP. http://aclanthology.info/papers/transgram-fast-cross-lingual-word-embeddings.
  • Hal Daume III and Jagadeesh Jagarlamudi. 2011. Domain adaptation for machine translation by mining unseen words. In ACL-HLT. http://aclweb.org/anthology/P11-2071.
  • Georgiana Dinu, Angeliki Lazaridou, and Marco Baroni. 2015. Improving Zero-Shot Learning by Mitigating the Hubness Problem. In ICLR Workshop. http://arxiv.org/abs/1412.6568.
  • Qing Dou and Kevin Knight. 2012. Large scale decipherment for out-of-domain machine translation. In EMNLP-CoNLL. http://aclweb.org/anthology/D12-1025.
  • Qing Dou and Kevin Knight. 2013. Dependency-Based Decipherment for Resource-Limited Machine Translation. In EMNLP. http://aclanthology.info/papers/dependency-based-decipherment-forresource-limited-machine-translation.
  • Qing Dou, Ashish Vaswani, Kevin Knight, and Chris Dyer. 2015. Unifying Bayesian Inference and Vector Space Models for Improved Decipherment. In ACL-IJCNLP. http://www.aclweb.org/anthology/P15-1081.
  • Long Duong, Hiroshi Kanayama, Tengfei Ma, Steven Bird, and Trevor Cohn. 2016. Learning Crosslingual Word Embeddings without Bilingual Corpora. In EMNLP. http://aclanthology.info/papers/learning-crosslingual-word-embeddingswithout-bilingual-corpora.
  • Manaal Faruqui and Chris Dyer. 2014. Improving Vector Space Word Representations Using Multilingual Correlation. In EACL. http://aclanthology.info/papers/improving-vectorspace-word-representations-using-multilingualcorrelation.
  • Pascale Fung and Percy Cheung. 2004. Mining Very-Non-Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and EM. In EMNLP. http://aclweb.org/anthology/W04-3208.
  • Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Francois Laviolette, Mario Marchand, and Victor Lempitsky. 2016. Domain-Adversarial Training of Neural Networks. Journal of Machine Learning Research. http://jmlr.org/papers/v17/15-239.html.
  • Eric Gaussier, J.M. Renders, I. Matveeva, C. Goutte, and H. Dejean. 2004. A Geometric View on Bilingual Lexicon Extraction from Comparable Corpora. In ACL. https://doi.org/10.3115/1218955.1219022.
  • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In NIPS. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
  • Stephan Gouws, Yoshua Bengio, and Greg Corrado. 2015. BilBOWA: Fast Bilingual Distributed Representations without Word Alignments. In ICML. http://jmlr.org/proceedings/papers/v37/gouws15.html.
  • Stephan Gouws and Anders Søgaard. 2015. Simple task-specific bilingual word embeddings. In NAACL-HLT. http://www.aclweb.org/anthology/N15-1157.
  • Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, and Dan Klein. 2008. Learning Bilingual Lexicons from Monolingual Corpora. In ACL-HLT. http://aclanthology.info/papers/learning-bilinguallexicons-from-monolingual-corpora.
  • Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tieyan Liu, and Wei-Ying Ma. 2016. Dual Learning for Machine Translation. In NIPS. http://papers.nips.cc/paper/6469-dual-learning-formachine-translation.pdf.
  • Karl Moritz Hermann and Phil Blunsom. 2014. Multilingual Distributed Representations without Word Alignment. In ICLR. http://arxiv.org/abs/1312.6173.
  • Ann Irvine and Chris Callison-Burch. 2013. Combining bilingual and comparable corpora for low resource machine translation. In Proceedings of the Eighth Workshop on Statistical Machine Translation. http://aclweb.org/anthology/W13-2233.
  • Diederik Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 [cs]. http://arxiv.org/abs/1412.6980.
  • Philipp Koehn and Kevin Knight. 2002. Learning a Translation Lexicon from Monolingual Corpora. In ACL Workshop on Unsupervised Lexical Acquisition. https://doi.org/10.3115/1118627.1118629.
  • Tomas Kocisky, Karl Moritz Hermann, and Phil Blunsom. 2014. Learning Bilingual Word Representations by Marginalizing Alignments. In ACL. http://aclanthology.info/papers/learning-bilingualword-representations-by-marginalizing-alignments.
  • Angeliki Lazaridou, Georgiana Dinu, and Marco Baroni. 2015. Hubness and Pollution: Delving into Cross-Space Mapping for Zero-Shot Learning. In ACL-IJCNLP. https://doi.org/10.3115/v1/P15-1027.
  • Ang Lu, Weiran Wang, Mohit Bansal, Kevin Gimpel, and Karen Livescu. 2015. Deep Multilingual Correlation for Improved Word Embeddings. In NAACL-HLT. http://aclanthology.info/papers/
  • Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Bilingual Word Representations with Monolingual Quality in Mind. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. http://aclanthology.info/papers/bilingual-wordrepresentations-with-monolingual-quality-in-mind.
  • Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. 2015. Adversarial Autoencoders. arXiv:1511.05644 [cs]. http://arxiv.org/abs/1511.05644.
  • Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2016. Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections. arXiv:1612.00188 [cs]. http://arxiv.org/abs/1612.00188.
  • Tomas Mikolov, Quoc V. Le, and Ilya Sutskever. 2013a. Exploiting Similarities among Languages for Machine Translation. arXiv:1309.4168 [cs]. http://arxiv.org/abs/1309.4168.
  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013b. Distributed Representations of Words and Phrases and their Compositionality. In NIPS. http://papers.nips.cc/paper/5021-distributed-representations-of-wordsand-phrases-and-their-compositionality.pdf.
  • Shakir Mohamed and Balaji Lakshminarayanan. 2016. Learning in Implicit Generative Models. arXiv:1610.03483 [cs, stat]. http://arxiv.org/abs/1610.03483.
  • Sebastian Nowozin, Botond Cseke, and Ryota Tomioka. 2016. f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization. arXiv:1606.00709 [cs, stat]. http://arxiv.org/abs/1606.00709.
  • Takamasa Oshikiri, Kazuki Fukui, and Hidetoshi Shimodaira. 2016. Cross-Lingual Word Representations via Spectral Graph Embeddings. In ACL. https://doi.org/10.18653/v1/P16-2080.
  • Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv:1511.06434 [cs]. http://arxiv.org/abs/1511.06434.
  • Reinhard Rapp. 1999. Automatic Identification of Word Translations from Unrelated English and German Corpora. In ACL. https://doi.org/10.3115/1034678.1034756.
  • Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved Techniques for Training GANs. In NIPS. http://papers.nips.cc/paper/6125-improvedtechniques-for-training-gans.pdf.
  • Tianze Shi, Zhiyuan Liu, Yang Liu, and Maosong Sun. 2015. Learning Cross-lingual Word Embeddings via Matrix Co-factorization. In ACL-IJCNLP. http://aclanthology.info/papers/learning-cross-lingualword-embeddings-via-matrix-co-factorization.
  • Samuel Smith, David Turban, Steven Hamblin, and Nils Hammerla. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. In ICLR. http://arxiv.org/abs/1702.03859.
  • Casper Kaae Sønderby, Jose Caballero, Lucas Theis, Wenzhe Shi, and Ferenc Huszar. 2016. Amortised MAP Inference for Image Super-resolution. arXiv:1610.04490 [cs, stat]. http://arxiv.org/abs/1610.04490.
  • Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research. http://www.jmlr.org/papers/v15/srivastava14a.html.
  • Stephanie Strassel and Jennifer Tracey. 2016. LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages. In LREC. http://www.lrec-conf.org/proceedings/lrec2016/pdf/1138_Paper.pdf.
  • Shyam Upadhyay, Manaal Faruqui, Chris Dyer, and Dan Roth. 2016. Cross-lingual Models of Word Embeddings: An Empirical Comparison. In ACL. http://aclanthology.info/papers/cross-lingual-models-ofword-embeddings-an-empirical-comparison.
  • Laurens Van der Maaten, Minmin Chen, Stephen Tyree, and Kilian Weinberger. 2013. Learning with Marginalized Corrupted Features. In ICML. http://www.jmlr.org/proceedings/papers/v28/vandermaaten13.html.
  • Ivan Vulić and Anna Korhonen. 2016. On the Role of Seed Lexicons in Learning Bilingual Word Embeddings. In ACL. http://aclanthology.info/papers/on-the-role-of-seed-lexicons-in-learningbilingual-word-embeddings.
  • Ivan Vulić and Marie-Francine Moens. 2013. Cross-Lingual Semantic Similarity of Words as the Similarity of Their Semantic Word Responses. In NAACL-HLT. http://aclanthology.info/papers/cross-lingual-semantic-similarity-of-words-as-thesimilarity-of-their-semantic-word-responses.
  • Ivan Vulić and Marie-Francine Moens. 2015. Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction. In ACL-IJCNLP. http://aclanthology.info/papers/bilingual-wordembeddings-from-non-parallel-document-aligneddata-applied-to-bilingual-lexicon-induction.
  • Chao Xing, Dong Wang, Chao Liu, and Yiye Lin. 2015. Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation. In NAACL-HLT. http://aclanthology.info/papers/normalized-word-embedding-and-orthogonaltransform-for-bilingual-word-translation.
  • Hyejin Youn, Logan Sutton, Eric Smith, Cristopher Moore, Jon F. Wilkins, Ian Maddieson, William Croft, and Tanmoy Bhattacharya. 2016. On the universal structure of human lexical semantics. Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.1520752113.
  • Meng Zhang, Yang Liu, Huanbo Luan, Yiqun Liu, and Maosong Sun. 2016a. Inducing Bilingual Lexica From Non-Parallel Data With Earth Mover's Distance Regularization. In COLING. http://aclanthology.info/papers/inducing-bilinguallexica-from-non-parallel-data-with-earth-mover-sdistance-regularization.
  • Meng Zhang, Haoruo Peng, Yang Liu, Huanbo Luan, and Maosong Sun. 2017. Bilingual Lexicon Induction From Non-Parallel Data With Minimal Supervision. In AAAI. http://thunlp.org/~zm/publications/aaai2017.pdf.
  • Yuan Zhang, David Gaddy, Regina Barzilay, and Tommi Jaakkola. 2016b. Ten Pairs to Tag – Multilingual POS Tagging via Coarse Mapping between Embeddings. In NAACL-HLT. http://aclanthology.info/papers/ten-pairs-to-tagmultilingual-pos-tagging-via-coarse-mappingbetween-embeddings.
  • Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. 2013. Bilingual Word Embeddings for Phrase-Based Machine Translation. In EMNLP. http://aclanthology.info/papers/bilingual-word-embeddings-for-phrase-basedmachine-translation.
  • Ivan Vulić, Wim De Smet, and Marie-Francine Moens. 2011. Identifying Word Translations from Comparable Corpora Using Latent Topic Models. In ACL-HLT. http://aclanthology.info/papers/identifying-word-translations-from-comparablecorpora-using-latent-topic-models.
  • Stefan Wager, Sida Wang, and Percy S. Liang. 2013. Dropout Training as Adaptive Regularization. In NIPS. http://papers.nips.cc/paper/4882-dropouttraining-as-adaptive-regularization.pdf.
  • Michael Wick, Pallika Kanani, and Adam Pocock. 2016. Minimally-Constrained Multilingual Embeddings via Artificial Code-Switching. In AAAI. http://www.aaai.org/Conferences/AAAI/2016/Papers/15Wick12464.pdf.