Neural Representations of Natural Language by Lyndon White & Roberto Togneri & Wei Liu & Mohammed Bennamoun
Author: Lyndon White & Roberto Togneri & Wei Liu & Mohammed Bennamoun
Language: eng
Format: epub
ISBN: 9789811300622
Publisher: Springer Singapore
Most words do not co-occur
Some simple reasoning can account for this as a reasonable consequence of Zipf’s law (Zipf 1949) together with a prior based on the principle of indifference, but there is further depth to it, as explained by Ha et al. (2009).
The question is then: how are the negative samples to be found? One option would be to search the corpus deterministically for them, making sure never to select words that actually do co-occur. However, that would require enumerating the entire corpus. We can instead pick them randomly by sampling from the unigram distribution. Because most words in any given corpus do not co-occur, a randomly selected word will in all likelihood not be one that truly does co-occur. If it is, that small mistake will vanish as noise in the training, overcome by all the correct, truly negative samples.
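As a rough illustration of the random option, the following is a minimal sketch of drawing negative samples from a unigram distribution. The toy corpus, the function names unigram_distribution and sample_negatives, and the choice of five samples are assumptions made for this example only; they are not taken from the book.

```python
import random
from collections import Counter


def unigram_distribution(tokens):
    """Return the vocabulary and each word's relative frequency in tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab = sorted(counts)  # fixed order so vocab and probs stay aligned
    probs = [counts[word] / total for word in vocab]
    return vocab, probs


def sample_negatives(vocab, probs, k, rng):
    """Draw k words at random (with replacement), weighted by unigram frequency.

    No check is made that a drawn word never co-occurs with the target word;
    the occasional false negative is treated as noise.
    """
    return rng.choices(vocab, weights=probs, k=k)


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat while the dog slept on the rug".split()
vocab, probs = unigram_distribution(corpus)

rng = random.Random(0)  # seeded so repeated runs draw the same samples
negatives = sample_negatives(vocab, probs, k=5, rng=rng)
print(negatives)
```

Frequent words such as "the" are drawn more often than rare ones, which is what sampling from the unigram distribution implies. Nothing here enumerates the corpus for true non-co-occurrences; the sketch relies on the argument above that most random draws are genuine negatives.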
