Much of the innovation in content mining that powers Nielsen BuzzMetrics’ products is conducted at its science and innovation labs in Israel.
The Nielsen BuzzMetrics staff includes members whose research expertise, academic credentials and product development efforts are rooted in machine learning, natural language processing and automated software solutions that gather, extract, analyze and understand information from text, including Web properties.
The science and innovation team's mandate involves developing, discovering, testing and implementing state-of-the-art methods in text mining, machine learning, natural language processing/computational linguistics, and advanced systems architecture. The methodologies selected for deployment in Nielsen BuzzMetrics’ offerings are scaled to deliver industrial-strength, advanced solutions capable of creating insights and intelligence from real-world, unstructured data and candid consumer conversations.
Research Publications
The following is a selection of research papers published by members of the Nielsen BuzzMetrics team and their collaborators:
M. Koppel, J. Schler
The Importance of Neutral Examples for Learning Sentiment
To appear in a special issue of the journal Computational Intelligence on sentiment analysis.
J. Schler, M. Koppel, S. Argamon, J. Pennebaker
Effects of Age and Gender on Blogging
To appear in AAAI Spring 2006 Symposia: Computational Approaches to Analysing Weblogs
M. Koppel, D. Mughaz and N. Akiva
New Methods for Attribution of Rabbinic Literature
Hebrew Linguistics: A Journal for Hebrew Descriptive, Computational and Applied Linguistics, to appear. 2006.
M. Koppel, N. Akiva and I. Dagan
Feature Instability as a Criterion for Selecting Potential Style Markers
Journal of the American Society for Information Science and Technology (JASIST). 2005.
M. Koppel, J. Schler, K. Zigdon
Automatically Determining an Anonymous Author's Native Language
In Proceedings of the IEEE International Conference on Intelligence and Security Informatics, ISI 2005, Atlanta, GA, USA. May 19-20, 2005.
S. Argamon, N. Akiva, A. Amir, and O. Kapah
Efficient Unsupervised Recursive Word Segmentation Using Minimum Description Length
20th International Conference on Computational Linguistics (COLING), August 2004, Geneva, Switzerland.
M. Koppel, N. Akiva and I. Dagan
A Corpus-Independent Feature Set for Style-Based Text Categorization
In Proceedings of IJCAI'03 Workshop on Computational Approaches to Style Analysis and Synthesis, Acapulco, Mexico. 2003.
M. Koppel, D. Mughaz and N. Akiva
CHAT: A System for Stylistic Classification of Hebrew-Aramaic Texts
In Proceedings of OTC-03 Third KDD Workshop on Operational Text Categorization, Washington D.C., 2003.
T. Tomokiyo and M. Hurst.
A Language Model Approach to Keyphrase Extraction.
In ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment. 2003.
C. Zhai, W. Cohen and J. Lafferty.
Beyond Independent Topical Relevance: Methods and Evaluation Metrics for Subtopic Retrieval.
In Proceedings of 26th Annual International ACM SIGIR Conference. 2003.
V. Boyapati, K. Chevrier, A. Finkel, N. S. Glance, T. Pierce, R. Stockton, and C. Whitmer.
ChangeDetector: A Site-Level Monitoring Tool for the WWW.
In Proceedings of the 11th International World Wide Web Conference, 2002.
W. Cohen, M. Hurst, and L. S. Jensen.
A Flexible Learning System for Wrapping Tables and Lists in HTML Documents.
In Proceedings of the Eleventh International World Wide Web Conference. 2002.
N. S. Glance.
Community Search Assistant.
In Proceedings of ACM 2001 International Conference on Intelligent User Interfaces. 2001.
N. S. Glance, J.-L. Meunier, P. Bernard, and D. Arregui.
Collaborative Document Monitoring.
In Proceedings of ACM 2001 International Conference on Supporting Group Work. 2001.
L. S. Jensen and W. Cohen.
Grouping Extracted Fields.
In Proceedings of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining. 2001.
R. Sukthankar and R. Stockton.
Argus: The Digital Doorman.
IEEE Intelligent Systems. 16(2). 2001.
M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, S. Slattery.
Learning to Construct Knowledge Bases from the World Wide Web.
Artificial Intelligence, 118(1-2). pp 69-114. 2000.
M. Hurst and T. Nasukawa.
Layout and Language: Integrating Spatial and Linguistic Knowledge for Layout Understanding Tasks.
In Proceedings of the 18th International Conference on Computational Linguistics. 2000.
A. McCallum, K. Nigam, J. Rennie, and K. Seymore.
Automating the Construction of Internet Portals with Machine Learning.
Information Retrieval. 3(2). pp. 127-163. 2000.
K. Nigam, A. McCallum, S. Thrun and T. Mitchell.
Text Classification from Labeled and Unlabeled Documents using EM.
Machine Learning, 39(2/3). pp. 103-134. 2000.
R. Stockton and R. Sukthankar.
Wavelet-based Kanji Character Recognition and Completion.
In Proceedings of the International Conference on Pattern Recognition. 2000.
H. Yu, T. Tomokiyo, Z. Wang and A. Waibel.
New Developments in Automatic Meeting Transcription.
In Proceedings of the Sixth International Conference on Spoken Language Processing. 2000.
N. S. Glance, D. Arregui and M. Dardenne.
Making Recommender Systems Work for Organizations.
In Proceedings of the Fourth International Conference and Exhibition on The Practical Application of Intelligent Agents and Multi-Agent Technology. 1999.
M. Siegler, M. Witbrock.
Improving the Suitability of Imperfect Transcriptions for Information Retrieval of Spoken Documents.
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. 1999.
R. Inder, M. Hurst and T. Kato.
A Prototype Agent to Assist Shoppers.
Computer Networks and ISDN Systems. 30, pp. 643-645. 1998.
M. Siegler, A. Berger, M. Witbrock, A. Hauptmann.
Spoken Document Retrieval at CMU.
In Proceedings of TREC-7, The Seventh Text Retrieval Conference. 1998.
W. Cohen.
Learning Rules that Classify E-Mail.
In The 1996 AAAI Spring Symposium on Machine Learning in Information Access. 1996.
M. Siegler, U. Jain, B. Raj, and R. Stern.
Automatic Segmentation, Classification and Clustering of Broadcast News Audio.
In Proceedings of the Ninth Spoken Language Systems Technology Workshop. 1996.
S. Douglas, M. Hurst and D. Quinn.
Using Natural Language Processing for Identifying and Interpreting Tables in Plain Text.
In Proceedings of the Fourth Annual Symposium on Document Analysis and Information Retrieval. 1995.