Project Description

Learning objectives

The course aims at:
• Introducing and exploring topics related to data-driven algorithms for the induction of knowledge from large-scale data collections;
• Presenting the major data models underlying Web search engines and Enterprise Search;
• Studying technologies and formalisms for the treatment of unstructured Web data through Artificial Intelligence and Natural Language Processing methods, and for the linguistic processing of texts and Social Web data;
• Introducing experimental practices in applications such as Semantic Document Management, Web Network Analysis and Opinion Mining.

Course content

9 CFU-program:
Section I: Machine Learning and Kernel-based Learning.
Supervised methods. Probabilistic and Generative Methods. Unsupervised Learning. Clustering. Semantic Similarity metrics. Agglomerative clustering methods. K-means.
Markov Models. Hidden Markov Models.
Kernel-based Learning. Polynomial and RBF Kernels. String Kernels. Tree kernels. Latent Semantic kernels. Semantic kernels. Applications.

Section II: Statistical Language Processing
Supervised Language Processing tools. HMM-based POS tagging. Named Entity Recognition. Statistical parsing. PCFGs: Charniak parser. Lexicalized Parsing Methods. Shallow Semantic Parsing: kernel-based semantic role labelling. Information Extraction.

Section III: Web Mining & Retrieval.
Ranking Models for the Web. Introduction to Social Network Analysis: rank, centrality.
Random walk models: PageRank. Web Search Engines. SEO. Google.
Preference Learning for IR.
Question Answering Systems. Open-domain Information Extraction.
Knowledge Acquisition from Wikipedia. Social Web. Graph-based algorithms for community detection.
Introduction to Opinion Mining and Sentiment Analysis.

Reference Bibliography

• Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008. (Available online.)
• Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
• Roberto Basili, Alessandro Moschitti, Text Categorization: from Information Retrieval to Support Vector Learning, ARACNE Editore, 2005.
• Bing Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd Edition, Springer, July 2011.
• Further teaching material (e.g. notes and scientific papers) distributed during the course.

6 CFU-program:
Section I: Machine Learning and Kernel-based Learning
Overview of Supervised Learning Methods. Probabilistic methods. Generative Methods. Unsupervised Methods. Clustering. Semantic Similarity metrics. Agglomerative Clustering. K-means.
Markov Models. Hidden Markov Models.
Kernel-based Learning. Linear, Polynomial and RBF Kernels. String Kernels. Tree kernels. Latent Semantic kernels. Semantic kernels. Applications.

Section II: Web Mining & Retrieval.
Document Ranking Methods for the Web. Introduction to Social Network Analysis: rank and centrality.
Random walk models: PageRank. Search Engines. SEO. Google.
Question Answering Systems. Open-domain Information Extraction.
Knowledge Acquisition from open sources. Wikipedia. Social Web. Graph Algorithms for community detection.
Introduction to Opinion Mining and Sentiment Analysis.

60 or 90 hours
Master's Degree