Project Description

Objectives

Understanding the principles and tools of information theory (including source coding, channel coding and cryptography) and statistical data mining.

Syllabus

Information Theory: random variables and processes, the concept of information, self-information, Shannon entropy, alternative entropy measures, relative entropies, Kullback–Leibler divergence, Jensen–Shannon divergence, conditional entropy, joint entropy, mutual information, total correlation, differential entropy, Markov chains.

Applications to Communication Systems: coding of discrete sources, first Shannon theorem, Kraft inequality, Huffman coding, discrete communication channels, channel capacity, error probability, Fano inequality, second Shannon theorem, elements of channel coding and cryptography.

Applications to Data Mining: basic concepts of data mining, definition of dataset and attribute, data types, multivariate analysis, basic statistical description of data, case studies, information-theoretic metrics in data mining tasks, data preparation, data cleaning, discretization of attributes, dimensionality reduction, association rules (unidimensional and multidimensional), classification algorithms (ID3, C4.5, Bayes), classification trees, anomaly detection, clustering, training and testing of algorithms, data visualization.

Computer Experiments: introduction to Matlab, applications of information theory to communication systems, applications of data mining algorithms.

Textbooks:
E. Cianca, M. De Sanctis, M. Ruggieri, “Information and Coding: Theory Overview, Design, Applications and Exercises”, ARACNE Editrice, 2007.
M. J. Zaki, W. Meira, “Data Mining and Analysis: Fundamental Concepts and Algorithms”, Cambridge University Press, 2014.
0 credits
60 lecture hours
Year 0
Master's degree (Laurea Magistrale)
Semester 0