Project Description

Learning objectives

To understand the principles and tools of information theory (including source coding, channel coding, and cryptography) and of statistical data mining.

Course content

Information Theory: random variables and processes, the concept of information, self-information, Shannon entropy, alternative entropy measures, relative entropies, Kullback–Leibler divergence, Jensen–Shannon divergence, conditional entropy, joint entropy, mutual information, total correlation, differential entropy, Markov chains.

Applications to Communications Systems: coding of discrete sources, first Shannon theorem, Kraft inequality, Huffman coding, discrete communication channels, channel capacity, error probability, Fano inequality, second Shannon theorem, elements of channel coding and cryptography.

Applications to Data Mining: basic concepts of data mining, definition of dataset and attribute, data types, multivariate analysis, basic statistical description of data, case studies, information-theoretic metrics in data mining tasks, data preparation, data cleaning, discretization of attributes, dimensionality reduction, association rules (unidimensional and multidimensional), classification algorithms (ID3, C4.5, Bayes), classification trees, anomaly detection, clustering, training and testing of algorithms, data visualization.

Computer experiments: introduction to Matlab, applications of information theory to communications systems, applications of data mining algorithms.

Textbooks

E. Cianca, M. De Sanctis, M. Ruggieri, “Information and Coding: Theory Overview, Design, Applications and Exercises”, ARACNE Editrice, 2007.
M. J. Zaki, W. Meira, “Data Mining and Analysis: Fundamental Concepts and Algorithms”, Cambridge University Press, 2014.
0 credits
60 hours
0 year
Master's Degree
0 semester