Andrea Mascaretti

Postdoctoral Researcher at SISSA, Trieste

amascare [at] sissa [dot] it

About

I am a postdoctoral researcher at SISSA in Trieste, Italy. I obtained my Ph.D. in Statistical Sciences from the University of Padua and hold an M.Sc. in Mathematical Engineering from Politecnico di Milano. During my Ph.D., I spent six months at Rice University in Houston.

Outside of work, I enjoy swing dancing, creative writing, and reading.

Research Interests

My primary research focus is Bayesian models and algorithms for structured data. The common thread is that explicitly modelling the dependence between observations buys efficiency: narrower credible intervals, better imputation, and inference that remains reliable in high dimensions. My Ph.D. thesis developed this idea in two directions. For envelope models, I studied mixtures of envelopes, posterior inference on the envelope dimension, and computationally cheaper relaxations of their geometric constraints. For trend filtering on graphs, I built shrinkage priors from the differential operators of the graph.

At SISSA, in Alessandro Laio's group, I study how to compare representations of the same inputs: two layers of a network, two models, two languages. Using the information imbalance, a measure of how well the neighbourhood structure of one representation predicts that of another, we have shown that the semantic information of large language models is spread across many tokens, concentrates in a set of central layers, and is systematically asymmetric across languages, modalities, and model scales. I am currently working on a probabilistic formulation of this problem, casting the alignment of metrics across representations as Bayesian inference.

Much of this work grows out of applied collaborations on input–output tables (the subject of my M.Sc. thesis), neurological signals, clinical prediction, and GPS mobility data. I am always interested in new collaborations.

Education & Positions

2024 – present

Postdoctoral Researcher, SISSA, Trieste
Semantic information in deep transformer models

2020 – 2024

Ph.D. in Statistical Sciences, University of Padua
Thesis: Bayesian Sparse Model for Complex Data

2022 – 2023

Visiting Ph.D. Student, Rice University, Houston

2019 – 2020

Research Fellow, Politecnico di Milano
SAFARI NJEMA: data-driven public transport development

2016 – 2019

M.Sc. in Mathematical Engineering, Politecnico di Milano
Erasmus exchange at Universidade Nova de Lisboa

2012 – 2015

B.Sc. in Management Engineering, Politecnico di Milano

Awards

2026 — ISBA 2026 Travel Award, International Society for Bayesian Analysis.
2022 — ISBA 2022 Travel Award, International Society for Bayesian Analysis.

Talks & Posters

2026 — Poster, ISBA World Meeting 2026.
2026 — Invited seminar, Data Science Seminar, SISSA, Trieste.
2025 — Contributed talk, “Selection accuracy and errors in sparse models with the horseshoe prior”, CLADAG-VOC 2025, Naples.
2025 — Poster, “An approach to identify the most semantically informative deep representations of text and images”, Youth in High Dimensions, ICTP, Trieste.
2023 — Poster, “Nonparametric mixture of envelope models”, Statistical Methods and Models for Complex Data, Padua.
2023 — Invited talk, “Bayesian Envelope Models”, Ph.D. Students Seminar, Rice University, Houston.
2022 — Contributed talk, “Bayesian Mixtures of Envelope Models”, 36th International Workshop on Statistical Modelling (IWSM), Trieste.
2022 — Poster, “Bayesian nonparametric mixtures of Bayesian envelope models”, ISBA World Meeting 2022, Montréal.
2022 — Contributed talk, “Construction of a proper prior for a Bayesian envelope model”, 51st Meeting of the Italian Statistical Society (SIS 2022), Caserta.

Software

mcts — Monte Carlo Tree Search sampler, parallel C++ (MPI) implementation.
moxier — An introduction to statistics with R: teaching package written during a research fellowship at Politecnico di Milano.

Supervised Students

Annette Dariose Diffo Mboudjiho, Diploma thesis, ICTP.
Andrea Borra, Unsupervised and supervised transport mode detection of GPS phone data: Milano area as a case study, Politecnico di Milano, 2022.
Elena Zazzetti, Accessibility maps from global positioning system data. Two case studies: Maputo and Milan, Politecnico di Milano, 2020.
Irene Azzini, Estimation and analysis of origin destination matrices from global positioning system data, Politecnico di Milano, 2020.
Artem Ugarov, Automatic transport mode detection based on global positioning system data. Analysis of informal mobility in the city of Maputo, Politecnico di Milano, 2020.

Theses

Ph.D. Thesis — Bayesian Sparse Model for Complex Data, University of Padua, 2024.
M.Sc. Thesis — Politecnico di Milano, 2019.