Data Mining and Information Retrieval

The research activities conducted at the DM&IR Lab aim at the development of novel models, algorithms and data structures for the extraction and representation of knowledge and for the efficient management of information. Research topics include:

  • Data and Web Mining;
  • Explainable AI;
  • Mobility Data Science;
  • Distributed and Parallel Data-Intensive Algorithms.

Research Group

Collaborators

  • Francesco Busolin (Postdoc)
  • Alberto Veneri (Postdoc)
  • Federico Marcuzzi (INSAIT - Institute for Computer Science, Artificial Intelligence and Technology)
  • Giulia Rovinelli (PhD Student)
  • Ammara Zamir (PhD Student)

Website: https://sites.google.com/unive.it/dmir

Collaborations

Publications

  • Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini: QuickScorer: A Fast Algorithm to Rank Documents with Additive Ensembles of Regression Trees. SIGIR 2015: 73-82. (Best Paper) (ACM Notable Article)
  • Federico Marcuzzi, Claudio Lucchese, Salvatore Orlando: LambdaRank Gradients are Incoherent. CIKM 2023: 1777-1786
  • Claudio Lucchese, Salvatore Orlando, Raffaele Perego, Alberto Veneri: GAM Forest Explanation. EDBT 2023: 171-182
  • Travis Gagie, Gonzalo Navarro, Nicola Prezza: Fully functional suffix trees and optimal text searching in BWT-runs bounded space. Journal of the ACM (JACM) 67(1): 1-54 (2020). https://doi.org/10.1145/3375890
  • B. Brandoli, A. Raffaetà, M. Simeoni, P. Adibi, F. K. Bappee, F. Pranovi, G. Rovinelli, E. Russo, C. Silvestri, A. Soares, S. Matwin: From multiple aspect trajectories to predictive analysis: a case study on fishing vessels in the Northern Adriatic sea. GeoInformatica, pp. 1-29, March 2022
  • Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando: Treant: training evasion-aware decision trees. Data Min. Knowl. Discov. 34(5): 1390-1420 (2020)
  • Seyum Assefa Abebe, Claudio Lucchese, Salvatore Orlando: EiFFFeL: Enforcing Fairness in Forests by Flipping Leaves. ACM SAC 2021
  • Giulio Ermanno Pibiri, Rossano Venturini: Techniques for Inverted Index Compression. ACM Computing Surveys 53(6), Article 125, 36 pages (2021). https://doi.org/10.1145/3415148

Awards

  • 2015 - Best Paper at the ACM SIGIR Conference on Research and Development in Information Retrieval

Research projects

MASTER - Multiple aspect trajectories representation and analysis

MASTER (Multiple ASpects TrajEctoRy management and analysis, 2018-2022) is a Marie Sklodowska-Curie RISE (Research and Innovation Staff Exchange) project that involves 10 international partners and is intended to strengthen an international thematic network. The project is motivated by the growing number of applications, from mobile phone calls and social media to land, sea, and air surveillance systems, that produce massive amounts of spatio-temporal data about moving objects. It aims to develop methods for constructing, managing and analyzing holistic trajectories, i.e., sequences of spatio-temporal points enriched with semantic information from heterogeneous data sources such as social media, Linked Open Data, and knowledge bases. In these contexts, holistic trajectories make it possible, for example, to identify and monitor different types of tourist flows, to define customized itineraries based on tourists' interests, to acquire knowledge of fishing patterns in order to enforce fisheries management and conservation measures worldwide, to identify the routes of migrants, and to detect the presence of suspicious boats.

Website: http://www.master-project-h2020.eu/

Last update: 22/09/2025