Beyond Models

The Communication Networks Group (CN-Group), led by Tanja Zseby at the TU Wien Institute of Telecommunications, develops data analysis and machine-learning methods for network security. Its focus is on scalable anomaly detection and traffic analysis under real-world constraints such as encryption, streaming data, and large-scale monitoring. Its practical resources include Traffic Flow Mapping for visualisation, research on lightweight network features, algorithms such as the One-class Decision Tree Fuzzyfier and Sparse Data Observers, and shared datasets and repositories: the Network Traffic Analysis Database, Covert Timing Channel datasets, the Multidimensional Data Cluster Generator, Geometrical Optimum Index indices, and Internet Background Radiation (“darkspace”) captures.

2025-11-18

In an interview with the Center for Research Data Management at TU Wien, Félix Iglesias Vázquez discussed data-centric research and reproducible science within the Communication Networks Group. With a background spanning electrical engineering, data analysis, and machine learning, he develops methodologies and versatile algorithms for detecting anomalies in complex real-world datasets, particularly in network traffic, where privacy, security, and anonymisation constraints often limit access to high-quality, well-documented data.

A central theme of Iglesias’ work is the close alignment of theoretical development with practical application. He notes that anomaly-detection methods frequently fail when transferred across domains because they are built on assumptions that do not reflect the structure of real data. In many cases, anomalies are not isolated outliers but appear as dense clusters, novelties, or context-dependent patterns. This insight has led his group to broaden the concept of anomalies and to prioritise dataset relevance, labelling quality, and rich, findable metadata over incremental algorithmic refinement on synthetic benchmarks.
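The point about clustered anomalies can be illustrated with a small, self-contained sketch. It is a toy example on made-up 2D points, not Sparse Data Observers or any other method from the group; the helper names (`knn_distance_scores`, `top_indices`) and the data are assumptions introduced here. A detector that scores each point by the distance to its nearest neighbour implicitly assumes that anomalies are isolated, so it overlooks five anomalous points that sit close together. Scoring by the distance to the 10th neighbour, which looks past the size of the cluster, recovers them.

```python
import math


def knn_distance_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_indices(scores, n):
    """Return the indices of the n highest-scoring points."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n])


# Hypothetical data: "normal" points on a regular 10 x 10 grid ...
normal = [(float(x), float(y)) for x in range(10) for y in range(10)]
# ... and five anomalous points packed tightly together, far from the grid.
cluster = [(20.0, 20.0), (20.01, 20.0), (20.0, 20.01),
           (20.01, 20.01), (20.005, 20.005)]

points = normal + cluster
anomaly_ids = set(range(len(normal), len(points)))

# k = 1: each clustered anomaly has a very close neighbour, so it scores low.
flagged_k1 = top_indices(knn_distance_scores(points, 1), len(cluster))
# k = 10: the 10th neighbour of a clustered anomaly lies back on the grid.
flagged_k10 = top_indices(knn_distance_scores(points, 10), len(cluster))

print("k=1 finds anomalies: ", sorted(flagged_k1 & anomaly_ids))
print("k=10 finds anomalies:", sorted(flagged_k10 & anomaly_ids))
```

With `k=1` none of the five clustered points are among the top-ranked candidates, while with `k=10` all five are. The choice of `k` encodes an assumption about how large an anomalous group can be, which is exactly the kind of assumption that may not carry over from one dataset to another.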

To support reproducibility and open research, Iglesias advocates publishing code in forms that remain usable over time despite evolving software dependencies. He promotes Docker containerisation as a practical solution: pinned libraries and well-defined execution environments make experiments reproducible and independent of the underlying infrastructure. Alongside these practices, his group publishes robust and adaptable methods and tools designed to operate across domains, such as Sparse Data Observers for anomaly detection and Go-flows for extracting flow features from network traffic.
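As a rough illustration of what pinning libraries can look like in practice, the sketch below lists the exact versions of the packages installed in the current Python environment, using only the standard library. The function name `pinned_requirements` is hypothetical, and this is one possible approach under the assumption that the experiment runs in Python; it is not a description of the group's own tooling. The output can be saved as a requirements file and installed when building a container image, so that the image always uses the same library versions.

```python
from importlib import metadata


def pinned_requirements():
    """Return 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to a file, e.g. requirements.txt.
    print("\n".join(pinned_requirements()))
```

Recording exact versions rather than open ranges means that rebuilding the environment years later does not silently pull in newer, possibly incompatible releases.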

Looking ahead, Iglesias expresses cautious optimism about the use of artificial intelligence and large language models in data analysis, particularly as agents for testing and interpreting results in complex environments. At the same time, he warns that the greatest risk lies not in occasional failures but in systematic, unnoticed errors that can propagate across interconnected systems. For this reason, he emphasises the need for transparency, continuous monitoring, and human oversight from diverse perspectives, with the aim of embedding critical thinking and responsible practice into both research and education.