I am Cornelius Fritz, an assistant professor at the School of Computer Science and Statistics at Trinity College Dublin. Before that, I was a postdoc at Penn State, working with Michael Schweinberger and David Hunter on network models under local dependence. I obtained my Ph.D. in statistics under the supervision of Göran Kauermann.
Smart devices collecting interpersonal data surround us at every move and facilitate novel ways of measuring and understanding social behavior. The collected data provide planetary-scale views of online interpersonal relations, allowing a more nuanced look at bias in information diffusion, polarization, and echo chamber effects. In my research, I use statistics to learn from such network data to answer questions posed within the social sciences in uncertain and changing environments.
My research mainly originates from multidisciplinary collaborations with social scientists approaching me with data and questions revolving around networks. As a statistician, I operate in two worlds: the real world, which encompasses observed data with all its imperfections and substantive knowledge of the subject matter, and the model world, which is an artificial representation of the real world characterized by a stochastic model. I develop novel data analysis techniques by combining statistical and machine learning with substantive theory to bridge the gap between the real and model world.
If you have questions about my papers, want to discuss a research topic, or just want to get in touch, the best way to reach me is via email or Twitter.
Research
My research focuses on developing statistical methodologies and machine learning models for complex, dependent data structures, with a particular emphasis on network science and social systems. My main research directions include:
- Dynamic Networks & Relational Event Models: Developing stochastic frameworks (such as tie-oriented relational and durational event models) to capture time-stamped, event-based social interactions, online behaviors, and international state relations.
- Joint Models for Attributes and Networks: Developing statistical methods and frameworks to jointly model network structure alongside node-level attributes, capturing selection and peer influence dynamics.
- Scalable Network Models: Designing estimation algorithms and diagnostic tools for large-scale networks, specifically hierarchical exponential-family random graph models (ERGMs) and signed ERGMs (SERGMs) for networks with positive and negative ties.
- Applied Spatio-temporal Modeling: Collaborating with interdisciplinary teams to analyze epidemiological data (e.g., spatio-temporal disease spread forecasting) and political science data (e.g., intrastate conflict forecasting and aircraft trade).
- Fritz, C., & Schweinberger, M. (2026). R Package iglm: Regression under Interference in Connected Populations (Issue arXiv:2604.22791). arXiv. https://doi.org/10.48550/arXiv.2604.22791
@misc{iglm, title = {R Package iglm: Regression under Interference in Connected Populations}, author = {Fritz, Cornelius and Schweinberger, Michael}, year = {2026}, number = {arXiv:2604.22791}, eprint = {2604.22791}, doi = {10.48550/arXiv.2604.22791}, publisher = {arXiv}, keywords = {preprint}, archiveprefix = {arXiv} }
We introduce R package iglm, which implements a comprehensive framework for studying relationships among predictors and outcomes under interference. The implemented regression framework facilitates the study of spillover and other phenomena in connected populations and has important advantages over existing packages, among them scalability and provable theoretical guarantees. On the computational side, the regression framework relies on scalable methods that can be applied to small and large data sets, by solving a convex optimization program based on pseudo-likelihoods using Minorization-Maximization and Quasi-Newton algorithms. On the statistical side, the regression framework comes with provable theoretical guarantees. To increase the versatility of iglm, users can add custom-built model terms. We showcase iglm using two data sets, including hate speech on the social media platform X and communications among students.
- Fritz, C., Yuan, Y., & Schweinberger, M. (2025+). Scalable Sample-to-Population Estimation of Hyperbolic Space Models for Hypergraphs. arXiv. https://doi.org/10.48550/arXiv.2509.07031
@misc{fritzyuan2025, title = {Scalable Sample-to-Population Estimation of Hyperbolic Space Models for Hypergraphs}, author = {Fritz, Cornelius and Yuan, Yubai and Schweinberger, Michael}, year = {2025+}, doi = {10.48550/arXiv.2509.07031}, publisher = {arXiv}, keywords = {preprint}, archiveprefix = {arXiv} }
Hypergraphs are useful mathematical representations of overlapping and nested subsets of interacting units, including groups of genes or brain regions, economic cartels, political or military coalitions, and groups of products that are purchased together. Despite the vast range of applications, the statistical analysis of hypergraphs is challenging: there are many hyperedges of small and large sizes, and hyperedges can overlap or be nested. We develop a novel statistical approach to hypergraphs with overlapping and nested hyperedges of varying sizes and levels of sparsity, which is amenable to scalable sample-to-population estimation with non-asymptotic theoretical guarantees. Our contributions include introducing a probabilistic framework that embeds the units of a hypergraph in an unobserved hyperbolic space, capturing core-periphery structure along with local structure in hypergraphs, developing scalable manifold optimization algorithms for learning hyperbolic space models based on samples from a hypergraph, and demonstrating that the positions of units are identifiable (up to rotations) and providing non-asymptotic theoretical guarantees based on samples from a hypergraph. We apply this framework to detect core-periphery structure and proximity among U.S. politicians based on historical media reports.
- Schalberger, M., & Fritz, C. (2025). Scalable signed exponential random graph models under local dependence (Issue arXiv:2507.07660). arXiv. https://doi.org/10.48550/arXiv.2507.07660
@misc{schalberger2025, title = {Scalable {{signed exponential random graph models}} under {{local dependence}}}, author = {Schalberger, Marc and Fritz, Cornelius}, year = {2025}, number = {arXiv:2507.07660}, eprint = {2507.07660}, primaryclass = {cs}, publisher = {arXiv}, doi = {10.48550/arXiv.2507.07660}, keywords = {preprint}, archiveprefix = {arXiv} }
Traditional network analysis often relies on binary edges, but real-world relationships are frequently more complex, involving cooperation, neutrality, and conflict. While signed interaction analysis, especially in polarized social media debates, has gained traction, existing methods like Stochastic Block Models (SBM) and Exponential Family Random Graph Models (ERGM) face scalability challenges. Specifically, their reliance on homogeneity assumptions and global dependence becomes unrealistic as network sizes increase. To address these limitations, we propose a novel method that combines SBM and ERGM strengths while mitigating their individual weaknesses. It incorporates local dependence based on non-overlapping blocks, using a two-step approach: first, decomposing the network into sub-networks using an SBM approximation, and second, estimating parameters using ERGM methods. We validate this method on large synthetic networks and apply it to a signed Wikipedia network involving thousands of editors, finding patterns consistent with structural balance theory.
- Mehrl, M., Pfander, M., Winner, T., & Fritz, C. (2025). Guardians of the Regime: When and Why Autocrats Create Secret Police (Issue arXiv:2506.10194v1). arXiv. https://doi.org/10.48550/arXiv.2506.10194v1
@misc{mehrl2025, title = {Guardians of the Regime: When and Why Autocrats Create Secret Police}, author = {Mehrl, Marius and Pfander, Mila and Winner, Theresa and Fritz, Cornelius}, year = {2025}, number = {arXiv:2506.10194v1}, eprint = {2506.10194v1}, doi = {10.48550/arXiv.2506.10194v1}, publisher = {arXiv}, keywords = {preprint}, archiveprefix = {arXiv} }
The paper explores the institutional choices of autocrats regarding the formation of secret police, which exist in fewer than half of all autocratic country-years. We address the core question: Under what conditions do dictatorships establish secret police organizations? To do so, we develop a theoretical framework for potential predictors and utilize statistical variable selection techniques to analyze factors derived from literature on authoritarian survival and state security forces. We find that secret police are more likely to emerge when rulers face specific, structural, and regime-external threats, such as organized anti-system mobilization, protests, or international rivals. The emergence is also linked to observing successful regime-internal contestation in other countries, which signals similar risks at home. Beyond threats, rulers must possess the material resources and personalized power necessary to actually establish these organizations. The research contributes to the study of authoritarian politics by identifying the specific conditions, beyond mere repression needs, that lead autocrats to invest in secret police institutions.
- Fritz, C., Rastelli, R., Fop, M., & Caimo, A. (2025). Scalable Durational Event Models: Application to Physical and Digital Interactions (Issue arXiv:2504.00049). arXiv. https://doi.org/10.48550/arXiv.2504.00049
@misc{fritz2025, title = {Scalable Durational Event Models: Application to Physical and Digital Interactions}, author = {Fritz, Cornelius and Rastelli, Riccardo and Fop, Michael and Caimo, Alberto}, year = {2025}, number = {arXiv:2504.00049}, eprint = {2504.00049}, doi = {10.48550/arXiv.2504.00049}, publisher = {arXiv}, keywords = {preprint}, archiveprefix = {arXiv} }
Durable interactions are ubiquitous in social network analysis and are increasingly observed with precise time stamps. Phone and video calls, for example, are events to which a specific duration can be assigned. We term data encoding interactions with the start and end times "durational event data". Recent advances in data collection have enabled the observation of such data over extended periods of time and between large populations of actors. Methodologically, we propose the Durational Event Model, an extension of Relational Event Models that decouples the modeling of event incidence from event duration. Computationally, we derive a fast, memory-efficient, and exact block-coordinate ascent algorithm to facilitate large-scale inference. Theoretical complexity analysis and numerical simulations demonstrate computational superiority of this approach over traditional Newton-Raphson-based methods. We apply the model to physical and digital interactions among college students in Copenhagen. Our empirical findings reveal that past interactions are the main drivers of physical interactions, whereas digital interactions are more strongly influenced by friendship ties and prior dyadic contact.
- Fritz, C., Georg, C.-P., Mele, A., & Schweinberger, M. (2025). Vulnerability Webs: Systemic Risk in Software Networks. https://doi.org/10.48550/arXiv.2402.13375
@misc{software_dependence, title = {Vulnerability Webs: Systemic Risk in Software Networks}, author = {Fritz, Cornelius and Georg, Co-Pierre and Mele, Angelo and Schweinberger, Michael}, year = {2025}, note = {Under Review}, doi = {10.48550/arXiv.2402.13375}, keywords = {preprint} }
Software development frequently relies on code reuse to minimize costs, a practice that creates significant vulnerability risks through dependencies. We examine this phenomenon by analyzing 52,897 dependencies across 16,102 Python repositories using a strategic network formation model that accounts for both observable and unobservable heterogeneity. We highlight how dependency structures can lead to systemic economic shocks, citing high-profile incidents like Heartbleed and CrowdStrike. Through variational approximation of conditional distributions, our study demonstrates that the creation of dependencies generates negative externalities within the software ecosystem. When modeled as a contagion process, the propagation of vulnerabilities suggests that common protection heuristics are often ineffective. As a mitigation strategy, we identify AI-assisted coding as an effective alternative, as it enables developers to replace external dependencies with in-house code, thereby reducing systemic risk.
- Fritz, C., & Fuchs-Kreiß, A. (2026). A Counting Process View of Relational Event Models: Practical Asymptotics.
@misc{Fritz_Fuchs-Kreiss, title = {A Counting Process View of Relational Event Models: Practical Asymptotics}, author = {Fritz, Cornelius and Fuchs-Krei{\ss}, Alexander}, year = {2026}, note = {Working Paper}, keywords = {preprint} }
- Fritz, C., Schweinberger, M., Bhadra, S., & Hunter, D. R. (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, accepted. https://doi.org/10.48550/arXiv.2410.07555
@article{fritzRegressionFrameworkStudying2025, title = {A Regression Framework for Studying Relationships among Attributes under Network Interference}, author = {Fritz, Cornelius and Schweinberger, Michael and Bhadra, Subhankar and Hunter, David R.}, year = {2025}, number = {accepted}, journal = {Journal of the American Statistical Association}, doi = {10.48550/arXiv.2410.07555}, keywords = {publications} }
To understand how the interconnected and interdependent world of the twenty-first century operates and make model-based predictions, joint probability models for networks and interdependent outcomes are needed. We propose a comprehensive regression framework for networks and interdependent outcomes with multiple advantages, including interpretability, scalability, and provable theoretical guarantees. The regression framework can be used for studying relationships among attributes of connected units and captures complex dependencies among connections and attributes, while retaining the virtues of linear regression, logistic regression, and other regression models by being interpretable and widely applicable. On the computational side, we show that the regression framework is amenable to scalable statistical computing based on convex optimization of pseudo-likelihoods using minorization-maximization methods. On the theoretical side, we establish convergence rates for pseudo-likelihood estimators based on a single observation of dependent connections and attributes. We demonstrate the regression framework using simulations and an application to hate speech on the social media platform X.
- Espinosa-Rada, A., Lerner, J., & Fritz, C. (2025). Socio-cognitive networks between researchers: Investigating scientific dualities with the group-oriented relational hyperevent model. Social Networks, 83, 1–13. https://doi.org/10.1016/j.socnet.2025.04.005
@article{socio_cognitive_aes_jl_cf, title = {Socio-cognitive networks between researchers: Investigating scientific dualities with the group-oriented relational hyperevent model}, author = {Espinosa-Rada, Alejandro and Lerner, J\"urgen and Fritz, Cornelius}, year = {2025}, volume = {83}, pages = {1-13}, doi = {10.1016/j.socnet.2025.04.005}, journal = {Social Networks}, keywords = {preprint} }
Understanding why researchers cite certain works remains a key question in the study of scientific networks. Prior research has identified factors such as relevance, group cohesion, and source crediting. However, the interplay between cognitive and social dimensions in citation behavior – often conceptualized as a socio-cognitive network – is frequently overlooked, particularly regarding the intermediary steps that lead to a citation. Since a citation first requires a work to be published by a set of authors, we examine how the structure of coauthorship networks influences citation patterns. To investigate this relationship, we analyze the citation and collaboration behavior of Chilean astronomers from 2013 to 2015 using the Group-Oriented Relational Hyperevent Model, which allows us to study coauthorship and citation networks in a joint framework. Our findings suggest that when selecting which works to cite, authors favor recent research and maintain cognitive continuity across cited works. At the same time, we observe that coherent groups – closely connected coauthors – tend to be co-cited more frequently in subsequent publications, reinforcing the interdependence of collaboration and citation networks.
- Hegre, H., ..., Chadefaux, T., Fritz, C., & ... (2025). The 2023/24 VIEWS Prediction Challenge: Predicting the Number of Fatalities in Armed Conflict, with Uncertainty. Journal of Peace Research, OnlineFirst. https://doi.org/10.1177/00223433241300862
@article{hegre2024202324viewspredictionchallenge, title = {The 2023/24 VIEWS Prediction Challenge: Predicting the Number of Fatalities in Armed Conflict, with Uncertainty}, author = {Hegre, Håvard and ... and Chadefaux, Thomas and Fritz, Cornelius and ...}, year = {2025}, journal = {Journal of Peace Research}, volume = {OnlineFirst}, keywords = {publications}, doi = {10.1177/00223433241300862} }
Governmental and nongovernmental organizations have increasingly relied on early-warning systems of conflict to support their decisionmaking. Predictions of war intensity as probability distributions prove closer to what policymakers need than point estimates, as they encompass useful representations of both the most likely outcome and the lower-probability risk that conflicts escalate catastrophically. Point-estimate predictions, by contrast, fail to represent the inherent uncertainty in the distribution of conflict fatalities. Yet, current early warning systems are preponderantly focused on providing point estimates, while efforts to forecast conflict fatalities as a probability distribution remain sparse. Building on the predecessor VIEWS competition, we organize a prediction challenge to encourage endeavours in this direction. We invite researchers across multiple disciplinary fields, from conflict studies to computer science, to forecast the number of fatalities in state-based armed conflicts, in the form of the UCDP ‘best’ estimates aggregated to two units of analysis (country-months and PRIO-GRID-months), with estimates of uncertainty. This article introduces the goal and motivation behind the prediction challenge, presents a set of evaluation metrics to assess the performance of the forecasting models, describes the benchmark models which the contributions are evaluated against, and summarizes the salient features of the submitted contributions.
- Fritz, C., Mehrl, M., Thurner, P. W., & Kauermann, G. (2025). Exponential Random Graph Models for Dynamic Signed Networks: An Application to International Relations. Political Analysis, in print. https://doi.org/10.48550/arXiv.2205.13411
@article{fritzmehrl_signed_2025, author = {Fritz, Cornelius and Mehrl, Marius and Thurner, Paul W. and Kauermann, Göran}, title = {Exponential Random Graph Models for Dynamic Signed Networks: An Application to International Relations}, doi = {10.48550/arXiv.2205.13411}, keywords = {publications}, volume = {in print}, journal = {Political Analysis}, year = {2025} }
Substantive research in the Social Sciences regularly investigates signed networks, where edges between actors are positive or negative. One often-studied example within International Relations for this type of network consists of countries that can cooperate with or fight against each other. These analyses often build on structural balance theory, one of the earliest and most prominent network theories. While the theorization and description of signed networks have made significant progress, the inferential study of link formation within them remains limited in the absence of appropriate statistical models. We fill this gap by proposing the Signed Exponential Random Graph Model (SERGM), extending the well-known Exponential Random Graph Model (ERGM) to networks where ties are not binary but positive or negative if a tie exists. Since most networks are dynamically evolving systems, we specify the model for both cross-sectional and dynamic networks. Based on hypotheses derived from structural balance theory, we formulate interpretable signed network statistics, capturing dynamics such as “the enemy of my enemy is my friend”. In our empirical application, we use the SERGM to analyze cooperation and conflict between countries within the international state system. We find evidence for structural balance in International Relations.
- Kook, L., Schiele, P., Kolb, C., Dold, D., Arpogaus, M., Fritz, C., Baumann, P., Kopper, P., Pielok, T., Dorigatti, E., & Rügamer, D. (2024). Can inverse conditional flows serve as a substitute for distributional regression model in statistics? Conference on Uncertainty in Artificial Intelligence (UAI), 2029–2046. https://doi.org/10.48550/arXiv.2405.05429
@article{neat, title = {Can inverse conditional flows serve as a substitute for distributional regression model in statistics?}, author = {Kook, Lucas and Schiele, Philipp and Kolb, Chris and Dold, Daniel and Arpogaus, Marcel and Fritz, Cornelius and Baumann, Philipp and Kopper, Philipp and Pielok, Tobias and Dorigatti, Emilio and R\"ugamer, David}, year = {2024}, pages = {2029-2046}, journal = {Conference on Uncertainty in Artificial Intelligence (UAI)}, keywords = {publications}, doi = {10.48550/arXiv.2405.05429} }
Neural network representations of simple models, such as linear regression, are being studied increasingly to better understand the underlying principles of deep learning algorithms. However, neural representations of distributional regression models, such as the Cox model, have received little attention so far. We close this gap by proposing a framework for distributional regression using inverse flow transformations (DRIFT), which includes neural representations of the aforementioned models. We empirically demonstrate that the neural representations of models in DRIFT can serve as a substitute for their classical statistical counterparts in several applications involving continuous, ordered, time-series, and survival outcomes. We confirm that models in DRIFT empirically match the performance of several statistical methods in terms of estimation of partial effects, prediction, and aleatoric uncertainty quantification. DRIFT covers both interpretable statistical models and flexible neural networks opening up new avenues in both statistical modeling and deep learning.
- De Nicola, G., Fritz, C., Mehrl, M., & Kauermann, G. (2023). Dependence matters: Statistical models to identify the drivers of tie formation in economic networks. Journal of Economic Behavior & Organization, 215, 351–363. https://doi.org/10.1016/j.jebo.2023.09.021
@article{de_nicola_statistical_2022, title = {Dependence matters: Statistical models to identify the drivers of tie formation in economic networks}, author = {De Nicola, Giacomo and Fritz, Cornelius and Mehrl, Marius and Kauermann, G{\"o}ran}, year = {2023}, volume = {215}, pages = {351-363}, journal = {Journal of Economic Behavior \& Organization}, doi = {10.1016/j.jebo.2023.09.021}, keywords = {publications} }
Networks are ubiquitous in economic research on organizations, trade, and many other areas. However, while economic theory extensively considers networks, no general framework for their empirical modeling has yet emerged. We thus introduce two different statistical models for this purpose – the Exponential Random Graph Model (ERGM) and the Additive and Multiplicative Effects network model (AME). Both model classes can account for network interdependencies between observations, but differ in how they do so. The ERGM allows one to explicitly specify and test the influence of particular network structures, making it a natural choice if one is substantively interested in estimating endogenous network effects. In contrast, AME captures these effects by introducing actor-specific latent variables affecting their propensity to form ties. This makes the latter a good choice if the researcher is interested in capturing the effect of exogenous covariates on tie formation without having a specific theory on the endogenous dependence structures at play. After introducing the two model classes, we showcase them through real-world applications to networks stemming from international arms trade and foreign exchange activity. We further provide full replication materials to facilitate the adoption of these methods in empirical economic research.
- Schweinberger, M., & Fritz, C. (2023). Discussion of “A tale of two datasets: Representativeness and generalisability of inference for samples of networks” by Pavel N. Krivitsky, Pietro Coletti, and Niel Hens. Journal of the American Statistical Association, (OnlineFirst), 1–5. https://doi.org/10.1080/01621459.2023.2223680
@article{fritz_discussion, title = {Discussion of “{A} tale of two datasets: {Representativeness} and generalisability of inference for samples of networks” by {P}avel {N}. {Krivitsky}, {P}ietro {C}oletti, and {N}iel {H}ens}, volume = {(OnlineFirst)}, doi = {10.1080/01621459.2023.2223680}, journal = {Journal of the American Statistical Association}, author = {Schweinberger, Michael and Fritz, Cornelius}, year = {2023}, keywords = {publications}, pages = {1--5} }
- Fritz, C., De Nicola, G., Kevorg, S., Harhoff, D., & Kauermann, G. (2023). Modelling the large and dynamically growing bipartite network of German patents and inventors. Journal of the Royal Statistical Society. Series A (Statistics in Society), 186(3), 557–576. https://doi.org/10.1093/jrsssa/qnad009
@article{fritz2023, author = {Fritz, Cornelius and De Nicola, Giacomo and Kevorg, Sevag and Harhoff, Dietmar and Kauermann, G{\"{o}}ran}, title = {Modelling the large and dynamically growing bipartite network of German patents and inventors}, volume = {186}, number = {3}, pages = {557–576}, doi = {10.1093/jrsssa/qnad009}, journal = {Journal of the Royal Statistical Society. Series A (Statistics in Society)}, keywords = {publications}, year = {2023} }
To explore the driving forces behind innovation, we analyse the dynamic bipartite network of all inventors and patents registered within the field of electrical engineering in Germany in the past two decades. To deal with the sheer size of the data, we decompose the network by exploiting the fact that most inventors tend to only stay active for a relatively short period. We thus propose a Temporal Exponential Random Graph Model with time-varying actor set and sufficient statistics mirroring substantial expectations for our analysis. Our results corroborate that inventor characteristics and team formation are essential to the dynamics of invention.
- Rügamer, D., Kolb, C., Fritz, C., Pfisterer, F., Bischl, B., Shen, R., Bukas, C., de Andrade e Sousa, L. B., Thalmeier, D., Baumann, P., Klein, N., & Müller, C. L. (2023). deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression. Journal of Statistical Software, 105(2), 1–31. https://doi.org/10.18637/jss.v105.i02
@article{ruegamer2023, title = {deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression}, author = {Rügamer, David and Kolb, Chris and Fritz, Cornelius and Pfisterer, Florian and Bischl, Bernd and Shen, Ruolin and Bukas, Christina and de Andrade e Sousa, Lisa Barros and Thalmeier, Dominik and Baumann, Philipp and Klein, Nadja and Müller, Christian L.}, year = {2023}, journal = {Journal of Statistical Software}, volume = {105}, number = {2}, pages = {1–31}, doi = {10.18637/jss.v105.i02}, keywords = {publications} }
In this paper we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library TensorFlow for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows users to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as mgcv. The package’s modular design and functionality provide a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.
- Fritz, C., Mehrl, M., Thurner, P. W., & Kauermann, G. (2023). All that Glitters is not Gold: Relational Events Models with Spurious Events. Network Science, 11(SI 2). https://doi.org/10.48550/arXiv.2109.10348
@article{fritzmehrl2023, title = {All that Glitters is not Gold: Relational Events Models with Spurious Events}, author = {Fritz, Cornelius and Mehrl, Marius and Thurner, Paul W. and Kauermann, Göran}, year = {2023}, journal = {Network Science}, doi = {10.48550/arXiv.2109.10348}, volume = {11}, number = {SI 2}, keywords = {publications} }
As relational event models are an increasingly popular model for studying relational structures, the reliability of large-scale event data collection becomes more and more important. Automated or human-coded events often suffer from non-negligible false-discovery rates in event identification. Moreover, most sensor data is primarily based on actors’ spatial proximity for predefined time windows; hence, the observed events could relate either to a social relationship or random co-location. Both examples imply spurious events that may bias estimates and inference. We propose the Relational Event Model for Spurious Events (REMSE), an extension to existing approaches for interaction data. The model provides a flexible solution for modeling data while controlling for spurious events. Estimation of our model is carried out in an empirical Bayesian approach via data augmentation. Based on a simulation study, we investigate the properties of the estimation procedure. To demonstrate its usefulness in two distinct applications, we apply this model to combat events from the Syrian civil war and to student co-location data. Results from the simulation and the applications identify the REMSE as a suitable approach to modeling relational event data in the presence of spurious events.
- Fritz, C., De Nicola, G., Rave, M., Weigert, M., Berger, U., Küchenhoff, H., & Kauermann, G. (2022). Statistical modelling of COVID-19 data: Putting Generalised Additive Models to work. Statistical Modelling, (OnlineFirst). https://doi.org/10.1177/1471082X221124628
@article{fritz2022, author = {Fritz, Cornelius and De Nicola, Giacomo and Rave, Martje and Weigert, Maximilian and Berger, Ursula and Küchenhoff, Helmut and Kauermann, G{\"{o}}ran}, title = {Statistical modelling of COVID-19 data: Putting Generalised Additive Models to work}, keywords = {publications}, year = {2022}, journal = {Statistical Modelling}, volume = {(OnlineFirst)}, doi = {10.1177/1471082X221124628} }
Over the course of the COVID-19 pandemic, Generalized Additive Models (GAMs) have been successfully employed on numerous occasions to obtain vital data-driven insights. In this article we further substantiate the success story of GAMs, demonstrating their flexibility by focusing on three relevant pandemic-related issues. First, we examine the interdependency among infections in different age groups, concentrating on school children. In this context, we derive the setting under which parameter estimates are independent of the (unknown) case-detection ratio, which plays an important role in COVID-19 surveillance data. Second, we model the incidence of hospitalizations, for which data is only available with a temporal delay. We illustrate how correcting for this reporting delay through a nowcasting procedure can be naturally incorporated into the GAM framework as an offset term. Third, we propose a multinomial model for the weekly occupancy of intensive care units (ICU), where we distinguish between the number of COVID-19 patients, other patients and vacant beds. With these three examples, we aim to showcase the practical and ‘off-the-shelf’ applicability of GAMs to gain new insights from real-world data.
- Fritz, C., De Nicola, G., Günther, F., Rügamer, D., Rave, M., Schneble, M., Bender, A., Weigert, M., Brinks, R., Hoyer, A., Berger, U., Küchenhoff, H., & Kauermann, G. (2022). Challenges in Interpreting Epidemiological Surveillance Data - Experiences from Germany. Journal of Computational and Graphical Statistics, 3.
@article{fritzetal2022, title = {Challenges in Interpreting Epidemiological Surveillance Data - Experiences from Germany}, author = {Fritz, Cornelius and De Nicola, G. and Günther, F. and Rügamer, D. and Rave, M. and Schneble, M. and Bender, A. and Weigert, M. and Brinks, R. and Hoyer, A. and Berger, U. and Küchenhoff, H. and Kauermann, G.}, year = {2022}, volume = {3}, journal = {Journal of Computational and Graphical Statistics}, keywords = {publications} }
- Fritz, C., Dorigatti, E., & Rügamer, D. (2022). Combining Graph Neural Networks and Spatio-temporal Disease Models to Predict COVID-19 Cases in Germany. Scientific Reports, 3930(12), 1–18. https://doi.org/10.1038/s41598-022-07757-5
@article{fritz2021, title = {Combining Graph Neural Networks and Spatio-temporal Disease Models to Predict COVID-19 Cases in Germany}, author = {Fritz, Cornelius and Dorigatti, Emilio and Rügamer, David}, year = {2022}, pages = {1-18}, volume = {3930}, number = {12}, journal = {Scientific Reports}, doi = {10.1038/s41598-022-07757-5}, keywords = {publications} }
During 2020, the infection rate of COVID-19 has been investigated by many scholars from different research fields. In this context, reliable and interpretable forecasts of disease incidence are a vital tool for policymakers to manage healthcare resources. Several experts have also called for accounting for human mobility to explain the spread of COVID-19. Existing approaches often apply standard models of the respective research field, frequently restricting modeling possibilities. For instance, most statistical or epidemiological models cannot directly incorporate unstructured data sources, including relational data that may encode human mobility. In contrast, machine learning approaches may yield better predictions by exploiting these data structures yet lack intuitive interpretability as they are often categorized as black-box models. We propose a combination of both research directions and present a multimodal learning framework that amalgamates statistical regression and machine learning models for predicting local COVID-19 cases in Germany. Results and implications: the novel approach introduced enables the use of a richer collection of data types, including mobility flows and colocation probabilities, and yields the lowest mean squared error scores throughout the observational period in the reported benchmark study. The results corroborate that during most of the observational period more dispersed meeting patterns and a lower percentage of people staying put are associated with higher infection rates. Moreover, the analysis underpins the necessity of including mobility data and showcases the flexibility and interpretability of the proposed approach.
- Berger, U., Fritz, C., & Kauermann, G. (2022). Reihentestungen an Schulen können die Dunkelziffer von COVID-19 Infektionen unter Schülern signifikant senken. Das Gesundheitswesen. https://doi.org/10.1055/a-1813-9778
@article{Berger, title = {Reihentestungen an Schulen können die Dunkelziffer von COVID-19 Infektionen unter Schülern signifikant senken}, author = {Berger, Ursula and Fritz, Cornelius and Kauermann, Göran}, doi = {10.1055/a-1813-9778}, year = {2022}, journal = {Das Gesundheitswesen}, keywords = {publications} }
Aim of the study: The aim of this study was to investigate the effectiveness of mandatory COVID-19 tests for in-classroom teaching in reopened schools as a containment measure in the pandemic. In Bavaria, mandatory testing at schools was implemented directly after the Easter vacations in 2021. For the first week after the vacations, this resulted in a natural experiment that allowed us to quantify the impact of the new testing strategy on reported COVID-19 cases. Methods: We compared changes in the reported 7-day incidence of new infections between districts with in-classroom teaching at school and districts with closed schools. During calendar week 15, districts with reported incidences below 100 were allowed to reopen schools and have in-classroom teaching if mandatory COVID-19 testing was performed at school with rapid antigen tests. We do not have data on the rapid test results; however, positive test results in the rapid antigen test were verified by a PCR test, and cases of positive PCR test results were reported at the district level by age groups. In calendar weeks 13 and 14, all schools in Bavaria were closed due to Easter vacations. Taking into account a latency period of about 3–4 days and a reporting period of 1–2 days, this means that any additional increase in reported incidences for districts with in-class teaching and mandatory testing in the week after the vacation cannot be attributed to transmissions at schools, but reflects the reduction of underreporting due to the newly implemented testing strategy. Results: Reported incidence increased by a factor of 6.6 for 5–11-year-old and by 1.7 for 12–20-year-old pupils in districts with in-classroom teaching and mandatory testing at schools. This increase was accompanied by a reduction in underreporting and was significant compared to districts with school closure. Given the situation of a natural experiment, this increase in the reported incidence among school children can be attributed to the testing strategy. For the same time period, no differences in reported incidences were found for the other age groups. Conclusion: In-class teaching with mandatory testing in reopened schools changes the role of schools in the pandemic. Our analyses show that reopening schools with a mandatory testing approach is beneficial from an epidemiologic perspective as it can strongly reduce the dark figure of COVID-19 cases among children.
- Fritz, C., & Kauermann, G. (2022). On the Interplay of Regional Mobility, Social Connectedness, and the Spread of COVID-19 in Germany. Journal of the Royal Statistical Society. Series A (Statistics in Society), 185(1). https://doi.org/10.1111/rssa.12753
@article{Fritz2020, author = {Fritz, Cornelius and Kauermann, G{\"{o}}ran}, journal = {Journal of the Royal Statistical Society. Series A (Statistics in Society)}, volume = {185}, number = {1}, keywords = {publications}, title = {On the Interplay of Regional Mobility, Social Connectedness, and the Spread of COVID-19 in Germany}, doi = {10.1111/rssa.12753}, year = {2022} }
Since the primary mode of respiratory virus transmission is person-to-person interaction, we are required to reconsider physical interaction patterns to mitigate the number of people infected with COVID-19. While research has shown that non-pharmaceutical interventions (NPI) had an evident impact on national mobility patterns, we investigate the relative regional mobility behaviour to assess the effect of human movement on the spread of COVID-19. In particular, we explore the impact of human mobility and social connectivity derived from Facebook activities on the weekly rate of new infections in Germany between March 3rd and June 22nd, 2020. Our results confirm that reduced social activity lowers the infection rate, accounting for regional and temporal patterns. The extent of social distancing, quantified by the percentage of people staying put within a federal administrative district, has an overall negative effect on the incidence of infections. Additionally, our results show spatial infection patterns based on geographic as well as social distances.
- Fritz, C., Mehrl, M., Thurner, P. W., & Kauermann, G. (2022). The Role of Governmental Weapons Procurements in Forecasting Monthly Fatalities in Intrastate Conflicts: A Semiparametric Hierarchical Hurdle Model. International Interactions, 8(4), 778–799. https://doi.org/10.1080/03050629.2022.1993210
@article{FritzMehrl2022, author = {Fritz, Cornelius and Mehrl, Marius and Thurner, Paul W. and Kauermann, G{\"{o}}ran}, journal = {International Interactions}, number = {4}, title = {The Role of Governmental Weapons Procurements in Forecasting Monthly Fatalities in Intrastate Conflicts: A Semiparametric Hierarchical Hurdle Model}, doi = {10.1080/03050629.2022.1993210}, year = {2022}, volume = {8}, pages = {778-799}, keywords = {publications} }
Accurate and interpretable forecasting models predicting spatially and temporally fine-grained changes in the numbers of intrastate conflict casualties are of crucial importance for policymakers and international non-governmental organizations (NGOs). Using a count data approach, we propose a hierarchical hurdle regression model to address the corresponding prediction challenge at the monthly PRIO-grid level. More precisely, we model the intensity of local armed conflict at a specific point in time as a three-stage process. Stages one and two of our approach estimate whether we will observe any casualties at the country- and grid-cell-level, respectively, while stage three applies a regression model for truncated data to predict the number of such fatalities conditional upon the previous two stages. Within this modeling framework, we focus on the role of governmental arms imports as a processual factor allowing governments to intensify or deter from fighting. We further argue that a grid cell’s geographic remoteness is bound to moderate the effects of these military buildups. Out-of-sample predictions corroborate the effectiveness of our parsimonious and theory-driven model, which enables full transparency combined with accuracy in the forecasting process.
- Fritz, C., Thurner, P. W., & Kauermann, G. (2021). Separable and Semiparametric Network-based Counting Processes applied to the International Combat Aircraft Trades. Network Science, 9(3), 291–311. https://doi.org/10.1017/nws.2021.9
@article{Fritz2021a, author = {Fritz, Cornelius and Thurner, Paul W. and Kauermann, G{\"{o}}ran}, doi = {10.1017/nws.2021.9}, journal = {Network Science}, keywords = {publications}, number = {3}, pages = {291--311}, title = {{Separable and Semiparametric Network-based Counting Processes applied to the International Combat Aircraft Trades}}, volume = {9}, year = {2021} }
We propose a novel tie-oriented model for longitudinal event network data. The generating mechanism is assumed to be a multivariate Poisson process that governs the onset and repetition of yearly observed events with two separate intensity functions. We apply the model to a network obtained from the yearly dyadic number of international deliveries of combat aircraft trades between 1950 and 2017. Based on the trade gravity approach, we identify economic and political factors impeding or promoting the number of transfers. Extensive dynamics as well as country heterogeneities require the specification of semiparametric time-varying effects as well as random effects. Our findings reveal strong heterogeneous as well as time-varying effects of endogenous and exogenous covariates on the onset and repetition of aircraft trade events.
- Baumann, S. A., Fritz, C., & Mueller, R. S. (2020). Food antigen-specific IgE in dogs with suspected food hypersensitivity. Tierärztliche Praxis. Ausgabe K, Kleintiere/Heimtiere, 48(6), 395–402. https://doi.org/10.1055/A-1274-9210/ID/R12749210-0044
@article{Baumann2020, author = {Baumann, Sandra A. and Fritz, Cornelius and Mueller, Ralf S.}, doi = {10.1055/A-1274-9210/ID/R12749210-0044}, journal = {Tierärztliche Praxis. Ausgabe K, Kleintiere/Heimtiere}, number = {6}, pages = {395--402}, title = {{Food antigen-specific IgE in dogs with suspected food hypersensitivity}}, volume = {48}, year = {2020}, keywords = {publications} }
Objective: Knowledge of cross-reactions in food-sensitive dogs will influence the choice of elimination diets and the long-term management of those patients. The objective of this study was to evaluate food allergen-specific IgE tests of suspected allergic dogs for concurrent positive reactions as possible evidence for cross-reactions between allergens. Material and methods: Results of serum IgE tests from 760 suspected allergic dogs submitted to 2 laboratories were evaluated statistically. After the tested allergens were grouped by their phylogenetic relationship, odds ratios were calculated, along with a sensitivity analysis of the odds ratios, to evaluate whether concurrent positive IgE results to 2 allergens occurred more often than expected. Results: Within related allergen pairs, 27% (laboratory 1) and 72% (laboratory 2) of the pairs could be considered associated. For the unrelated allergen pairs, only 6.8% and 10.6% of the analyzed pairs were considered associated, respectively. Strong correlations were shown in the group of ruminant allergens, especially beef and lamb, and grain allergens. High rates of concurrent reactions were also detected in the poultry group, especially between chicken and duck, as well as between pork and ruminant allergens, and soy and grain allergens. Conclusion: As our results showed correlations not only within related but also between non-related allergens, the possible relevance of carbohydrate moieties as well as panallergens for canine hypersensitivities warrants further study. Further investigations are necessary to distinguish co-sensitization from cross-reactions and determine the clinical relevance of food-specific IgE reactivity. Clinical relevance: Due to possible cross-reactivity, related allergens, especially beef and lamb as well as grain allergens, should not be used for an elimination diet to avoid false results.
- Fritz, C., Lebacher, M., & Kauermann, G. (2020). Tempus Volat, Hora Fugit: A survey of Tie‐oriented Dynamic Network Models in Discrete and Continuous Time. Statistica Neerlandica, 74(3), 275–299. https://doi.org/10.1111/stan.12198
@article{Fritz2019, author = {Fritz, Cornelius and Lebacher, Michael and Kauermann, G{\"{o}}ran}, doi = {10.1111/stan.12198}, journal = {Statistica Neerlandica}, keywords = {publications}, number = {3}, pages = {275--299}, title = {{Tempus Volat, Hora Fugit: A survey of Tie‐oriented Dynamic Network Models in Discrete and Continuous Time}}, volume = {74}, year = {2020} }
Given the growing number of available tools for modeling dynamic networks, the choice of a suitable model becomes central. The goal of this survey is to provide an overview of tie-oriented dynamic network models. The survey is focused on introducing binary network models with their corresponding assumptions, advantages, and shortfalls. The models are divided according to generating processes, operating in discrete and continuous time. First, we introduce the Temporal Exponential Random Graph Model (TERGM) and the Separable TERGM (STERGM), both being time-discrete models. These models are then contrasted with continuous process models, focusing on the Relational Event Model (REM). We additionally show how the REM can handle time-clustered observations, i.e., continuous time data observed at discrete time points. Besides the discussion of theoretical properties and fitting procedures, we specifically focus on the application of the models to two networks that represent international arms transfers and email exchange. The data allow us to demonstrate the applicability and interpretation of the network models.
- Statistical Approaches to Dynamic Networks in Society. Dissertation, LMU Munich
- Dynamic Social Network Models for Time-Stamped Data. Master Thesis, LMU Munich
- Explorative Datenvisualisierung mit Shiny in R. Bachelor Thesis, LMU Munich
Group
Our research group focuses on statistical modeling, network analysis, and dependent data.
PhD Students
Daniel Seussler (Trinity College Dublin)
Research: Statistical modeling of public health dynamics and healthcare accessibility using data from Madagascar.
Marc Schalberger (FU Berlin)
Research: Network science, signed exponential random graph models (SERGM), and scalable algorithms for large-scale network data.
Packages
bigergm
Fit, Simulate, and Diagnose Hierarchical Exponential-Family Models for Big Networks
Curriculum Vitae