Scientific Presentations

AI Day 2023


Scientific talks – Kaleva hall (2nd floor)
10:30-11:30 Session 1: Learning I
12:30-13:30 Session 2: Learning II

Scientific talks – Palaver room (1st floor)
10:30-11:30 Session 3: Applications
12:30-13:30 Session 4: Search and Inference

Scientific talks – Takka room (1st floor)
10:30-11:30 Session 5: Brains, Body and Mind
12:30-13:30 Session 6: Language, Authorship and Ownership

Scientific posters – Sief/Capitolium hall (2nd floor)
13:30-15:15 Poster session

Scientific talks

Kaleva hall

Session 1: Learning I (10:30-11:30)

Yi Zhao (Aalto University): Simplified Temporal Consistency Reinforcement Learning

Zhao, Yi, Department of Electrical Engineering and Automation, Aalto University; Zhao, Wenshuai, Department of Electrical Engineering and Automation, Aalto University; Boney, Rinu, Department of Computer Science, Aalto University; Kannala, Juho, Department of Computer Science, Aalto University; Pajarinen, Joni, Department of Electrical Engineering and Automation, Aalto University

Reinforcement learning (RL) is able to solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this paper we show that, surprisingly, a simple representation learning approach relying only on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. This applies when using pure planning with a dynamics model conditioned on the representation, but also when utilizing the representation as policy and value function features in model-free RL. In experiments, our approach learns an accurate dynamics model to solve challenging high-dimensional locomotion tasks with online planners while being 4.1 times faster to train compared to ensemble-based methods. With model-free RL without planning, especially on high-dimensional tasks, such as the DeepMind Control Suite Humanoid and Dog tasks, our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4 times faster.

Riikka Huusari (Aalto University): Scalable variable selection for two-view learning tasks with projection operators

Szedmak, Sandor (Aalto University); Huusari, Riikka (Aalto University); Duong Le, Tat Hong (Aalto University); Rousu, Juho (Aalto University)

In this paper we propose a novel variable selection method for two-view settings, or for vector-valued supervised learning problems. Our framework is able to handle extremely large-scale selection tasks, where the number of data samples can reach millions. In a nutshell, our method performs variable selection by iteratively selecting variables that are highly correlated with the output variables, but which are not correlated with the previously chosen variables. To measure the correlation, our method uses the concept of projection operators and their algebra. With projection operators, the relationship (correlation) between sets of input and output variables can also be expressed by kernel functions, so nonlinear correlation models can be exploited as well. We experimentally validate our approach, showing on both synthetic and real data its scalability and the relevance of the selected features.
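The iterative selection idea in the abstract above can be illustrated with a small sketch. This is not the authors' algorithm: it is a purely linear stand-in that uses Gram-Schmidt orthogonalisation in place of the projection-operator and kernel machinery, and the function name `select_variables`, the tolerance `tol`, and the toy data are assumptions made for illustration only.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def center(v: Vector) -> Vector:
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def project_out(v: Vector, basis: List[Vector]) -> Vector:
    """Remove from v its components along the orthonormal vectors in basis."""
    for q in basis:
        coeff = dot(v, q)
        v = [x - coeff * y for x, y in zip(v, q)]
    return v


def select_variables(columns: List[Vector], outputs: List[Vector],
                     n_select: int, tol: float = 1e-9) -> List[int]:
    """Greedy selection (illustrative, linear only).

    At each step, pick the column whose part NOT explained by the already
    selected columns is most correlated with the output variables.
    """
    cols = [center(c) for c in columns]
    outs = [center(o) for o in outputs]
    basis: List[Vector] = []   # orthonormal basis of the selected columns
    selected: List[int] = []
    for _ in range(n_select):
        best_j, best_score, best_q = None, 0.0, None
        for j, col in enumerate(cols):
            if j in selected:
                continue
            residual = project_out(col, basis)
            norm = math.sqrt(dot(residual, residual))
            if norm < tol:      # fully explained by chosen columns: redundant
                continue
            q = [x / norm for x in residual]
            score = sum(dot(q, o) ** 2 for o in outs)
            if score > best_score:
                best_j, best_score, best_q = j, score, q
        if best_j is None:      # nothing informative left
            break
        selected.append(best_j)
        basis.append(best_q)
    return selected


# Hypothetical toy data: x1 duplicates x0, x3 is unrelated to the output.
x0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x1 = list(x0)
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
x3 = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
y = [2 * a + b for a, b in zip(x0, x2)]

chosen = select_variables([x0, x1, x2, x3], [y], n_select=2)
print(chosen)
```

In this toy run the duplicate column is skipped in the second round because nothing of it remains after projecting out the first choice, which is the behaviour the abstract describes as avoiding variables correlated with those already chosen.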

Bo Zhao (Aalto University): MSRL: Distributed Reinforcement Learning with Dataflow Fragments

Zhu, Huanzhou (Imperial College London); Zhao, Bo (Aalto University); Chen, Gang (Huawei); Chen, Weifeng (Huawei); Chen, Yijie (Huawei); Shi, Liang (Huawei); Yang, Yaodong (Peking University); Pietzuch, Peter (Imperial College London); Chen, Lei (Hong Kong University of Science and Technology)

A wide range of reinforcement learning (RL) algorithms have been proposed, in which agents learn from interactions with a simulated environment. Executing such RL training loops is computationally expensive, but current RL systems fail to support the training loops of different RL algorithms efficiently on GPU clusters: they either hard-code algorithm-specific strategies for parallelization and distribution; or they accelerate only parts of the computation on GPUs (e.g., DNN policy updates). We observe that current systems lack an abstraction that decouples the definition of an RL algorithm from its strategy for distributed execution. We describe MSRL, a distributed RL training system that uses the new abstraction of a fragmented dataflow graph (FDG) to execute RL algorithms in a flexible way. An FDG is a heterogeneous dataflow representation of an RL algorithm, which maps functions from the RL training loop to independent parallel dataflow fragments. Fragments account for the diverse nature of RL algorithms: each fragment can execute on a different device using its own low-level dataflow implementation, e.g., an operator graph of a DNN engine, a CUDA GPU kernel, or a multi-threaded CPU process. At deployment time, a distribution policy governs how fragments are mapped to devices, without changes to the algorithm implementation. Our experiments show that MSRL exposes trade-offs between different execution strategies, while surpassing the performance of existing RL systems.

Erik Schultheis (Aalto University): Towards Memory-Efficient Training for Extremely Large Output Spaces: Learning with 500k Labels on a Single Commodity GPU

Schultheis, Erik (Aalto University); Babbar, Rohit (Aalto University, University of Bath)

In classification problems with large output spaces (up to millions of labels), the weight matrix for the classification layer requires an enormous amount of memory. Using sparse connectivity drastically reduces the memory requirements, but can result in much diminished predictive performance of the model. Fortunately, we found that this can be mitigated by introducing an intermediate layer of intermediate size. We further demonstrate that one can constrain the connectivity of the sparse layer to be uniform, in the sense that each output neuron will have the exact same number of incoming connections, which allows for efficient implementations of sparse matrix multiplication and connection redistribution on a GPU.

 

Kaleva hall

Session 2: Learning II (12:30-13:30)

Marlon Tobaben (University of Helsinki & FCAI): On the Efficacy of Differentially Private Few-shot Image Classification

Tobaben, Marlon (University of Helsinki); Shysheya, Aliaksandra (University of Cambridge); Bronskill, John F (University of Cambridge); Paverd, Andrew (Microsoft); Tople, Shruti (Microsoft); Zanella-Beguelin, Santiago (Microsoft); Turner, Richard E (University of Cambridge); Honkela, Antti (University of Helsinki)

There has been significant recent progress in training differentially private (DP) models which achieve accuracy that approaches the best non-private models. These DP models are typically pretrained on large public datasets and then fine-tuned on private downstream datasets that are relatively large and similar in distribution to the pretraining data. However, in many applications including personalization and federated learning, it is crucial to perform well (i) in the few-shot setting, as obtaining large amounts of labeled data may be problematic; and (ii) on datasets from a wide variety of domains for use in various specialist settings. To understand under which conditions few-shot DP can be effective, we perform an exhaustive set of experiments that reveals how the accuracy and vulnerability to attack of few-shot DP image classification models are affected as the number of shots per class, privacy level, model architecture, downstream dataset, and subset of learnable parameters in the model vary. We show that to achieve DP accuracy on par with non-private models, the shots per class must be increased as the privacy level increases by as much as $20-35\times$ at $\epsilon=1$. We also show that learning parameter-efficient FiLM adapters under DP is competitive with and often superior to learning just the final classifier layer or learning all of the network parameters. Finally, we evaluate DP federated learning systems and establish state-of-the-art performance on the challenging FLAIR benchmark.

Hanlin Yu (University of Helsinki): Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

Yu, Hanlin (University of Helsinki); Hartmann, Marcelo (University of Helsinki); Williams, Bernardo (University of Helsinki); Klami, Arto (University of Helsinki)

Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. It has been observed that methods which include notions of differential geometry tend to perform better, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, existing methods often resort to simple diagonal metrics to remain computationally efficient, which loses some of the gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration but have only a minor computational overhead over diagonal metrics. We show that for fully connected neural networks (NNs) with sparsity-inducing priors and convolutional NNs with correlated priors, using these metrics can provide improvements. For some other choices the posterior is sufficiently easy also for the simpler metrics.

Antti Kuusisto (Tampere University): Short Boolean Formulas as Explanations in Practice

Jaakkola, Reijo (Tampere University); Janhunen, Tomi (Tampere University); Kuusisto, Antti (Tampere University); Rankooh, Masood F. (Tampere University); Vilander, Miikka (Tampere University)

One of the key challenges in modern AI relates to explainability and human interpretability of AI-based tools. While scientifically interesting, explainability is also a timely topic for societal reasons. For example, the General Data Protection Regulation (GDPR) of the European Union refers to the right of individuals to obtain explanations of decisions made about them by automated means. In the current work, we investigate explainability via short Boolean formulas in the data model based on unary relations. As an explanation of length k, we take a Boolean formula of length k that minimizes the error with respect to the target attribute to be explained. We provide novel quantitative bounds for the expected error in this scenario and also demonstrate how the setting works in practice by studying concrete data sets. For each data set, we find explanation formulas of different lengths using Answer Set Programming, stopping at the threshold where overfitting begins. The most accurate formulas achieve errors similar to other methods on the same data sets, including neural networks. The obtained formulas are short and immediately human interpretable.
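To make the notion of an error-minimizing short formula concrete, the sketch below brute-forces the best explanation on a tiny hypothetical data set. It makes several simplifying assumptions that are not from the abstract: only conjunctions of literals are searched (not arbitrary Boolean formulas), "length" is taken to be the number of literals, and plain enumeration replaces the Answer Set Programming encoding the authors use. The names `error` and `best_conjunction` are illustrative.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple

Literal = Tuple[int, bool]  # (attribute index, required truth value)


def error(formula: Sequence[Literal],
          rows: List[Sequence[bool]],
          target: List[bool]) -> float:
    """Fraction of rows on which the conjunction disagrees with the target."""
    wrong = 0
    for row, t in zip(rows, target):
        predicted = all(row[i] == v for i, v in formula)
        wrong += predicted != t
    return wrong / len(rows)


def best_conjunction(rows: List[Sequence[bool]],
                     target: List[bool],
                     k: int) -> Tuple[Tuple[Literal, ...], float]:
    """Return the conjunction with at most k literals that minimizes the error."""
    n_attrs = len(rows[0])
    # The empty conjunction is always true and serves as the baseline.
    best = ((), error((), rows, target))
    for size in range(1, k + 1):
        for attrs in combinations(range(n_attrs), size):
            for values in product([True, False], repeat=size):
                formula = tuple(zip(attrs, values))
                e = error(formula, rows, target)
                if e < best[1]:
                    best = (formula, e)
    return best


# Hypothetical data: three unary attributes per row; target = a0 AND NOT a2.
rows = [
    (True, True, False),
    (True, False, False),
    (True, True, True),
    (False, True, False),
    (False, False, True),
    (True, False, True),
    (False, False, False),
]
target = [r[0] and not r[2] for r in rows]

for k in (1, 2):
    formula, err = best_conjunction(rows, target, k)
    print(k, formula, round(err, 3))
```

Running the search for increasing k shows the trade-off the abstract refers to: longer formulas can only lower the training error, so some separate criterion (in the paper, the onset of overfitting) is needed to decide where to stop.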

Juha Karvanen (University of Jyväskylä): Clustering and Structural Robustness in Causal Diagrams

Tikka, Santtu; Helske, Jouni; Karvanen, Juha

Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical approach may become impractical, and the clarity of the representation is lost. Clustering of variables is a natural way to reduce the size of the causal diagram, but it may erroneously change the essential properties of the causal relations if implemented arbitrarily. We define a specific type of cluster, called transit cluster, that is guaranteed to preserve the identifiability properties of causal effects under certain conditions. We provide a sound and complete algorithm for finding all transit clusters in a given graph and demonstrate how clustering can simplify the identification of causal effects. We also study the inverse problem, where one starts with a clustered graph and looks for extended graphs where the identifiability properties of causal effects remain unchanged. We show that this kind of structural robustness is closely related to transit clusters.

 

Palaver room

Session 3: Applications (10:30-11:30)

Qing Liu (CMVS, University of Oulu): Many Birds, One Stone: General Medical Image Segmentation with Multiple Partially Labelled Datasets

Liu, Qing; Zeng, Hailong; Sun, Zhaodong; Li, Xiaobai; Zhao, Guoying; Liang, Yixiong

Farzeen Munir (Aalto University and FCAI): Radar-Lidar Fusion for Object Detection by Designing Effective Convolution Networks

Munir, Farzeen; Azam, Shoaib; Kucner, Tomasz; Kyrki, Ville; Jeon, Moongu

Object detection is a core component of perception systems, providing the ego vehicle with information about its surroundings to ensure safe route planning. While cameras and Lidar have significantly advanced perception systems, their performance can be limited in adverse weather conditions. In contrast, millimeter-wave technology enables radars to function effectively in such conditions. However, relying solely on radar for building a perception system doesn't fully capture the environment due to the data's sparse nature. To address this, sensor fusion strategies have been introduced. We propose a dual-branch framework to integrate radar and Lidar data for enhanced object detection. The primary branch focuses on extracting radar features, while the auxiliary branch extracts Lidar features. These are then combined using additive attention. Subsequently, the integrated features are processed through a novel Parallel Forked Structure (PFS) to manage scale variations. A region proposal head is then utilized for object detection. We evaluated the effectiveness of our proposed method on the Radiate dataset using COCO metrics. The results show that it surpasses state-of-the-art methods by 1.89% and 2.61% in favorable and adverse weather conditions, respectively. This underscores the value of radar-Lidar fusion in achieving precise object detection and localization, especially in challenging weather conditions.

Tuomo Hartonen (Institute for Molecular Medicine Finland, University of Helsinki): Nationwide health, socio-economic and genetic predictors of COVID-19 vaccination status in Finland

Tuomo Hartonen 1, Bradley Jermy 1, Hanna Sõnajalg 2, Pekka Vartiainen 1, Kristi Krebs 2, Andrius Vabalas 1; FinnGen; Estonian Biobank Research Team; Tuija Leino 3, Hanna Nohynek 3, Jonas Sivelä 3, Reedik Mägi 2, Mark Daly 1, 4, 5, 6, Hanna M Ollila 1, 4, 7, 8, Lili Milani 2, Markus Perola 3, Samuli Ripatti 1, 4, 5, 9, Andrea Ganna 10, 11, 12

Affiliations: 1 Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland. 2 Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia. 3 Finnish Institute for Health and Welfare, Helsinki, Finland. 4 Broad Institute of MIT and Harvard, Cambridge, MA, USA. 5 Massachusetts General Hospital, Cambridge, MA, USA. 6 Harvard Medical School, Cambridge, MA, USA. 7 Center of Genomic Medicine, Harvard Medical School, Boston, MA, USA. 8 Anesthesia, Critical Care, and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA. 9 Department of Public Health, University of Helsinki, Helsinki, Finland. 10 Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland. andrea.ganna@helsinki.fi. 11 Broad Institute of MIT and Harvard, Cambridge, MA, USA. 12 Massachusetts General Hospital, Cambridge, MA, USA.

Understanding factors associated with COVID-19 vaccination can highlight issues in public health systems. Using machine learning, we considered the effects of 2,890 health, socio-economic and demographic factors in the entire Finnish population aged 30-80 and genome-wide information from 273,765 individuals. The strongest predictors of vaccination status were labour income and medication purchase history. Mental health conditions and having unvaccinated first-degree relatives were associated with reduced vaccination. A prediction model combining all predictors achieved good discrimination (area under the receiver operating characteristic curve, 0.801; 95% confidence interval, 0.799-0.803). The 1% of individuals with the highest predicted risk of not vaccinating had an observed vaccination rate of 18.8%, compared with 90.3% in the study population. We identified eight genetic loci associated with vaccination uptake and derived a polygenic score, which was a weak predictor in an independent subset. Our results suggest that individuals at higher risk of suffering the worst consequences of COVID-19 are also less likely to vaccinate.

Nadia M. Ady (Aalto University): Interdisciplinary Methods in Computational Creativity: How Human Variables Shape Human-Inspired AI Research

Ady, Nadia M. (Aalto University); Rice, Faun (Independent)

The word creativity originally described a concept from human psychology, but in the realm of computational creativity (CC), it has become much more. The question of what creativity means when it is part of a computational system might be considered core to CC. Pinning down the meaning of creativity, and concepts like it, becomes salient when researchers port concepts from human psychology to computation, a widespread practice extending beyond CC into artificial intelligence (AI). Yet, the human processes shaping human-inspired computational systems have been little investigated. Starting with data from 22 in-depth, semi-structured interviews, this talk explores questions about which human literatures (social sciences, psychology, neuroscience) enter AI scholarship, how they are translated at the port of entry, and what that might mean for AI.

 

Palaver room

Session 4: Search and Inference (12:30-13:30)

Kalle Kujanpää (Aalto University): Hybrid Search for Efficient Planning with Completeness Guarantees

Kujanpää, Kalle (Aalto University); Pajarinen, Joni (Aalto University); Ilin, Alexander (Aalto University)

Solving complex planning problems has been a long-standing challenge in computer science. Learning-based subgoal search methods have shown promise in tackling these problems, but they often suffer from a lack of completeness, meaning that they may fail to find a solution even if one exists. In this paper, we propose an efficient approach to augment a subgoal search method to achieve completeness in discrete action spaces. Specifically, we augment the high-level search with low-level actions to execute a multilevel search, which we call complete subgoal search. This solution achieves the best of both worlds: the practical efficiency of high-level search and the completeness of low-level search. We apply the proposed search method to a recently proposed subgoal search algorithm and evaluate the algorithm trained on offline data on complex planning problems. We demonstrate that our complete subgoal search not only guarantees completeness but can even improve performance in terms of search expansions for instances that the high-level could solve without low-level augmentations. Our approach makes it possible to apply subgoal-level planning for systems where completeness is a critical requirement.

Christoph Jabs (University of Helsinki): MaxSAT-Based Bi-Objective Boolean Optimization

Jabs, Christoph (University of Helsinki); Berg, Jeremias (University of Helsinki); Niskanen, Andreas (University of Helsinki); Järvisalo, Matti (University of Helsinki)

We explore a maximum satisfiability (MaxSAT) based approach to bi-objective optimization. Bi-objective optimization refers to the task of finding so-called Pareto-optimal solutions in terms of two objective functions. Bi-objective optimization problems naturally arise in various real-world settings. For example, in the context of learning interpretable representations, such as decision rules, from data, one wishes to balance between two objectives, the classification error and the size of the representation. Our approach is generally applicable to bi-objective optimizations which allow for propositional encodings. The approach makes heavy use of incremental Boolean satisfiability (SAT) solving and draws inspiration from modern MaxSAT solving approaches. In particular, we describe several variants of the approach which arise from different approaches to MaxSAT solving. In addition to computing a single representative solution per each point of the Pareto front, the approach allows for enumerating all Pareto-optimal solutions. We empirically compare the efficiency of the approach to recent competing approaches, showing practical benefits of our approach in the contexts of learning interpretable classification rules and bi-objective set covering.
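As a small illustration of what Pareto-optimality means for two objectives, the sketch below filters a list of candidate solutions down to its Pareto front. It only demonstrates the definition on already-enumerated points; it does not reflect the MaxSAT-based search described in the abstract. The pairing of objectives as (classification error, representation size) follows the abstract's example, while the function name `pareto_front` and the candidate values are made up.

```python
from typing import List, Tuple

# (classification_error, representation_size); both are to be minimized.
Point = Tuple[int, int]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are minimized.

    A point is dominated if another point is at least as good in both
    objectives and strictly better in at least one.
    """
    front: List[Point] = []
    best_second = None
    # After sorting by the first objective (ties broken by the second),
    # a point is non-dominated exactly when it strictly improves the best
    # second-objective value seen so far.
    for p in sorted(set(points)):
        if best_second is None or p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


# Hypothetical candidate rule sets: (errors on the data, number of literals).
candidates = [(5, 1), (3, 2), (3, 4), (2, 5), (4, 3), (1, 9)]
print(pareto_front(candidates))
```

Here (3, 4) and (4, 3) are dropped because (3, 2) is at least as good on both objectives; the remaining points each represent a different trade-off between error and size.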

Aleksanteri Sladek (Aalto University): Encoding Negative Dependencies in Probabilistic Circuits

Sladek, Aleksanteri (Aalto University); Trapp, Martin (Aalto University); Solin, Arno (Aalto University)

Tractability is considered key to trustworthy decision-making under uncertainty, but it often comes at the expense of the ability to represent large families of probability distributions. Probabilistic circuits promise to remedy this by representing tractable yet expressive probabilistic models through hierarchical compositions of tractable distributions subject to certain structural and parameter constraints. A common parameter constraint enforced in these models is non-negativity of the weights, which has been shown by prior work to potentially hinder their expressive efficiency. In this work, we propose allowing for negative weights in probabilistic circuits by loosening the non-negativity constraint to a positive semidefinite constraint. We empirically show that probabilistic circuits with positive semidefinite parameterized nodes have increased expressive efficiency, whilst retaining tractability, and empirically outperform circuits with non-negative weight constraints.

Jaakko Peltonen (Tampere University): Fair Neighbor Embedding

Peltonen, Jaakko (Tampere University); Xu, Wen (Tampere University); Nummenmaa, Timo (Tampere University); Nummenmaa, Jyrki (Tampere University)

We consider fairness in dimensionality reduction (DR). Nonlinear DR yields low dimensional representations that let users visualize and explore high-dimensional data. However, traditional DR may yield biased visualizations overemphasizing relationships of societal phenomena to sensitive attributes or protected groups. We introduce a framework of fair neighbor embedding, the Fair Neighbor Retrieval Visualizer, formulating fair nonlinear DR as an information retrieval task with performance and fairness quantified by information retrieval criteria. The method optimizes low-dimensional embeddings that preserve high-dimensional data neighborhoods without biased association of such neighborhoods to protected groups. In experiments the method yields fair visualizations outperforming previous methods.

 

Takka room

Session 5: Brains, Body and Mind (10:30-11:30)

Daolang Huang (Aalto University): Practical Equivariances via Relational Conditional Neural Processes

Huang, Daolang (Aalto University); Haussmann, Manuel (Aalto University); Remes, Ulpu (University of Helsinki); John, ST (Aalto University); Clarté, Grégoire (University of Helsinki); Sebastian Luck, Kevin (Aalto University); Kaski, Samuel (Aalto University); Acerbi, Luigi (University of Helsinki)

Conditional Neural Processes (CNPs) are a class of metalearning models popular for combining the runtime efficiency of amortized inference with reliable uncertainty quantification. Many relevant machine learning tasks, such as spatio-temporal modeling, Bayesian Optimization and continuous control, contain equivariances, for example to translation, which the model can exploit for maximal performance. However, prior attempts to include equivariances in CNPs do not scale effectively beyond two input dimensions. In this work, we propose Relational Conditional Neural Processes (RCNPs), an effective approach to incorporate equivariances into any neural process model. Our proposed method extends the applicability and impact of equivariant neural processes to higher dimensions. We empirically demonstrate the competitive performance of RCNPs on a large array of tasks naturally containing equivariances.

Nitin Williams (Aalto University, Finland): Simulation-Based Inference in Human Brain Computational Models

Williams, Nitin (Aalto University, Finland); Ojanperä, Anttoni (Aalto University, Finland); Siebenhühner, Felix (Helsinki University, Finland); Toselli, Benedetta (University of Genoa, Italy); Palva, Satu (Helsinki University, Finland); Arnulfo, Gabriele (University of Genoa, Italy); Kaski, Samuel (Aalto University, Finland); Palva, J. Matias (Aalto University, Finland).

Large-scale brain networks might regulate the communication between brain regions fundamental to cognition, but how these networks are generated remains poorly understood. Biophysical Network Models (BNMs) comprising models of brain regions linked by neuroanatomically informed connections have been used to represent biologically plausible hypotheses on the generation of brain networks. Yet, the intractable likelihood of these BNMs renders standard likelihood-based methods inapplicable for fitting them to empirical data and for choosing between different BNMs based on empirical data. In this project, we used methods from Approximate Bayesian Computation (ABC) to fit and compare three BNMs with different biologically plausible assumptions on the role of inter-regional delays in generating large-scale brain networks. Fitting the BNMs with Bayesian Optimisation for Likelihood Free Inference (BOLFI) yielded reliably estimated posterior distributions, while the results of ABC model comparison suggested the BNM with distance-dependent conduction delays as most probable. The neuroscientific significance of our study is in identifying distance-dependent inter-regional conduction delays as likely to underlie the generation of large-scale brain networks. The technical significance of our study lies in the close integration between models and data achieved by employing ABC methods to compare BNMs representing biologically plausible hypotheses with empirical MEG data.

Haoyu Chen (University of Oulu): SMG: A Micro-gesture Dataset Towards Spontaneous Body Gestures for Emotional Stress State Analysis

Haoyu Chen (CMVS, University of Oulu, Finland), Henglin Shi (CMVS, University of Oulu, Finland), Xin Liu (Lappeenranta-Lahti University of Technology LUT, Finland), Xiaobai Li (Zhejiang University, China) & Guoying Zhao (CMVS, University of Oulu, Finland)

We explore using body gestures for hidden emotional state analysis. As an important form of non-verbal communication, human body gestures are capable of conveying emotional information during social communication. In previous works, efforts have focused mainly on facial expressions, speech, or expressive body gestures to interpret classical expressive emotions. In contrast, we focus on a specific group of body gestures, called micro-gestures (MGs), used in the psychology research field to interpret inner human feelings. MGs are subtle and spontaneous body movements that are proven, together with micro-expressions, to be more reliable than normal facial expressions for conveying hidden emotional information. In this work, a comprehensive study of MGs is presented from the computer vision aspect, including a novel spontaneous micro-gesture (SMG) dataset with two emotional stress states and a comprehensive statistical analysis indicating the correlations between MGs and emotional states. Novel frameworks are further presented together with various state-of-the-art methods as benchmarks for automatic classification, online recognition of MGs, and emotional stress state recognition. The dataset and methods presented could inspire a new way of utilizing body gestures for human emotion understanding and bring a new direction to the emotion AI community. The source code and dataset are made available: https://github.com/mikecheninoulu/SMG.

Fatemeh Sarhaddi (University of Turku): A Deep Learning-based PPG Quality Assessment Approach for Heart Rate and Heart Rate Variability

Naeini, Emad Kasaeyan, University of California, Irvine; Sarhaddi, Fatemeh, University of Turku; Azimi, Iman, University of California, Irvine; Liljeberg, Pasi, University of Turku; Dutt, Nikil, University of California, Irvine; Rahmani, Amir M., University of California, Irvine

Photoplethysmography (PPG) is a non-invasive optical method to monitor heart rate (HR) and heart rate variability (HRV). The PPG method is highly susceptible to motion artifacts and environmental noise, which are inevitable when users are involved in various activities in their daily routines. The low-quality PPG signals negatively affect the accuracy of the extracted parameters, leading to inaccurate decision-making. PPG-based health monitoring necessitates a quality assessment (QA) approach to determine the signal quality and ensure the quality of health parameters. Various studies have introduced PPG signal QA methods, leveraging different indicators and machine learning algorithms. These methods distinguish between reliable and unreliable signals by considering morphological features of the PPG signal and focusing on cardiac cycles. Consequently, they are suitable for HR detection applications. However, they do not apply to HRV, as only having an acceptable shape is insufficient, and other signal factors may affect the accuracy. In this paper, we propose a deep learning-based PPG QA method for HR and various HRV parameters. We employ one customized one-dimensional (1D) and three two-dimensional (2D) Convolutional Neural Networks (CNNs) to train models for each parameter. We evaluate the reliability of these parameters against corresponding electrocardiogram signals, utilizing 210 hours of data collected in real-life settings. Our results demonstrate that the proposed 1D CNN method outperforms the other 2D CNN approaches for HR and HRV parameters, achieving an accuracy exceeding 90%. Furthermore, we compare our best models for HR-HRV health parameters with six different state-of-the-art PPG signal QA methods. Our findings indicate that our proposed method surpasses the other methods. We also provide the open-source model implemented in Python for the community to integrate into their solutions.

 

Takka room

Session 6: Language, Authorship and Ownership (12:30-13:30)

Tommi Jauhiainen (University of Helsinki): HeLI-OTS, Off-the-shelf Language Identifier for Text

Jauhiainen, Tommi (University of Helsinki); Jauhiainen, Heidi (University of Helsinki); Lindén, Krister (University of Helsinki)

Tamas Grosz (Aalto University): Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model

Grosz, Tamas (Aalto University); Getman, Yaroslav (Aalto University); Al-Ghezi, Ragheb (Aalto University); Rouhe, Aku (Aalto University); Kurimo, Mikko (Aalto University)

Self-supervised speech models, such as wav2vec2, have become extremely popular in the past few years. Their main appeal is that after pre-training on a large amount of audio, they require only a small amount of supervised fine-tuning data to achieve outstanding results. Despite their immense success, very little is understood about the pre-trained models and how fine-tuning changes them. In this work, we take the first steps towards a better understanding of wav2vec2 systems using model interpretation tools such as visualization and latent embedding clustering. Through our analysis, we gain new insights into the abilities of the pre-trained networks and the effect that fine-tuning has on them. We demonstrate that the clusters learned by the pre-trained model are just as important a factor as the supervised training data distribution in determining the accuracy of the fine-tuned system, which could aid us in selecting the most suitable pre-trained model for the supervised data.

Zosa, Elaine (Silo AI, Finland); Vázquez, Raúl (University of Helsinki, Finland); Tiedemann, Jörg (University of Helsinki, Finland); Segonne, Vincent (Southern Brittany University, France); Vahtola, Teemu (University of Helsinki, Finland); Raganato, Alessandro (University of Milano-Bicocca, Italy); Mickus, Timothee (University of Helsinki, Finland); Apidianaki, Marianna (University of Pennsylvania, USA)

Robin Welsch (Aalto University): The AI Ghostwriter Effect: When Users Do Not Perceive Ownership of AI-Generated Text But Self-Declare as Authors [click for abstract and full list of authors]

Human-AI interaction in text production increases complexity in authorship. In two empirical studies (n1 = 30 & n2 = 96), we investigate authorship and ownership in human-AI collaboration for personalized language generation. We show an AI Ghostwriter Effect: Users do not consider themselves the owners and authors of AI-generated text but refrain from publicly declaring AI authorship. Personalization of AI-generated texts did not impact the AI Ghostwriter Effect, and higher levels of participants’ influence on texts increased their sense of ownership. Participants were more likely to attribute ownership to supposedly human ghostwriters than AI ghostwriters, resulting in a higher ownership-authorship discrepancy for human ghostwriters. Rationalizations for authorship in AI ghostwriters and human ghostwriters were similar. We discuss how our findings relate to psychological ownership and human-AI interaction to lay the foundations for adapting authorship frameworks and user interfaces in AI in text-generation tasks.

Posters

Research area is in brackets after the title. List of abbreviations:

CI Computational Intelligence
CSO Constraints, Satisfiability, and Optimization
CV Computer Vision
GAME Games and Virtual Environments
HAI Human Aspects in AI
HCI Human Computer Interaction
HEU Heuristic Search
IRF Information Retrieval and Filtering
KRR Knowledge Representation and Reasoning
MAS Agent-based and Multi-agent Systems
ML Machine Learning
MULT Multidisciplinary Topics and Applications
NLP Natural Language Processing
PLAN Planning and Scheduling
ROB Robotics
UAI Uncertainty in AI
XAI Safe, Explainable, and Trustworthy AI

1. Harshit Agrawal (Aalto University/Planmeca Oy): Utilizing U-Net Architectures with Auxiliary Information for Scatter Correction in CBCT Across Different Field-of-View Settings (MULT) [click for abstract and full list of authors]

Agrawal, Harshit (Aalto University, Planmeca Oy.); Hietanen, Ari (Planmeca Oy.); Särkkä, Simo (Aalto University)

One major drawback of CBCT systems is the increase in scatter-to-primary signal ratio due to the use of a large flat-panel 2D detector and the proximity of the detector to the patient. In CBCT systems, the amount of scatter may vary significantly with the size of the scanning FOV. The scatter-to-primary ratio increases 4.9-fold for a 17x12 cm FOV in comparison to a 6x6 cm FOV. In modern CBCT systems, the FOV size can be even larger, which will increase the scatter-to-primary ratio even more. While the applicability of deep learning networks has been demonstrated for estimating the scatter for a fixed FOV, no study has evaluated a deep learning network's performance under changes in FOV size. Moreover, it remains unclear whether a single neural network can generalize to different FOVs. In practice, having a single network for multiple FOVs is important, as CBCT systems are used to scan a wide range of FOVs in real clinical settings. In this study, we train U-Net and evaluate its scatter estimation performance on varying FOV sizes. We propose a simple method to train a single network for multiple FOV sizes by providing auxiliary size information to the encoder (Aux-Net), which outperforms the baseline U-Net.

2. Mahsa Asadi (Aalto University): Collaborative Algorithms for Online Personalized Mean Estimation Learning (ML) [click for abstract and full list of authors]

Asadi, Mahsa, Aalto University; Bellet, Aurélien, Inria; Maillard, Odalric-Ambrym, Inria; Tommasi, Marc, Inria

We consider an online estimation problem involving a set of agents. Each agent has access to a (personal) process that generates samples from a real-valued distribution and seeks to estimate its mean. We study the case where some of the distributions have the same mean, and the agents are allowed to actively query information from other agents. The goal is to design an algorithm that enables each agent to improve its mean estimate thanks to communication with other agents. The means as well as the number of distributions with the same mean are unknown, which makes the task nontrivial. We introduce a novel collaborative strategy to solve this online personalized mean estimation problem. We analyze its time complexity and introduce variants that enjoy good performance in numerical experiments. We also extend our approach to the setting where clusters of agents with similar means seek to estimate the mean of their cluster.
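The general idea of agents sharing samples only with peers that appear to have the same mean can be sketched as follows. This is a simplified illustration and not the algorithm from the paper: the `Agent` class, the Hoeffding-style confidence radius, the known noise scale `sigma`, and the overlap test are all assumptions made for this example, and the sketch ignores the cost of querying.

```python
import math
import random

class Agent:
    """Keeps a running mean of its own samples (hypothetical helper)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def observe(self, x):
        # Incremental update of the running mean.
        self.count += 1
        self.mean += (x - self.mean) / self.count

    def radius(self, sigma=1.0, delta=0.01):
        # Confidence radius for sub-Gaussian noise with scale sigma (assumed known).
        return sigma * math.sqrt(2.0 * math.log(2.0 / delta) / self.count)

def collaborative_estimate(agent, others):
    """Pool the agent's samples with peers whose confidence intervals overlap its own."""
    peers = [agent] + [
        o for o in others
        if abs(agent.mean - o.mean) <= agent.radius() + o.radius()
    ]
    total = sum(p.count for p in peers)
    return sum(p.mean * p.count for p in peers) / total

rng = random.Random(0)
true_means = [0.0, 0.0, 5.0]  # agents 0 and 1 share a mean, agent 2 does not
agents = [Agent() for _ in true_means]
for _ in range(500):
    for agent, mu in zip(agents, true_means):
        agent.observe(rng.gauss(mu, 1.0))

estimates = [
    collaborative_estimate(a, [o for o in agents if o is not a]) for a in agents
]
print(estimates)
```

Agents 0 and 1 end up averaging over roughly twice as many samples, while agent 2 finds no overlapping peer and keeps its own estimate.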

3. Mikko Aulamo (University of Helsinki): Unsupervised Feature Selection for Effective Parallel Corpus Filtering (NLP) [click for abstract and full list of authors]

Aulamo, Mikko (University of Helsinki); de Gibert, Ona (University of Helsinki); Virpioja, Sami (University of Helsinki); Tiedemann, Jörg (University of Helsinki)

This work presents an unsupervised method of selecting filters and threshold values for the OpusFilter parallel corpus cleaning toolbox. The method clusters sentence pairs into noisy and clean categories and uses the features of the noisy cluster center as filtering parameters. Our approach utilizes feature importance analysis to disregard filters that do not differentiate between clean and noisy data. A randomly sampled subset of a given corpus is used for filter selection, and ineffective filters are not run for the full corpus. We use a set of automatic evaluation metrics to assess the quality of translation models trained with data filtered by our method and data filtered with OpusFilter's default parameters. The trained models cover English-German and English-Ukrainian in both directions. The proposed method outperforms the default parameters in all translation directions for almost all evaluation metrics.

4. Shoaib Azam (Aalto University): Multimodal Fusion for Sensorimotor Coordination in Steering Angle Prediction (CV), (ML), (ROB) [click for abstract and full list of authors]

Munir Farzeen, Aalto University, Gwangju Institute of Science and Technology; Azam Shoaib, Aalto University; Yow Kin-Choong, University of Regina; Lee Byung-Geun, Gwangju Institute of Science and Technology; Jeon Moongu, Gwangju Institute of Science and Technology

Efficient reasoning about the spatial and temporal structure of the environment is crucial for perception in autonomous driving, particularly in an end-to-end approach. Although different sensor modalities are employed to capture the complex nature of the environment, they each have their limitations. For example, frame-based RGB cameras are susceptible to variations in illumination conditions. However, these limitations at the sensor level can be addressed by complementing them with sensor fusion techniques, enabling the learning of efficient feature representations for end-to-end autonomous perception. In this study, we address the end-to-end perception problem by fusing a frame-based RGB camera with an event camera to improve the learned representation for predicting lateral control. To achieve this, we propose a convolutional encoder-decoder architecture called DRFuser. DRFuser encodes the features from both sensor modalities and leverages self-attention to fuse the frame-based RGB and event camera features in the encoder part. The decoder component unrolls the learned features to predict lateral control, specifically in the form of a steering angle. We extensively evaluate the proposed method on three datasets: our own collected dataset, the Davis Driving dataset, and the EventScape dataset for simulation. The results demonstrate the generalization capability of our method on both real-world and simulated datasets. We observe qualitative and quantitative improvements in the performance of the proposed method for predicting lateral control by incorporating the event camera in fusion with the frame-based RGB camera. Notably, our method outperforms state-of-the-art techniques on the Davis Driving dataset, achieving a 5.6% improvement in the root mean square error (RMSE) score.

5. Dominik Baumann (Aalto University): Identifying causal structure in dynamical systems (ML) [click for abstract and full list of authors]

Baumann, Dominik (Aalto University); Solowjow, Friedrich (RWTH Aachen University); Johansson, Karl H. (KTH Stockholm); Trimpe, Sebastian (RWTH Aachen University)

Mathematical models are fundamental building blocks in the design of dynamical control systems. As control systems are becoming increasingly complex and networked, approaches for obtaining such models based on first principles reach their limits. Data-driven methods provide an alternative. However, without structural knowledge, these methods are prone to finding spurious correlations in the training data, which can hamper generalization capabilities of the obtained models. This can significantly lower control and prediction performance when the system is exposed to unknown situations. A preceding causal identification can prevent this pitfall. We propose a method that identifies the causal structure of control systems. We design experiments based on the concept of controllability, which provides a systematic way to compute input trajectories that steer the system to specific regions in its state space. We then analyze the resulting data leveraging powerful techniques from causal inference and extend them to control systems. Further, we derive conditions that guarantee the discovery of the true causal structure of the system. Experiments on a robot arm demonstrate reliable causal identification from real-world data and enhanced generalization capabilities.

6. Ayush Bharti (Aalto University): Optimally-weighted Estimators of the Maximum Mean Discrepancy for Likelihood-free Inference (ML) [click for abstract and full list of authors]

Bharti, Ayush (Aalto University); Naslidnyk, Masha (UCL); Key, Oscar (UCL); Kaski, Samuel (Aalto University & University of Manchester); Briol, Francois-Xavier (UCL)

Likelihood-free inference methods typically make use of a distance between simulated and real data. A common example is the maximum mean discrepancy (MMD), which has previously been used for approximate Bayesian computation, minimum distance estimation, generalised Bayesian inference, and within the nonparametric learning framework. The MMD is commonly estimated at a root-$m$ rate, where $m$ is the number of simulated samples. This can lead to significant computational challenges since a large $m$ is required to obtain an accurate estimate, which is crucial for parameter estimation. In this paper, we propose a novel estimator for the MMD with significantly improved sample complexity. The estimator is particularly well suited for computationally expensive smooth simulators with low- to mid-dimensional inputs. This claim is supported through both theoretical results and an extensive simulation study on benchmark simulators.
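For orientation, the conventional equally weighted MMD estimate that the abstract refers to can be sketched as below. This is the standard V-statistic baseline, not the optimally-weighted estimator proposed in the paper; the Gaussian kernel, its lengthscale, and the toy one-dimensional samples are assumptions made for this example.

```python
import math
import random

def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))

def mmd_squared(xs, ys, lengthscale=1.0):
    """Equally weighted (V-statistic) estimate of the squared MMD between two samples."""
    m, n = len(xs), len(ys)
    k_xx = sum(gaussian_kernel(a, b, lengthscale) for a in xs for b in xs) / (m * m)
    k_yy = sum(gaussian_kernel(a, b, lengthscale) for a in ys for b in ys) / (n * n)
    k_xy = sum(gaussian_kernel(a, b, lengthscale) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy

rng = random.Random(0)
observed = [rng.gauss(0.0, 1.0) for _ in range(200)]        # stand-in for real data
simulated_good = [rng.gauss(0.0, 1.0) for _ in range(200)]  # simulator at a matching parameter
simulated_bad = [rng.gauss(2.0, 1.0) for _ in range(200)]   # simulator at a mismatched parameter

mmd_good = mmd_squared(observed, simulated_good)
mmd_bad = mmd_squared(observed, simulated_bad)
print(f"matching: {mmd_good:.4f}, mismatched: {mmd_bad:.4f}")
```

Each call needs on the order of m^2 kernel evaluations plus m simulator runs, which is why reducing the number of simulated samples needed for a given accuracy matters when the simulator is expensive.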

7. Sebastian Björkqvist (IPRally): Building a Graph-Based Patent Search Engine (IRF), (ML), (NLP) [click for abstract and full list of authors]

Björkqvist, Sebastian (IPRally); Kallio, Juho (IPRally)

Performing prior art searches is an essential step in both patent drafting and invalidation. The task is challenging due to the large number of existing patent documents and the domain knowledge required to analyze the documents. We present a graph-based patent search engine that tries to mimic the work done by a professional patent examiner. Each patent document is converted to a graph that describes the parts of the invention and the relations between the parts. The search engine is powered by a graph neural network that learns to find prior art by using novelty citation data from patent office search reports where citations are compiled by human patent examiners. We show that a graph-based approach is an efficient way to perform searches on technical documents and demonstrate it in the context of patent searching.

8. Klavdiya Bochenina (Computer Science Department, University of Helsinki): Towards sustainable ride-pooling algorithms for autonomous cars: a comparative study of passenger satisfaction, taxi fleet usage and emission metrics (MAS), (MULT), (PLAN) [click for abstract and full list of authors]

Bochenina, Klavdiya (University of Helsinki); Ruotsalainen, Laura (University of Helsinki)

Reducing traffic-related CO2 emissions is one of the major factors in achieving city-level carbon neutrality. In this work, we study the problem of sustainable ride-pooling of autonomous cars, using the SUMO traffic simulation software as an environment for testing ride-pooling strategies and for explicit emission modelling. We compare scenarios without ride-pooling (baseline) and with ride-pooling for varying levels of demand, maximum occupancy of cars and penetration rates of autonomous taxis using three groups of metrics: passenger satisfaction, taxi fleet usage and total emissions. The experimental study showed that simple heuristics (minimizing detours or maximising the occupancy of the cars) may reduce emissions by up to 15% compared to the baseline but do not provide a balanced solution across multiple metrics. To overcome this problem, future work will develop a simulation-based multi-objective reinforcement learning algorithm for sustainable ride-pooling.

9. Jörg Tiedemann (University of Helsinki): Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging (NLP), (UAI) [click for abstract and full list of authors]

Talman, Aarne (University of Helsinki); Celikkanat, Hande (University of Helsinki); Virpioja, Sami (University of Helsinki); Heinonen, Markus (Aalto University); Tiedemann, Jörg (University of Helsinki)

This paper introduces Bayesian uncertainty modeling using Stochastic Weight Averaging-Gaussian (SWAG) in Natural Language Understanding (NLU) tasks. We apply the approach to standard tasks in natural language inference (NLI) and demonstrate the effectiveness of the method in terms of prediction accuracy and correlation with human annotation disagreements. We argue that the uncertainty representations in SWAG better reflect subjective interpretation and the natural variation that is also present in human language understanding. The results reveal the importance of uncertainty modeling, an often neglected aspect of neural language modeling, in NLU tasks.

10. Paul Chang (Aalto University, Finland; Finnish Center for Artificial Intelligence, Finland; RIKEN Center for AI Project, Japan): Memory-Based Dual Gaussian Processes for Sequential Learning (ML), (UAI)

11. Preetha Datta (Aalto University): Constructing Knowledge Graphs with Large Language Models (NLP) [click for abstract and full list of authors]

Datta, Preetha (Aalto University); Chizhikova, Anastasia (Aalto University)

N/A

12. Ona de Gibert Bonet (University of Helsinki): The OPUS-MT Dashboard: A Toolkit for a Systematic Evaluation of Open Machine Translation Models (NLP) [click for abstract and full list of authors]

Tiedemann, Jörg (University of Helsinki); De Gibert Bonet, Ona (University of Helsinki)

The OPUS-MT dashboard is a web-based platform that provides a comprehensive overview of open translation models. We focus on a systematic collection of benchmark results with verifiable translation performance and large coverage in terms of languages and domains. We provide results for in-house OPUS-MT and Tatoeba models as well as external models from the Huggingface repository and user-contributed translations. The functionalities of the evaluation tool include summaries of benchmarks for over 2,300 models covering 4,560 language directions and 294 languages, as well as the inspection of predicted translations against their human reference. We focus on centralization, reproducibility and coverage of MT evaluation combined with scalability. The dashboard can be accessed live at https://opus.nlpl.eu/dashboard/.

13. David Diaz-Guerra (Tampere University): Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications (ML) [click for abstract and full list of authors]

Diaz-Guerra, David (Tampere University); Politis, Archontis (Tampere University); Miguel, Antonio (University of Zaragoza, Spain); Beltran, Jose R. (University of Zaragoza, Spain); Virtanen, Tuomas (Tampere University)

Many multi-source localization and tracking models based on neural networks use one or several recurrent layers at their final stages to track the movement of the sources. Conventional recurrent neural networks (RNNs), such as the long short-term memories (LSTMs) or the gated recurrent units (GRUs), take a vector as their input and use another vector to store their state. However, this approach results in the information from all the sources being contained in a single ordered vector, which is not optimal for permutation-invariant problems such as multi-source tracking. In this paper, we present a new recurrent architecture that uses unordered sets to represent both its input and its state and that is invariant to the permutations of the input set and equivariant to the permutations of the state set. Hence, the information of every sound source is represented in an individual embedding and the new estimates are assigned to the tracked trajectories regardless of their order.
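The difference between an ordered-vector representation and a set representation can be seen in a small sketch. It is not the recurrent architecture from the paper; it only contrasts concatenation with a sum-pooling summary, and the `embed` function and the integer (azimuth, elevation) source positions are hypothetical.

```python
def embed(source):
    """Hypothetical per-source embedding of an (azimuth, elevation) pair in degrees."""
    azimuth, elevation = source
    return (azimuth, elevation, azimuth * elevation)

def concatenated(sources):
    """Ordered-vector representation: depends on the order of the sources."""
    return tuple(value for s in sources for value in embed(s))

def pooled_summary(sources):
    """Sum-pooled representation: identical for any ordering of the sources."""
    embeddings = [embed(s) for s in sources]
    return tuple(sum(column) for column in zip(*embeddings))

sources = [(30, 10), (-45, 0), (120, -20)]
shuffled = [sources[2], sources[0], sources[1]]

print(concatenated(sources) == concatenated(shuffled))      # order-dependent
print(pooled_summary(sources) == pooled_summary(shuffled))  # order-independent
```

A layer built from order-independent operations like the pooled summary does not need to learn that swapping two sources leaves the scene unchanged, which is the property the abstract asks of the input set.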

14. Karolina Drobotowicz (Aalto University): Practitioners' Perspectives on Inclusion and Civic Empowerment in Finnish Public Sector AI (HAI), (HCI), (XAI) [click for abstract and full list of authors]

Drobotowicz, Karolina (Aalto University); Truong, Nghiep Lucy (Aalto University); Ylipulli, Johanna (Aalto University); Torres Gonzalez, Ana Paula (Aalto University); Sawhney, Nitin (Aalto University)

Algorithmic decision-making and big data systems are increasingly being used to provide essential services in the public sector. Such services that utilize AI entail many related risks and responsibilities for citizens and public sector providers. In this empirical study, we examine practitioners' attitudes, practices and challenges in implementing inclusive AI services in the public sector that can empower greater civic agency. We conducted in-depth interviews with ten practitioners responsible for managing, developing or designing AI-enabled public services across three big public organizations in Finland in domains relating to the municipality, taxes, and social insurance. The results show that the discussion on inclusion and civic empowerment is only just beginning in the public sector. Practitioners perceive the concept of inclusion as devising accessible public services for all members of society. The research suggests two distinct socio-cultural constructs emerging among practitioners that may hinder how civic empowerment is manifested in such services: expert cultures and risk-averse cultures. The contributions of the study are twofold. First, we describe the practitioners' perspectives on empowerment and inclusion in regard to public sector AI. Second, we recognize how expert and risk-averse cultures among practitioners explain their actions and restraints in devising public sector AI services.

15. Farshad Farahnakian (University of Turku): A Comprehensive Study of Clustering-Based Techniques for Detecting Abnormal Vessel Behavior (CI), (ML) [click for abstract and full list of authors]

Farahnakian Farshad (University of Turku); Nicolas Florent (Helsinki Commission-HELCOM); Farahnakian Fahimeh (University of Turku); Nevalainen Paavo (University of Turku); Sheikh Javad (University of Turku); Heikkonen Jukka (University of Turku); Raduly-Baka Csaba (University of Turku)

Abnormal behavior detection is currently receiving much attention because of the availability of marine equipment and data allowing maritime agents to track vessels. One of the most popular tools for developing an efficient anomaly detection system is the Automatic Identification System (AIS). The aim of this paper is to explore the performance of existing well-known clustering methods for detecting the two most dangerous abnormal behaviors based on the AIS. The methods include K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Affinity Propagation (AP), and the Gaussian Mixture Model (GMM). To evaluate the performance of the clustering methods, we used AIS data of vessels collected through the Finnish transport agency from the whole Baltic Sea over three months. Although most existing studies focus on ocean route recognition, deviations from regulated ocean routes, or irregular speed, we focused on dark ships (vessels that turn off the AIS to perform illegal activities) and spiral vessel movements. The experimental results demonstrate that the K-means clustering method can effectively detect dark ships and spiral vessel movements, which are the most threatening events for maritime safety.

16. Ahmad Farooq (Aalto University): Quantum Linear Regression for Reduced Rank Gaussian Processes (ML), (MULT) [click for abstract and full list of authors]

Farooq, Ahmad; Galvis-Florez, Cristian Andrey; Särkkä, Simo (ELEC Sensor Informatics and Medical Technology, Aalto University, Finland)

Gaussian processes inherently capture noise, smoothness parameters, and training data uncertainty. They are a widely used model for regression problems in supervised learning, but their computational complexity quickly becomes intractable as the dataset size grows. We propose a quantum linear regression model specifically for sparse Gaussian process regression to overcome this limitation, offering a promising alternative to traditional machine learning methods. We start by encoding the data matrix into a quantum state using a multicontrolled unitary operation, allowing for the efficient representation of random Fourier features crucial for kernel approximation. We then employ quantum principal component analysis and conditional rotation subroutines to extract the spectral decomposition of the kernel matrix. To compute the mean and variance of the posterior Gaussian distribution, we use Hadamard and swap tests. By combining these quantum methodologies, we provide a powerful framework for sparse Gaussian process regression, offering the potential to significantly reduce computational complexity.

17. Mohammad Feli (University of Turku): An energy-efficient semi-supervised approach for on-device photoplethysmogram signal quality assessment (ML), (MULT) [click for abstract and full list of authors]

Feli, Mohammad, Department of Computing, University of Turku; Azimi, Iman, Department of Computer Science, University of California, Irvine; Anzanpour, Arman, Department of Computing, University of Turku; Rahmani, Amir M., Department of Computer Science, University of California, Irvine; Liljeberg, Pasi, Department of Computing, University of Turku

Photoplethysmogram (PPG) is a non-invasive technique used in wearable devices to measure vital signs (e.g., heart rate). The method is, however, highly susceptible to motion artifacts, which are inevitable in remote health monitoring. Noise reduces signal quality, leading to inaccurate decision-making. In addition, unreliable data collection and transmission waste a massive amount of energy on battery-powered devices. Studies in the literature have proposed PPG signal quality assessment (SQA) enabled by rule-based and machine learning (ML)-based methods. However, rule-based techniques were designed according to certain specifications, resulting in lower accuracy with unseen noise and artifacts. ML methods have mainly been developed to ensure high accuracy without considering execution time and the device's energy consumption. In this paper, we propose a lightweight and energy-efficient PPG SQA method enabled by a semi-supervised learning strategy for edge devices. We first extract a wide range of features from PPG and then select the best features in terms of accuracy and latency. Second, we train a one-class support vector machine model to classify PPG signals into "Reliable" and "Unreliable." We evaluate the proposed method in terms of accuracy, execution time, and energy consumption on two embedded devices, in comparison to five state-of-the-art PPG SQA methods. The methods are assessed using a PPG dataset collected via smartwatches from 46 individuals in free-living conditions. The proposed method outperforms the other methods by achieving an accuracy of 0.97 and a false positive rate of 0.01. It also provides the lowest latency and energy consumption compared to other ML-based methods.

18. Masood Feyzbakhsh Rankooh (Tampere University): Capturing (Optimal) Relaxed Plans with Stable and Supported Models of Logic Programs (HEU), (KRR), (PLAN), (ROB) [click for abstract and full list of authors]

Feyzbakhsh Rankooh, Masood; Janhunen, Tomi

We establish a novel relation between delete-free planning, an important task for the AI Planning community also known as relaxed planning, and logic programming. We show that given a planning problem, all subsets of actions that could be ordered to produce relaxed plans for the problem can be bijectively captured with stable models of a logic program describing the corresponding relaxed planning problem. We also consider the supported model semantics of logic programs, and introduce one causal and one diagnostic encoding of the relaxed planning problem as logic programs, both capturing relaxed plans with their supported models. Our experimental results show that these new encodings can provide major performance gain when computing optimal relaxed plans, with our diagnostic encoding outperforming state-of-the-art approaches to relaxed planning regardless of the given time limit when measured on a wide collection of STRIPS planning benchmarks.

19. Robin Forsberg (University of Helsinki, Centre for Social Data Science): Stakeholder Interests and Value Conflicts in Natural Language Processing: A Systematic Literature Review (NLP), (Other) [click for abstract and full list of authors]

Forsberg, Robin (University of Helsinki); Nelimarkka, Matti (University of Helsinki)

The advent of nascent general-purpose NLP technologies, including ChatGPT, promises productivity gains; less is understood, however, of their societal impacts. At the same time, societal impacts of AI systems are heavily debated in academia and policy making. To understand the harms, stakeholder interests and encoded values in natural language processing are analyzed in this article. We conducted an extensive structural literature review of studies related to natural language processing, values and societal impact, finally selecting 60 text analytics articles published in IEEE and ACM. We build on Freeman's stakeholder theory, which posits that organizations should consider the interests and impacts of all stakeholders, not just shareholders, for ethical and long-term success. Our findings suggest that there exists a division between researchers favoring instrumental technoeconomic values, such as productivity and efficiency, over sociopolitical values, such as autonomy and creativity. This gap brings up the question of human automation versus human augmentation. Furthermore, the current relative monoculture of stakeholders involved in NLP research and development calls for further participatory design.

20. Fanni Franssila (University of Helsinki): Graph Representation of the Magnetic Field Topology in High-Fidelity Plasma Simulations for Machine Learning Applications (KRR), (ML), (MULT) [click for abstract and full list of authors]

Bouri, Ioanna (University of Helsinki); Franssila, Fanni (University of Helsinki); Alho, Markku (University of Helsinki); Cozzani, Giulia (University of Helsinki); Zaitsev, Ivan (University of Helsinki); Palmroth, Minna (University of Helsinki); Roos, Teemu (University of Helsinki)

Topological analysis of the magnetic field in simulated plasmas allows the study of various physical phenomena in a wide range of settings. One such application is magnetic reconnection, a phenomenon related to the dynamics of the magnetic field topology, which is difficult to detect and characterize in three dimensions. We propose a scalable pipeline for topological data analysis and spatiotemporal graph representation of three-dimensional magnetic vector fields. We demonstrate our methods on simulations of the Earth's magnetosphere produced by Vlasiator, a supercomputer-scale Vlasov theory-based simulation for near-Earth space. The purpose of this work is to challenge the machine learning community to explore graph-based machine learning approaches to address a largely open scientific problem with wide-ranging potential impact.

21. Javier García Gilabert (Universitat Politècnica de Catalunya): ReSeTOX: Re-learning attention weights for toxicity mitigation in machine translation (NLP) [click for abstract and full list of authors]

Costa-jussà, Marta R. (Meta AI); Escolano, Carlos (Universitat Politècnica de Catalunya)

Our proposed method, ReSeTOX (REdo SEarch if TOXic), addresses the issue of Neural Machine Translation (NMT) generating translation outputs that contain toxic words not present in the input. The objective is to mitigate the introduction of toxic language without the need for re-training. In the case of identified added toxicity during the inference process, ReSeTOX dynamically adjusts the key-value self-attention weights and re-evaluates the beam search hypotheses. Experimental results demonstrate that ReSeTOX achieves a remarkable 57% reduction in added toxicity while maintaining an average translation quality of 99.5% across 164 languages.

22. Yaroslav Getman (Aalto University): Multi-task wav2vec2 Serving as a Pronunciation Training System for Children (NLP) [click for abstract and full list of authors]

Getman, Yaroslav (Aalto University); Al-Ghezi, Ragheb (Aalto University); Grósz, Tamás (Aalto University); Kurimo, Mikko (Aalto University)

For the speed of real-time games and language learning tools, it is important to reduce the size and number of large speech models that need to be executed to give immediate feedback to the users. We propose to use multi-task learning to train a single speech model that can simultaneously generate a transcript and assess the pronunciation of child speakers. The performance is demonstrated for Swedish children with speech sound disorder and child second language learners of Finnish.

23. Ana Paula Gonzalez Torres (Aalto University): Emerging AI Discourses and Policies in the EU: Implications for Evolving AI Governance (HAI) [click for abstract and full list of authors]

Kajava, Kaisla; Sawhney, Nitin

With the emergence of powerful generative AI technologies and the increasing prevalence of high-risk AI systems in society, the conversation around the regulation of AI has gained critical global traction. Meanwhile, policymakers and regulators are struggling to stay on top of technological advances in AI. Our work uses a transdisciplinary approach combining computational linguistics, law, and sociology to examine the developments in policy discourses around the AI Act (AIA) in the European Union (EU) and their implications globally. We base our analysis on findings from an ongoing study of multi-stakeholder feedback to the AI Act, leveraging Natural Language Inference (NLI) to examine the language and discourse of justifications by diverse stakeholder groups. Based on the outcomes, we take a policy analysis perspective to examine how the initial discourse has affected the AI regulatory framework in the amendments presented during the EU co-legislative process. The analysis is anchored on identified trends of contentious points in the regulation, such as the definition of AI, general principles, prohibited practices, a tiered approach to foundation models, general purpose and generative AI, high-risk categorization, and measures supporting innovation, such as AI regulatory sandboxes. Finally, we reflect on the emerging discourses, regulatory policies, and experimental framework implications for global AI governance. Our take is that experimental regulation, such as international regulatory sandboxes, can offer a window of opportunity to incorporate a global perspective and produce better-informed regulation and governance of AI.

24. Ajinkya Gorad (Aalto University): Rao-Blackwellized Monte Carlo data association with deep metric for object tracking (CV), (ML) [click for abstract and full list of authors]

Gorad, Ajinkya (Aalto University); Särkkä, Simo (Aalto University)

We propose a deep Rao-Blackwellized Monte Carlo data association particle filter (DeepRBMCDA), which is a modification of the existing RBMCDA using Hungarian association. It uses YOLOv7-detected bounding boxes with deep Re-Identification (ReID) descriptors to track detected objects in a Bayesian way. We demonstrate its performance on the diverse GMOT-40 dataset, which contains sequences of varying class objects of similar appearance. We evaluate our tracker and compare its performance with state-of-the-art trackers. We obtain comparable multi-object tracking accuracy (MOTA), multi-object tracking precision (MOTP), localization accuracy (LocA), and multi-object detection accuracy (MODA), improved mostly tracked (MT), reduced mostly lost (ML), and the lowest fragmentation (Frag). We also perform an ablation study which reports highest higher order tracking accuracy (HOTA), HOTA combined LocA (HOTALocA), MOTA, identity switching (IDSW), MT, ML, Frag, and identity based F1 score (IDF1) on tracking ground-truth labels. Using a particle filter for object tracking provides robustness, which can be helpful in diverse dynamic tracking scenarios.
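The association step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows minimum-cost matching between tracks and detections, and it swaps the Hungarian algorithm for a brute-force search over permutations, which finds the same optimal assignment but is only practical for very small problems. The function name and the cost matrix are hypothetical; in a tracker the costs could, for example, be distances between ReID descriptors.

```python
from itertools import permutations


def min_cost_assignment(cost):
    """Return (assignment, total_cost) for a square cost matrix.

    cost[i][j] is the (hypothetical) cost of matching track i to detection j.
    assignment[i] is the detection index matched to track i.
    Brute force over all permutations: a stand-in for the Hungarian
    algorithm, feasible only for a handful of tracks.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


if __name__ == "__main__":
    # Illustrative costs only; not data from the paper.
    example_cost = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
    print(min_cost_assignment(example_cost))
```

For realistic numbers of tracks, a polynomial-time solver would replace the permutation loop.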

25. Tamas Grosz (Aalto University): Discovering Relevant Sub-spaces of BERT, Wav2Vec 2.0, ELECTRA and ViT Embeddings for Humor and Mimicked Emotion Recognition with Integrated Gradients (ML), (NLP), (XAI) [click for abstract and full list of authors]

Grosz, Tamas (Aalto University); Virkkunen, Anja (Aalto University); Porjazovski, Dejan (Aalto University); Kurimo, Mikko (Aalto University)

Large-scale, pre-trained models revolutionized the field of sentiment analysis and enabled multimodal systems to be quickly developed. In this paper, we address two challenges posed by the Multimodal Sentiment Analysis (MuSe) 2023 competition by focusing on automatically detecting cross-cultural humor and predicting three continuous emotion targets from user-generated videos. Multiple methods in the literature already demonstrate the importance of embedded features generated by popular pre-trained neural solutions. Based on their success, we can assume that the embedded space consists of several sub-spaces relevant to different tasks. Our aim is to automatically identify the task-specific sub-spaces of various embeddings by interpreting the baseline neural models. Once the relevant dimensions are located, we train a new model using only those features, which leads to similar or slightly better results with a considerably smaller and faster model. The best Humor Detection model using only the relevant sub-space of audio embeddings contained approximately 54% fewer parameters than the one processing the whole encoded vector, required 48% less time to be trained and even outperformed the larger model. Our empirical results validate that, indeed, only a portion of the embedding space is needed to achieve good performance. Our solution could be considered a novel form of knowledge distillation, which enables new ways of transferring knowledge from one model into another.

26. Riku Haapaniemi (Tampere University): Form and meaning in AI language generation: Analysis of an audio captioning model from a Translation Studies perspective (HAI) [click for abstract and full list of authors]

Riku Haapaniemi (Tampere University); Annamaria Mesaros (Tampere University); Manu Harju (Tampere University); Irene Martín Morató (Tampere University); Maija Hirvonen (Tampere University)

This study proposes the novel combination of a theoretical viewpoint founded in Translation Studies (TS) with the text production processes of artificial intelligence (AI) systems, in this case a natural language processing (NLP) model utilised for audio captioning (AC). TS theories derived from notions of semiotics and materiality are used in the analysis of captions produced by an AC system developed by Tampere University's GUIDE research project. The purpose of this analysis is to compare the importance of meaning-construction in human text production processes to the similarly central role of form-based statistical computation in AI text production processes. Analyses of example captions show that the basic assumption that humans construct meaning while AI does not is both a tenable basis for this kind of analysis and a useful tool in articulating connections as well as differences between these two processes. The implications of this analysis have the potential to feed back to both fields. For TS, this discussion contributes to ongoing discourse about the role of AI in translation theory and shows how AI processes can be discussed within a TS framework while still acknowledging the complexities of human translation processes. For AI studies, this framework likewise enables a nuanced conceptualisation of what meaning is in human translation processes and why that makes for a substantial difference from comparable AI text production processes. In general, this study suggests multiple fruitful points of convergence and divergence between TS and AI studies.

27. Maria Hartikainen (Tampere University): Towards a Human-Centred Artificial Intelligence Maturity Model (HAI), (HCI), (XAI) [click for abstract and full list of authors]

Maria Hartikainen (Tampere University, Faculty of Information Technology and Communication Sciences, unit of Computing Sciences); Kaisa Väänänen (Tampere University, Faculty of Information Technology and Communication Sciences, unit of Computing Sciences); Thomas Olsson (Tampere University, Faculty of Information Technology and Communication Sciences, unit of Computing Sciences)

Artificial intelligence (AI) is becoming a central building block of computational systems. Following the long traditions of human-centered design, Human-Centered AI (HCAI) emphasises the importance of putting humans and various societal considerations in the centre of the development. However, the question is: how to realise HCAI when designing systems that utilise novel computational tools and require consideration of an increasingly broad set of requirements, spanning from fairness and transparency to accountability and ethics? The purpose of our study is to support the AI development practices in companies in order for the humans to have AI solutions that are efficient, trustworthy, and safe. To this end, we propose a maturity model for HCAI (HCAI-MM). In this paper we present the first phase of the model development, in which the central building blocks of HCAI are specified and initial company requirements for the model's structure and content are evaluated with four AI developers.

28. Iikka Hauhio (University of Helsinki): The Spectrum of Unpredictability and its Relation to Creative Autonomy (HAI), (ML), (UAI) [click for abstract and full list of authors]

Hauhio, Iikka (University of Helsinki); Kantosalo, Anna (University of Helsinki); Linkola, Simo (University of Helsinki); Toivonen, Hannu (University of Helsinki)

Recent popularity of generative AI tools has sparked discussion on how the unpredictability of the tools affects the creativity of the human and the AI program alike, as unpredictability prevents the human user from fully controlling the output. We present a framework for categorizing unpredictability on four different dimensions and analyze the types of unpredictability found in generative AI tools. We also describe the relationship between unpredictability, uncontrollability, and Jennings' creative autonomy. We conclude that while unpredictability does not on its own imply creative autonomy, it could be used as a central condition for it, if accompanied by other conditions.

29. Jue Hou (University of Helsinki): Effects of sub-word segmentation on the performance of transformer language models (ML), (NLP) [click for abstract and full list of authors]

Hou, Jue (University of Helsinki); Katinskaia, Anisia (University of Helsinki); Vu, Anh-Duc (University of Helsinki); Yangarber, Roman (University of Helsinki)

Language modeling is a fundamental task in natural language processing, which has been thoroughly explored with various architectures and hyperparameters. However, few studies focus on the effect of sub-word segmentation on the performance of language models (LMs). In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation---Morfessor and StateMorph. We train the models for languages with very rich morphology---Finnish and Russian, and compare their performance with different segmentation algorithms, vocabulary sizes, and model sizes. The results show that Morfessor and StateMorph segmentation allows the LMs to 1. converge more efficiently in terms of training time, 2. achieve lower perplexity, and 3. achieve equivalent or better validation scores on downstream tasks. Finally, we show 4. that LMs of smaller size using morphological segmentation achieve performance that is comparable to models of larger size trained with BPE---both in terms of (2) perplexity and (3) scores on downstream tasks. Points 1 and 4 have potential impacts on sustainability since they reduce the models' cost, and while 1 reduces cost only in the training phase, 4 does so also in the inference phase.

30. Ti John (Aalto University): Temporal Causal Mediation through a Point Process: Direct and Indirect Effects of Healthcare Interventions (ML) [click for abstract and full list of authors]

Hızlı, Çağlar (Aalto University); John, Ti (Aalto University); Juuti, Anne (Helsinki University Hospital and University of Helsinki); Saarinen, Tuure (Helsinki University Hospital and University of Helsinki); Pietiläinen, Kirsi (Helsinki University Hospital and University of Helsinki); Marttinen, Pekka (Aalto University)

Deciding on an appropriate intervention requires a causal model of a treatment, the outcome, and potential mediators. Causal mediation analysis lets us distinguish between direct and indirect effects of the intervention, but has mostly been studied in a static setting. In healthcare, data come in the form of complex, irregularly sampled time-series, with dynamic interdependencies between a treatment, outcomes, and mediators across time. Existing approaches to dynamic causal mediation analysis are limited to regular measurement intervals, simple parametric models, and disregard long-range mediator-outcome interactions. To address these limitations, we propose a non-parametric mediator-outcome model where the mediator is assumed to be a temporal point process that interacts with the outcome process. With this model, we estimate the direct and indirect effects of an external intervention on the outcome, showing how each of these affects the whole future trajectory. We demonstrate on semi-synthetic data that our method can accurately estimate direct and indirect effects. On real-world healthcare data, our model infers clinically meaningful direct and indirect effect trajectories for blood glucose after a surgery.

31. Kaisla Kajava (Aalto University): Justifying AI and its Regulation: Examining Multi-Stakeholder Feedback on the AI Act (MULT), (NLP), (Other) [click for abstract and full list of authors]

Kajava, Kaisla (Aalto University); Gonzalez Torres, Ana Paula (Aalto University); Rannisto, Antti (Aalto University); Sakai, Shintaro (Nagoya University)

The rapid uptake of algorithmic systems across sectors is reflected in the discourse around AI. The AI Act (AIA) regulation proposal, introduced by the European Commission in 2021 (EC, 2021), is one narrative turn, introducing a legal dimension to the debate and prompting reactions from diverse stakeholders. The value-laden AI discourse reflected in stakeholder responses is structured around the justification of diverse perspectives and arguments. How do stakeholders from different sectors justify their views on the AIA and on the use of AI? How do they frame key issues around the regulation, development, and use of AI? Additionally, with the recent proliferation of large generative AI technologies, we examine perspectives on these general-purpose models, which had not yet been explicitly considered by regulators, but were already discussed by several stakeholders.

The language around AI and its regulation reflects and impacts entire AI ecosystems. The regulatory discourse is unfolding as the AIA is yet to be implemented. Our research examines justifications in reaction to the first draft of the AIA, building ground for ongoing work on the role of discourses and deliberation in the evolution of the regulation from proposal to law. We examine justifications in multi-stakeholder feedback to the AIA, drawing attention to discourse as a factor in revealing and shaping perspectives on AI and a bridge between regulators and affected stakeholders. Drawing from the theoretical framework of social justifications (Boltanski and Thévenot, 2006), based on worlds of justification, we analyze areas of contestation and similarity between stakeholders from the technology industry, academia, non-governmental organizations, and the public sector. We manually label a subset of data, and scale up the analysis with a pre-trained zero-shot Natural Language Inference (NLI) model, using a proxy task that classifies sentences as entailing a challenge or a benefit. Evaluated against the labeled set, the model captures sentences containing a justification with 88.5% recall. This allows assisted reading of over 100 documents, extending our analysis.

Our findings show that the most common justifications across stakeholders were tied between the civic world, reflecting values of fairness and human rights, and the industrial world, emphasizing efficiency and scientific validity. Stakeholders tend to agree that certain uses of AI technologies are adverse to human rights and should be avoided and prohibited. However, especially in the case of general-purpose models, the distribution of agency in decision-making and deployment appears to be a value-based and a regulatory challenge between balancing innovation and economic benefit with civic safety and responsible AI governance. Thematics which were to become central deliberative issues in the regulation of generative and general-purpose models seemed contested in the discourse already before the generative AI boom.

32. Noa Kallioinen (Aalto University): priorsense: Intuitive and efficient Bayesian prior sensitivity analysis (ML) [click for abstract and full list of authors]

Noa Kallioinen (Aalto University); Topi Paananen (Aalto University); Paul-Christian Bürkner (TU Dortmund University); Aki Vehtari (Aalto University)

Determining the sensitivity of the posterior to perturbations of the prior and likelihood is an important part of the Bayesian workflow. We introduce a practical and computationally efficient sensitivity analysis approach using importance sampling to estimate properties of posteriors resulting from power-scaling the prior or likelihood. On this basis, we suggest a diagnostic that can indicate the presence of prior-data conflict or likelihood noninformativity and discuss strengths and limitations of the power-scaling approach. The approach can be easily included in modern MCMC-based Bayesian workflows with minimal effort by the model builder. We present an implementation in our new R package priorsense, along with examples and guidelines for use.
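The idea of estimating a power-scaled posterior from existing draws can be sketched as follows. This is a simplified illustration and not the priorsense implementation: it uses plain self-normalised importance weights, and all function names are hypothetical. If the prior is raised to a power alpha, draws from the original posterior can be reweighted with weights proportional to the prior density raised to the power (alpha - 1). The toy model in the demo (standard normal prior, one observation y = 2 with unit noise) is an assumption chosen because its power-scaled posterior mean is known in closed form, 2 / (1 + alpha).

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2


def power_scaled_prior_weights(log_prior_values, alpha):
    """Self-normalised importance weights for a prior raised to the power alpha.

    log_prior_values: log prior density evaluated at each posterior draw.
    """
    log_w = [(alpha - 1.0) * lp for lp in log_prior_values]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def weighted_mean(draws, weights):
    """Importance-sampling estimate of the posterior mean."""
    return sum(d * w for d, w in zip(draws, weights))


if __name__ == "__main__":
    # Toy example (assumption): base posterior is N(1, 0.5) for the model above.
    random.seed(0)
    draws = [random.gauss(1.0, math.sqrt(0.5)) for _ in range(20000)]
    log_prior = [log_normal_pdf(d, 0.0, 1.0) for d in draws]
    for alpha in (0.5, 1.0, 2.0):
        weights = power_scaled_prior_weights(log_prior, alpha)
        print(alpha, weighted_mean(draws, weights), 2.0 / (1.0 + alpha))
```

A large change in the estimate as alpha moves away from 1 would suggest that the posterior is sensitive to the prior.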

33. Antero Karvonen (VTT): Theory languages in designing artificial intelligence (MULT), (Other) [click for abstract and full list of authors]

Saariluoma, Pertti; Karvonen, Antero

The foundations of AI design discourse are worth analyzing. Here, attention is paid to the nature of theory languages used in designing new AI technologies because the limits of these languages can clarify some fundamental questions in the development of AI. We discuss three types of theory language used in designing AI products: formal, computational, and natural. Formal languages, such as mathematics, logic, and programming languages, have fixed meanings and no actual-world semantics. They are context- and practically content-free. Computational languages use terms referring to the actual world, i.e., to entities, events, and thoughts. Thus, computational languages have actual-world references and semantics. They are thus no longer context- or content-free. However, computational languages always have fixed meanings and, for this reason, limited domains of reference. Finally, unlike formal and computational languages, natural languages are creative, dynamic, and productive. Consequently, they can refer to an unlimited number of objects and their attributes in an unlimited number of domains. The differences between the three theory languages enable us to reflect on the traditional problems of strong and weak AI.

34. Anisia Katinskaia (University of Helsinki): Grammatical Error Correction for Sentence-level Assessment in Language Learning (NLP) [click for abstract and full list of authors]

Anisia Katinskaia, University of Helsinki; Roman Yangarber, University of Helsinki

The paper presents experiments on using a Grammatical Error Correction (GEC) model to assess the correctness of answers that language learners give to grammar exercises. We hypothesize that a GEC model corrects only errors and leaves correct answers unchanged. We perform a test on assessing learner answers in a real but constrained language-learning setup: the learners answer only fill-in-the-blank and multiple-choice exercises. For this purpose, we use ReLCo, a publicly available manually annotated learner dataset in Russian (Katinskaia et al., 2022). In this experiment, we fine-tune a large-scale T5 language model for the GEC task and estimate its performance on the RULEC-GEC dataset (Rozovskaya et al., 2019) to compare with top-performing models. We also release a manually checked, cleaner version of the RULEC-GEC test set. Our analysis shows that the GEC model performs reasonably well in detecting erroneous answers to grammar exercises and potentially can be used for best-performing error types in a real learning setup. However, it struggles to assess answers which were tagged by human annotators as alternative-correct using the aforementioned hypothesis. This is in large part due to a still low recall in correcting errors, and the fact that the GEC model may modify even correct words---it may generate plausible alternatives, which are hard to evaluate against the gold-standard reference.

35. Kianoosh Kazemi (University of Turku): Robust CNN-based Respiration Rate Estimation for Smartwatch PPG and IMU (ML), (MULT) [click for abstract and full list of authors]

Kazemi, Kianoosh (University of Turku); Azimi, Iman (University of California, Irvine); Liljeberg, Pasi (University of Turku); Rahmani, Amir M. (University of California, Irvine)

Respiratory rate (RR) serves as an indicator of various medical conditions, such as cardiovascular diseases and sleep disorders. Several studies have employed signal processing and machine learning techniques to extract RR from biosignals, such as photoplethysmogram (PPG). These RR estimation methods were mostly designed for finger-based PPG collected from subjects in stationary situations (e.g., in hospitals). In contrast to finger-based PPG signals, wrist-based PPG signals are more susceptible to noise, particularly in their low frequency range, which includes respiratory information. Therefore, the existing methods struggle to accurately extract RR when PPG data are collected from the wrist area under free-living conditions. The increasing popularity of smartwatches, equipped with various sensors including PPG, has prompted the need for a robust RR estimation method. In this paper, we propose a convolutional neural network-based approach to extract RR from PPG, accelerometer, and gyroscope signals captured via smartwatches. Our method, including a dilated residual inception module and 1D convolutions, extracts the temporal information from the signals, enabling RR estimation. Our method is trained and tested using data collected from 36 subjects under free-living conditions for one day using Samsung Gear Sport watches. For evaluation, we compare the proposed method with four state-of-the-art RR estimation methods. The RR estimates are compared with RR references obtained from a chestband device. The results show that our method outperforms the existing methods with a Mean-Absolute-Error and Root-Mean-Square-Error of 1.85 and 2.34, while the best results obtained by the other methods are 2.41 and 3.29, respectively. Moreover, compared to the other methods, the absolute error distribution of our method was narrow (with the lowest median), indicating a higher level of agreement between the estimated and reference RR values.
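The two error metrics reported in this abstract follow their standard definitions, which can be sketched as below. The function names and the numbers in the demo are illustrative assumptions, not values from the study.

```python
import math


def mean_absolute_error(estimates, references):
    """Average absolute difference between estimated and reference values."""
    errors = [abs(e - r) for e, r in zip(estimates, references)]
    return sum(errors) / len(errors)


def root_mean_square_error(estimates, references):
    """Square root of the average squared difference."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up RR estimates and chestband references, for illustration only.
    estimated = [14.0, 16.5, 12.0, 18.0]
    reference = [15.0, 16.0, 13.5, 17.0]
    print(mean_absolute_error(estimated, reference))
    print(root_mean_square_error(estimated, reference))
```

Because RMSE squares each error before averaging, it penalises occasional large deviations more heavily than MAE does, which is why the two are usually reported together.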

36. Marcus Klasson (Aalto University): Investigating Uncertainty Estimation in Neural Radiance Fields for Novel View Synthesis (CV), (ML), (UAI) [click for abstract and full list of authors]

Klasson, Marcus (Aalto University, FCAI); Mereu, Riccardo (Aalto University); Kannala, Juho (Aalto University); Solin, Arno (Aalto University)

Neural Radiance Fields (NeRFs) are deep learning models capable of synthesizing realistic images from novel camera views in real-world scenes. However, NeRFs are unable to estimate the uncertainty about the rendered images, which hinders their deployment in safety-critical applications. We focus on two types of uncertainties that are desired in NeRFs: the model should i) be uncertain about unseen and occluded scene parts, and ii) be capable of recognizing outliers that can appear across time. Previous approaches for uncertainty estimation in NeRFs require architecture changes or multiple forward passes through the network, which results in a trade-off between slower rendering time versus better uncertainty estimates. To address these issues, we propose using the Laplace approximation to obtain post-hoc uncertainties from the model during novel view synthesis. This approach only requires a single forward pass to estimate the uncertainty during rendering and can be applied to any pre-trained NeRF without changing the architecture. We assess the capability of our proposed method to capture the various uncertainties in both static and dynamically changing scenes. Furthermore, we compare against previous uncertainty estimation methods for NeRFs to demonstrate the benefits of the post-hoc uncertainties for novel view synthesis.

37. Narasimharao Kowlagi (University of Oulu): A STRONGER BASELINE FOR AUTOMATIC PFIRRMANN GRADING OF LUMBAR SPINE MRI USING DEEP LEARNING (CV) [click for abstract and full list of authors]

Kowlagi, Narasimharao (Research Unit of Health Sciences and Technology, University of Oulu); Nguyen, Huy Hoang (Research Unit of Health Sciences and Technology, University of Oulu); McSweeney, Terence (Research Unit of Health Sciences and Technology, University of Oulu); Saarakkala, Simo (Research Unit of Health Sciences and Technology, University of Oulu); Määttä, Juhani (Research Unit of Health Sciences and Technology, University of Oulu); Karppinen, Jaro (Research Unit of Health Sciences and Technology, University of Oulu); Tiulpin, Aleksei (Research Unit of Health Sciences and Technology, University of Oulu)

We present a well-tuned convolutional neural network (CNN) that outperformed transformer-based architectures for lumbar spine MRI grading using Pfirrmann grades. We conducted ablation studies and evaluated different architectures for segmentation and classification tasks. The results were compared to the model architectures used in prior research trained on the same data. The proposed pipeline consisted of three stages: semantic segmentation, localization, and classification. The chosen segmentation model used the Feature Pyramid Network architecture with ResNet34 as the encoder. Our model achieved a mean intersection over union (mIoU) of 97.5 ± 0.1 over five random seeds. The classification model used the EfficientNetB0 architecture with input image shape 224x224x4, i.e., 4 images from the mid-sagittal region are stacked and fed to the network as channels. The model was trained on the majority vote labels provided by three raters for the Pfirrmann grades. The base classification model achieved a balanced accuracy of 75.5%. We observed an improvement of 2.8% in balanced accuracy when the base model was trained on the consensus labels. A further improvement of 1.8% was observed when the model was initialized with ImageNet-pretrained weights. Finally, using image augmentations and test time augmentations, the balanced accuracy was improved to 81.7%. We also found our model to perform 4.48% better than the current state-of-the-art model, which used a transformer-based architecture.
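Balanced accuracy, the headline metric in this abstract, is commonly defined as the mean of the per-class recalls, so that rare grades count as much as common ones. The sketch below assumes that standard definition; the function name and the labels in the demo are illustrative, not data from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Made-up grades: class 1 is common, class 2 is rare.
    true_grades = [1, 1, 1, 2]
    predicted_grades = [1, 1, 2, 2]
    print(balanced_accuracy(true_grades, predicted_grades))
```

With imbalanced grade distributions, this figure can differ noticeably from plain accuracy, which weights every sample equally.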

38. Samuel Kujala (Tampere University): ARTISTIC APPLICATION OF AR FILTERS IN THE ACTORS ATTUNEMENT (HCI), (MULT), (Other)

39. Mikko Kurimo (Aalto University): New data, benchmark and baseline for L2 speaking assessment for low-resource languages (NLP) [click for abstract and full list of authors]

Kurimo, Mikko (Aalto University); Getman, Yaroslav (Aalto University); Voskoboinik, Ekaterina (Aalto University); Al-Ghezi, Ragheb (Aalto University); Kallio, Heini (University of Jyväskylä); Kuronen, Mikko (University of Jyväskylä); von Zansen, Anna (University of Helsinki); Hilden, Raili (University of Helsinki); Kronholm, Sirkku (University of Jyväskylä); Huhta, Ari (University of Jyväskylä); Linden, Krister (University of Helsinki)

The development of large multilingual speech models provides the possibility to construct high-quality speech technology even for low-resource languages. In this paper, we present the speech data of L2 learners of Finnish and Finland Swedish that we have recently collected for training and evaluation of automatic speech recognition (ASR) and speaking assessment (ASA). It includes over 4000 recordings by over 300 students per language in short read-aloud and free-form tasks. The recordings have been manually transcribed and assessed for pronunciation, fluency, range, accuracy, task achievement, and a holistic proficiency level. We present also an ASR and ASA benchmarking setup we have constructed using this data and include results from our baseline systems built by fine-tuning a self-supervised multilingual model for the target language. In addition to benchmarking, our baseline system can be used by L2 students and teachers for online self-training and evaluation of oral proficiency.

40. Mietta Lennes (University of Helsinki, Kielipankki): Pitch distributions in a very large corpus of spontaneous Finnish speech (NLP) [click for abstract and full list of authors]

Lennes, Mietta (University of Helsinki); Toivola, Minnaleena (University of Helsinki)

Due to their physiological properties and individual habits, some people tend to speak in a high-pitched voice, whereas others use lower pitch. A number of studies have suggested that voice pitch differs between genders, languages and speakers of different ages. However, the findings have been partly contradictory and often based on small datasets. On the other hand, people constantly vary their pitch during speech production. The local pitch changes serve many tasks in spoken language and interaction. In order to model the use of pitch by different speakers, we need to understand the individual variation in voice range and to develop more reliable methods for analysing pitch in ways that reflect human auditory perception. The Donate Speech Corpus, available via the Language Bank of Finland for restricted use, provides a great resource for studying pitch variability. The corpus contains spontaneous speech from a large number of different speakers of Finnish as well as metadata that can be used for comparing groups of speakers representing different ages and genders. The present study provides a summary of the typical pitch of over 8000 speakers of Finnish in the Donate Speech Corpus.

41. Chengkun Li (University of Helsinki): PyVBMC: Efficient Bayesian inference in Python (ML), (UAI) [click for abstract and full list of authors]

Bobby Huggins; Chengkun Li; Marlon Tobaben; Mikko J. Aarnos; Luigi Acerbi

PyVBMC is a Python package supported by FCAI for posterior and model inference for black-box computational models. The software implements the Variational Bayesian Monte Carlo (VBMC) algorithm for efficient parameter estimation and model assessment when model evaluations are mildly-to-very expensive (e.g., a second or more) and/or noisy.

PyVBMC can be applied to any computational or statistical model with up to roughly 10-15 continuous parameters, with the only requirement that the user can provide a Python function that computes the target log likelihood of the model, or an approximation thereof (e.g., an estimate of the likelihood obtained via simulation or Monte Carlo methods). PyVBMC is particularly effective when the model takes more than about a second per evaluation, with dramatic speed-ups of 1-2 orders of magnitude when compared to traditional approximate inference methods.

Extensive benchmarks on both artificial test problems and a large number of real models from the computational sciences, particularly computational and cognitive neuroscience, show that VBMC generally and often vastly outperforms alternative methods for sample-efficient Bayesian inference, and is applicable to both exact and simulator-based models. PyVBMC brings this state-of-the-art inference algorithm to Python, along with an easy-to-use Pythonic interface for running the algorithm and manipulating and visualizing its results.

42. Rui Li (Aalto University): Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models (ML), (UAI) [click for abstract and full list of authors]

Li, Rui (Aalto University); John, ST (Aalto University); Solin, Arno (Aalto University)

Approximate inference in Gaussian process (GP) models with non-conjugate likelihoods gets entangled with the learning of the model hyperparameters. We improve hyperparameter learning in GP models and focus on the interplay between variational inference (VI) and the learning target. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, we show that a direct approximation of the marginal likelihood as in Expectation Propagation (EP) is a better learning objective for hyperparameter optimization. We design a hybrid training procedure to bring the best of both worlds: it leverages conjugate-computation VI for inference and uses an EP-like marginal likelihood approximation for hyperparameter learning. We compare VI, EP, Laplace approximation, and our proposed training procedure and empirically demonstrate the effectiveness of our proposal across a wide range of data sets.

43. Blerta Lindqvist (Aalto University): Symmetry Defense Against CNN Adversarial Perturbation Attacks (ML) [click for abstract and full list of authors]

Lindqvist, Blerta (Aalto University)

This paper uses symmetry to make Convolutional Neural Network classifiers (CNNs) robust against adversarial perturbation attacks. Such attacks add perturbation to original images to generate adversarial images that fool classifiers such as road sign classifiers of autonomous vehicles. Although symmetry is a pervasive aspect of the natural world, CNNs are unable to handle symmetry well. For example, a CNN can classify an image differently from its mirror image. For an adversarial image that is misclassified with a wrong label, the CNN's inability to handle symmetry means that a symmetric adversarial image can be classified differently from the wrong label. Furthermore, we find that the classification of a symmetric adversarial image reverts to the correct label. To classify an image when adversaries are unaware of the defense, we apply symmetry to the image and use the classification label of the symmetric image. To classify an image when adversaries are aware of the defense, we use mirror symmetry and pixel inversion symmetry to form a symmetry group. We apply all the group symmetries to the image and decide on the output label based on the agreement of any two of the classification labels of the symmetric images. Adaptive attacks fail because they need to rely on loss functions that use conflicting CNN output values for symmetric images. Without attack knowledge, the proposed symmetry defense succeeds against both gradient-based and random-search attacks, with up to near-default accuracies for ImageNet. The defense even improves the classification accuracy of original images.
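The group-based decision rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: images are assumed to be nested lists of 8-bit grey values, `classify` is a hypothetical stand-in for a CNN, and both the inclusion of the identity element and the tie-breaking rule (first label seen wins) are assumptions.

```python
from collections import Counter


def mirror(image):
    """Horizontal mirror: reverse every row."""
    return [row[::-1] for row in image]


def invert(image, max_value=255):
    """Pixel inversion: replace each value p with max_value - p."""
    return [[max_value - p for p in row] for row in image]


def symmetry_group(image):
    """Identity, mirror, inversion, and their composition."""
    return [image, mirror(image), invert(image), invert(mirror(image))]


def defended_label(image, classify):
    """Return a label on which at least two symmetric images agree.

    Returns None when no two labels agree. On a 2-2 split the label
    encountered first is returned (an assumption of this sketch).
    """
    labels = [classify(view) for view in symmetry_group(image)]
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None
```

Mirroring and inversion are both their own inverses and commute with each other, which is why the four images above close under composition.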

44. Zhi-Song Liu (Lappeenranta-Lahti University of Technology LUT): Name your style, text-guided artistic style transfer (CV) [click for abstract and full list of authors]

Wang, Li-Wen (The Hong Kong Polytechnic University); Siu, Wan-Chi (The Hong Kong Polytechnic University); Kalogeiton, Vicky (Ecole Polytechnique, IP Paris)

Image style transfer has attracted widespread attention in the past years. Despite its remarkable results, it requires additional style images available as references, making it less flexible and inconvenient. Using text is the most natural way to describe the style. Text can describe implicit abstract styles, like styles of specific artists or art movements. In this work, we propose a text-driven style transfer (TxST) that leverages advanced image-text encoders to control arbitrary style transfer. We introduce a contrastive training strategy to effectively extract style descriptions from the image-text model (i.e., CLIP), which aligns stylization with the text description. To this end, we also propose a novel cross-attention module to fuse style and content features. Finally, we achieve an arbitrary artist-aware style transfer to learn and transfer specific artistic characters such as Picasso, oil painting, or a rough sketch. Extensive experiments demonstrate that our approach outperforms the state-of-the-art methods. Moreover, it can mimic the styles of one or many artists to achieve attractive results, thus highlighting a promising future direction.

45. Qing Liu (CMVS, University of Oulu): Automated lesion segmentation in fundus images with many-to-many reassembly of features (MULT) [click for abstract and full list of authors]

Liu, Qing; Liu, Haotian; Ke, Wei; Liang, Yixiong

N/A

46. Eero Lumme (University of Helsinki): A Systems-Theoretic Approach for the Analysis of AI Ethics (HCI), (XAI) [click for abstract and full list of authors]

Lumme, Eero (University of Helsinki)

This poster explores the ways in which ideas, theories, and techniques from the realm of safety research can be utilized for the analysis of ethical aspects of artificial intelligence. Decades of dedicated effort have been invested in the domains of aviation, healthcare, and nuclear industry to enhance accident investigation and prevention methodologies. Safety-related issues can be considered as a subset of a significantly larger and more multidimensional group of ethical issues. Ultimately, both safety and ethical reflections are about avoiding unwanted consequences. A systems-theoretic causal model is one of the many theoretical approaches used in safety research. Systems-theoretic thinking that expands linear reasoning originates from engineering. By applying the systems-theoretic causal model it is possible to create causal mechanisms for explaining empirical sociotechnical phenomena. This makes it possible to perceive different types of interaction mechanisms at the system level and to compare different situations and system structures with each other. A potential benefit that systems-theoretic modelling of ethical problems offers is its ability to combine the terminology of engineering, moral philosophy and social sciences into the same framework. This allows for the development of new interdisciplinary research methods for the systematic analysis and design of ethical AI systems.

47. Ananth Mahadevan (University of Helsinki): Cost-Aware Retraining for Machine Learning (ML), (Other) [click for abstract and full list of authors]

Mahadevan, Ananth; Mathioudakis, Michael

Retraining a machine learning (ML) model is essential for maintaining its performance as the data change over time. However, retraining is also costly, as it typically requires re-processing the entire dataset. As a result, a trade-off arises: on the one hand, retraining an ML model too frequently incurs unnecessary computing costs; on the other hand, not retraining frequently enough leads to stale ML models and incurs a cost in loss of accuracy. To resolve this trade-off, we envision ML systems that make automated and cost-optimal decisions about when to retrain an ML model. In this work, we study the decision problem of whether to retrain or keep an existing ML model based on the data, the model, and the predictive queries answered by the model. Crucially, we consider the costs associated with each decision and aim to optimize the trade-off. Our main contribution is a Cost-Aware Retraining Algorithm, Cara, which optimizes the trade-off over streams of data and queries. To explore the performance of Cara, we first analyze synthetic datasets and demonstrate that Cara can adapt to different data drifts and retraining costs while performing similarly to an optimal retrospective algorithm. Subsequently, we experiment with real-world datasets and demonstrate that Cara has better accuracy than drift detection baselines while making fewer retraining decisions, thus incurring lower total costs.

48. Irene Martin (Tampere University): Training sound event detection with soft labels from crowdsourced annotations (ML) [click for abstract and full list of authors]

Martin-Morato, Irene, Tampere University; Harju, Manu, Tampere University; Ahokas, Paul, Tampere University; Mesaros, Annamaria, Tampere University

In this paper, we study the use of soft labels to train a system for sound event detection (SED). Soft labels can result from annotations which account for human uncertainty about categories, or emerge as a natural representation of multiple opinions in annotation. Converting annotations to hard labels results in unambiguous categories for training, at the cost of losing the details about the label distribution. This work investigates how soft labels can be used, and what benefits they bring in training a SED system. The results show that the system is capable of learning information about the activity of the sounds which is reflected in the soft labels and is able to detect sounds that are missed in the typical binary target training setup. We also release a new dataset produced through crowdsourcing, containing temporally strong labels for sound events in real-life recordings, with both soft and hard labels.
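To make the difference between soft and hard targets concrete, here is a minimal sketch. The abstract does not state the loss function used in the paper, so binary cross-entropy, the 0.5 threshold, and the function names are assumptions chosen purely for illustration.

```python
import math


def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy; targets may be soft values in [0, 1]."""
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(targets)


def harden(soft_labels, threshold=0.5):
    """Convert soft labels to hard 0/1 labels with a fixed threshold."""
    return [1.0 if s >= threshold else 0.0 for s in soft_labels]
```

With soft targets such as `[0.9, 0.4, 0.1]`, hardening yields `[1.0, 0.0, 0.0]`, so the information that the second event was reported by a sizeable share of annotators is lost.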

49. Anssi Moisio (Aalto University): Evaluating Morphological Generalisation in Machine Translation by Distribution-Based Compositionality Assessment (NLP) [click for abstract and full list of authors]

Moisio, Anssi (Aalto University); Creutz, Mathias (University of Helsinki); Kurimo, Mikko (Aalto University)

Compositional generalisation refers to the ability to understand and generate a potentially infinite number of novel meanings using a finite group of known primitives and a set of rules to combine them. The degree to which artificial neural networks can learn this ability is an open question. Recently, some evaluation methods and benchmarks have been proposed to test compositional generalisation, but not many have focused on the morphological level of language. We propose an application of the previously developed distribution-based compositionality assessment method to assess morphological generalisation in NLP tasks, such as machine translation or paraphrase detection. We demonstrate the use of our method by comparing translation systems with different BPE vocabulary sizes. The evaluation method we propose suggests that small vocabularies help with morphological generalisation in NMT.

50. Hee-Seung Moon (Aalto University): Real-time 3D Target Inference via Biomechanical Simulation (HCI) [click for abstract and full list of authors]

Moon, Hee-Seung (Aalto University); Liao, Yi-Chi (Aalto University); Li, Chenyu (Aalto University); Lee, Byungjoo (Yonsei University, Korea); Oulasvirta, Antti (Aalto University)

Selecting a target in a 3D environment is often challenging, especially with small/distant targets or when sensor noise is high. To facilitate selection, target-inference methods must be accurate, fast, and sensitive to variability in human movement. However, traditional data-free approaches fall short in accuracy since they ignore variability. While data-driven solutions achieve higher accuracy, they rely on extensive human datasets so prove costly, time-consuming, and transfer poorly. In this paper, we propose a novel approach that leverages biomechanical simulation to produce synthetic motion data, capturing a variety of movement-related factors, e.g., limb configurations and motor noise. Then, an inference model is trained with only the simulated data. This simulation-based approach improves transfer and lowers cost; developers can easily produce variety-rich data in large quantities for different scenarios. Our method accurately infers intended targets in 3D pointing conditions within milliseconds, reducing users' target-selection error by 71% and completion time by 35%.

51. Manjunath Mulimani (Tampere University): Incremental Learning of Acoustic Scenes and Sound Events (CI), (CSO), (ML), (Other) [click for abstract and full list of authors]

Mulimani, Manjunath (Tampere University); Mesaros, Annamaria (Tampere University)

N/A

52. Rajdeep Kumar Nath (VTT, Finland): Foundation Modeling Approach for the Design of Multi-Functional ECG Smart Systems (ML)

53. Elio Nushi (University of Helsinki): A Bayesian method for time reannotation of transcriptomics data (ML), (Other) [click for abstract and full list of authors]

Nushi, Elio (University of Helsinki); Douillard, Francois (University of Helsinki); Selby, Katja (University of Helsinki); Honkela, Antti (University of Helsinki); Lindström, Miia (University of Helsinki)

Background: Transcriptomics experiments are often performed to capture changes in gene expression over time. However, time annotations may be missing, imprecise, or not reflect the same biological phase. Assigning accurate time points to these experiments by using a reference model is crucial for identifying differentially expressed genes and for understanding gene regulatory networks in order to elucidate the studied organism's physiology and life cycle. Method: In this study, we propose a Bayesian approach based on Gaussian process regression modeling to address this challenge. We employ this method to perform time annotation in legacy Clostridium botulinum microarray experiments, which were initially annotated based on growth phases, utilizing recently collected RNA-Seq time series data comprising multiple replicates as reference. We also test the performance of the method on RNA-Seq data by using the experiments collected on even time points as the training set and the rest as the validation set. Furthermore, we assess the method's robustness to measurement errors by applying it to synthetically generated data with varying levels of noise. Results: By reassigning the growth phases to the microarray experiments based on the new time annotations, incorporating heuristic knowledge about the expected time interval for each growth phase, we significantly enhance the description of the microarray data. Notably, the improved annotation allows for clear separation of experiments belonging to different growth phases, as demonstrated by principal component analysis (PCA). Consequently, we successfully identify experiments that appear likely to have been initially assigned to incorrect growth phases.

54. Bruno Oliveira Cattelan (University of Helsinki): A Machine Learning Approach for Fast Binding Energy Estimation of Vacancies to Screw Dislocations (ML), (UAI) [click for abstract and full list of authors]

Oliveira Cattelan, Bruno (University of Helsinki); Lindblad, Victor (University of Helsinki); Granberg, Fredric (University of Helsinki)

Materials research plays an important role in fusion, from deciding which material can withstand the high temperatures that divertor components face to choosing the best wall material to minimise damage and dust in case of irradiation effects. We focus on the binding energies of vacancy defects to screw dislocations. Traditional techniques such as density functional theory can give the answer for some configurations. However, in order to study the binding energy when vacancies are present, a combinatorial number of cases needs to be analysed, which quickly becomes infeasible.

To combat this issue, we present a neural network solution. From a subset of cases we can train a model, which in turn can predict the energy in a fraction of the time. We make use of state of the art techniques such as Dropout and BatchNormalization to ensure the best possible performance of our model. However, due to the nature of the problem we have to deal with a large amount of uncertainty. This is addressed by using uncertainty quantification techniques such as deep ensembles and mixture density networks. We show that our solution has a mean validation error around 5.7%, with whole datasets predicted in a matter of minutes.

55. Sachith Pai (University of Helsinki): WaZI: A Learned and Workload-aware Z-Index (IRF), (MULT) [click for abstract and full list of authors]

Pai, Sachith (University of Helsinki); Mathioudakis, Michael (University of Helsinki); Wang, Yanhao (East China Normal University)

Learned indexes fit machine learning (ML) models to the data and use them to make query operations more time- and space-efficient. Recent works propose using learned spatial indexes to improve spatial query performance by optimizing the storage layout or internal search structures according to the data distribution. However, only a few learned indexes exploit the query workload distribution to enhance their performance. In addition, building learned spatial indexes is often costly on large datasets due to the inefficiency of training ML models.

We present WaZI, a learned and workload-aware variant of a Z-index, which jointly optimizes storage layout and search structures, as a viable solution for the above challenges of spatial indexing. Specifically, we first formulate a cost function to measure the performance of a Z-index on a dataset for a range-query workload. Then, we optimize the Z-index structure by minimizing the cost function through adaptive partitioning and ordering for index construction. Moreover, we design a novel page-skipping mechanism to improve its query performance by reducing access to irrelevant data pages. Our extensive experiments show that our WaZI index improves range query time by 40% on average over the baselines, while always performing better or comparably to state-of-the-art spatial indexes. Additionally, WaZI maintains good point query performance while providing favourable construction time and index size tradeoffs.
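For readers unfamiliar with the underlying structure, the sketch below shows the plain Z-order (Morton) curve that a Z-index is built on: the bits of the two coordinates are interleaved into one sortable key. It does not implement WaZI's adaptive partitioning, ordering, or page skipping; the function names and the 16-bit default are assumptions for illustration.

```python
def interleave_bits(x, y, bits=16):
    """Morton code of non-negative integers x and y.

    Bit i of x goes to position 2*i and bit i of y to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def z_order_sort(points, bits=16):
    """Sort 2D integer points along the Z-order curve."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1], bits))
```

Sorting by this key places points that are close in space close together in storage for most of the curve, which is what makes range queries over contiguous pages possible.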

56. Shantipriya Parida (Silo AI): LLM Fine-Tuning for Low Resource Language: A Case Study on Odia (NLP) [click for abstract and full list of authors]

Parida, Shantipriya (Silo AI, Finland); Kohli, Guneet Singh (Thapar University, India); Sekhar, Sambit (Odia Generative AI, India); Shahid, SK (Silicon Institute of Technology, India); Pradhan, Subham (Silicon Institute of Technology, India); Asif, Aisha (KIIT University, India); Nair, Nipun B (Amrita Vishwa Vidyapeetham, India); Dash, Satya Ranjan (KIIT University, India)

Large Language Models (LLMs) are significantly impacting the AI community, and the advent of ChatGPT, Bard, and GPT-4 has led to rethinking the possibilities of artificial general intelligence (AGI). However, most LLMs are trained in English and other high-resource languages, resulting in the unavailability of LLMs and their related technologies and services for many low-resource languages. In India, where only 10-15% of the population is proficient in English, the need for LLMs adapted to regional languages becomes crucial. In this paper, we propose an instruction-tuned LLaMA model for the low-resource Odia language by utilizing the Odia instruction set. Also, we quantized the model for easy inference on low-resource setups. We proposed a benchmarked dataset for Odia LLM evaluation. The developed instruction-tuned Odia LLM will be available freely for research and non-commercial purposes.

57. Nhan Phan (Aalto University): CaptainA - A mobile app for practising Finnish pronunciation (HCI), (ML), (MULT), (NLP) [click for abstract and full list of authors]

Phan, Nhan (Aalto University); Grósz, Tamás (Aalto University); Kurimo, Mikko (Aalto University)

Learning a new language is often difficult, especially practising it independently. The main issue with self-study is the absence of accurate feedback from a teacher, which would enable students to learn unfamiliar languages. In recent years, with advances in Artificial Intelligence and Automatic Speech Recognition, it has become possible to build applications that can provide valuable feedback on the users' pronunciation. In this paper, we introduce the CaptainA app explicitly developed to aid students in practising their Finnish pronunciation on handheld devices. Our app is a valuable resource for immigrants who are busy with school or work, and it helps them integrate faster into society. Furthermore, by providing this service for L2 speakers and collecting their data, we can continuously improve our system and provide better aid in the future.

The primary function of the CaptainA app is based on our mispronunciation detection and diagnosis model. Furthermore, we utilise various deep learning technologies, including Finnish text-to-speech synthesis, large language model, and image generation, to enhance the app's functionalities. As a result, CaptainA demonstrates the broader application of Artificial Intelligence in learning the Finnish language.

58. Antti Pihlajamäki (University of Jyväskylä): Graphs and Kernelized Learning Applied to Interactions of Hydrogen with Doped Gold Nanoparticle Electrocatalysts (ML), (MULT) [click for abstract and full list of authors]

Pihlajamäki, Antti (Department of Physics, Nanoscience Center, University of Jyväskylä); Malola, Sami (Department of Physics, Nanoscience Center, University of Jyväskylä); Kärkkäinen, Tommi (Faculty of Information Technology, University of Jyväskylä); Häkkinen, Hannu (Departments of Physics and Chemistry, Nanoscience Center, University of Jyväskylä)

Hydrogen gas is produced via the hydrogen evolution reaction (HER), which requires suitable catalysts to lower the energy barrier. The first step of any catalytic reaction is to adsorb the target atom or molecule onto the catalyst. Understanding this adsorption enables one to design materials that have optimal properties for the catalytic reaction. Here, we studied how a hydrogen atom interacts with doped 25-atom nanoparticles using graph representations and various kernel-based machine learning (ML) methods. The methods were trained to predict interaction energies based on state-of-the-art density functional theory calculations, and they managed to reach an RMSE as low as 0.1 eV. The method construction allowed us to further analyze the importance of different features/properties and the range of interactions meaningful for the predictions.

59. Dejan Porjazovski (Aalto University): Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference (NLP) [click for abstract and full list of authors]

Porjazovski, Dejan (Aalto University); Getman, Yaroslav (Aalto University); Grosz, Tamas (Aalto University); Kurimo, Mikko (Aalto University)

Large pre-trained models are essential in paralinguistic systems, demonstrating effectiveness in tasks like emotion recognition and stuttering detection. In this paper, we employ large pre-trained models for the ACM Multimedia Computational Paralinguistics Challenge, addressing the Requests and Emotion Share tasks. We explore audio-only and hybrid solutions leveraging audio and text modalities. Our empirical results consistently show the superiority of the hybrid approaches over the audio-only models. Moreover, we introduce a Bayesian layer as an alternative to the standard linear output layer. The multimodal fusion approach achieves an 85.4% UAR on HC-Requests and 60.2% on HC-Complaints. The ensemble model for the Emotion Share task yields the best 𝜌 value of .614. The Bayesian wav2vec2 approach, explored in this study, allows us to easily build ensembles, at the cost of fine-tuning only one model. Moreover, we can have usable confidence values instead of the usual overconfident posterior probabilities.
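The UAR figures above refer to unweighted average recall, that is, the mean of the per-class recalls, which treats every class equally regardless of its size. A minimal sketch follows; the function name and the toy labels in the usage note are assumptions for illustration.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Indices of all examples whose true class is c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)
```

For an imbalanced toy case with four examples of class 0 and one of class 1, a model that always predicts class 0 reaches 80% accuracy but only 50% UAR, which is why the metric is popular for paralinguistic tasks with skewed class distributions.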

60. Dejan Porjazovski (Aalto University): TOPIC IDENTIFICATION FOR SPONTANEOUS SPEECH: ENRICHING AUDIO FEATURES WITH EMBEDDED LINGUISTIC INFORMATION (NLP) [click for abstract and full list of authors]

Porjazovski, Dejan (Aalto University); Grosz, Tamas (Aalto University); Kurimo, Mikko (Aalto University)

Traditional topic identification solutions from audio rely on an automatic speech recognition system (ASR) to produce transcripts used as input to a text-based model. These approaches work well in high-resource scenarios, where there are sufficient data to train both components of the pipeline. However, in low-resource situations, the ASR system, even if available, produces low-quality transcripts, leading to a bad text-based classifier. Moreover, spontaneous speech containing hesitations can further degrade the performance of the ASR model. In this paper, we investigate alternatives to the standard text-only solutions by comparing audio-only and hybrid techniques of jointly utilising text and audio features. The models evaluated on spontaneous Finnish speech demonstrate that purely audio-based solutions are a viable option when ASR components are not available, while the hybrid multi-modal solutions achieve the best results.

61. Onur Poyraz (Aalto University): Mixture of Coupled HMMs: Robust Modeling of Multivariate Healthcare Time Series (ML), (MULT), (XAI), (UAI) [click for abstract and full list of authors]

Poyraz, Onur (Aalto University); Marttinen, Pekka (Aalto University)

Analysis of multivariate healthcare time series data is inherently challenging: irregular sampling, noisy and missing values, and heterogeneous patient groups with different dynamics violating exchangeability. In addition, interpretability and quantification of uncertainty are critically important. Here, we propose a novel class of models, a mixture of coupled hidden Markov models (M-CHMM), and demonstrate how it elegantly overcomes these challenges. To make the model learning feasible, we derive two algorithms to sample the sequences of the latent variables in the CHMM: samplers based on (i) particle filtering and (ii) factorized approximation. Compared to existing inference methods, our algorithms are computationally tractable, improve mixing, and allow for likelihood estimation, which is necessary to learn the mixture model. Experiments on challenging real-world epidemiological and semi-synthetic data demonstrate the advantages of the M-CHMM: improved data fit, capacity to efficiently handle missing and noisy measurements, improved prediction accuracy, and ability to identify interpretable subsets in the data.

62. Aini Putkonen (Aalto University): Modeling Touch-based Menu Selection Performance of Blind Users via Reinforcement Learning (HAI), (HCI) [click for abstract and full list of authors]

Li, Zhi (Stony Brook University); Ko, Yu-Jung (Stony Brook University); Putkonen, Aini (Aalto University); Feiz, Shirin (Stony Brook University); Ashok, Vikas (Old Dominion University); Ramakrishnan, IV (Stony Brook University); Oulasvirta, Antti (Aalto University); Bi, Xiaojun (Stony Brook University)

N/A

63. Mikko Raatikainen (University of Helsinki): Systematic Literature Review on Cost-efficient Deep Learning (ML), (MULT), (Other) [click for abstract and full list of authors]

Klemetti, Antti (University of Helsinki); Raatikainen, Mikko (University of Helsinki); Myllyaho, Lalli (University of Helsinki); Mikkonen, Tommi (University of Jyväskylä); Nurminen, Jukka K. (University of Helsinki)

Cloud computing and deep learning, the recent trends in the software industry, have enabled small companies to scale their business up rapidly. However, this growth is not without a cost: deep learning models are related to the heaviest workloads in cloud data centers. When the business grows, the monetary cost of deep learning in the cloud also grows fast. Deep learning practitioners should be prepared and equipped to limit the growing cost. We emphasize monetary cost instead of computational cost, although often the same methods decrease both types of cost. We performed a systematic literature review on the methods to control the cost of deep learning. Our library search resulted in 16,066 papers from three article databases: IEEE Xplore, ACM Digital Library, and Scopus. We narrowed them down to 112 papers that we categorized and summarized. We found that: 1) Optimizing inference has raised more interest than optimizing training. Widely used deep learning libraries already support inference optimization methods, such as quantization, pruning, and teacher-student. 2) The research has been centered around image inputs, and there seems to be a research gap for other types of inputs. 3) The research has been hardware-oriented, and the most typical approach to control the cost of deep learning is based on algorithm-hardware co-design. 4) Offloading some of the processing to client devices is gaining interest and can potentially reduce the monetary cost of deep learning.

64. Ossi Räisä (University of Helsinki): On Consistent Bayesian Inference from Synthetic Data (XAI), (UAI) [click for abstract and full list of authors]

Räisä, Ossi (University of Helsinki); Jälkö, Joonas (University of Helsinki); Honkela, Antti (University of Helsinki)

Generating synthetic data, with or without differential privacy, has attracted significant attention as a potential solution to the dilemma between making data easily available and protecting the privacy of data subjects. Several works have shown that consistency of downstream analyses from synthetic data, including accurate uncertainty estimation, requires accounting for the synthetic data generation. There are very few methods of doing so, most of them for frequentist analysis. In this paper, we study how to perform consistent Bayesian inference from synthetic data. We prove that mixing posterior samples obtained separately from multiple large synthetic datasets converges to the posterior of the downstream analysis under standard regularity conditions when the analyst's model is compatible with the data provider's model. We show experimentally that this works in practice, unlocking consistent Bayesian inference from synthetic data while reusing existing downstream analysis methods.
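The mixing step described above can be sketched as follows. This is only an illustration of pooling posterior samples across synthetic datasets: the conjugate normal-mean model with known noise variance, the prior settings, and all function names are assumptions, and the code does not check the regularity or compatibility conditions that the paper's result relies on.

```python
import random


def posterior_samples_normal_mean(data, n_samples, rng,
                                  prior_mean=0.0, prior_var=100.0,
                                  noise_var=1.0):
    """Draw posterior samples of a normal mean with known noise variance."""
    n = len(data)
    # Standard conjugate update for a normal prior and normal likelihood.
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return [rng.gauss(post_mean, post_var ** 0.5) for _ in range(n_samples)]


def mixed_posterior(synthetic_datasets, n_samples_each, seed=0):
    """Pool equally many posterior samples from each synthetic dataset."""
    rng = random.Random(seed)
    pooled = []
    for data in synthetic_datasets:
        pooled.extend(posterior_samples_normal_mean(data, n_samples_each, rng))
    return pooled
```

The downstream analysis is run unchanged on each synthetic dataset, and only the resulting samples are combined, which mirrors the abstract's point about reusing existing analysis methods.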

65. Kalle Raunio (VTT): Feature Estimation for Punching Tool Wear at the Edge (IRF), (ML), (XAI), (Other) [click for abstract and full list of authors]

Junttila, Jukka; Raunio, Kalle; Kokkonen, Petteri; Saarela, Olli

As a fast and inexpensive machining method applicable for creating a wide range of shapes and producing large batches, sheet metal punching is widely used e.g., in automotive, aerospace, electronics, and construction industries. A significant downside of sheet metal punching is the punching tool wear in use. A worn punch tool may impact the quality of the end product by causing imperfections and reduce the efficiency of the manufacturing process through increased scrap and by slowing down the production. Effective monitoring of punching tool wear is therefore essential for an efficient and cost-effective production of high-quality parts. The monitoring can be based on acceleration measurement which produces large amounts of raw data, making edge processing ideal as only the indication of the tool condition needs to be sent forward for decision support. Classification models for tool wear identification were built and compared in this study. The models are based on measured acceleration data. Two different open-source methods for time series feature extraction, namely TSFEL and MiniRocket, were tested and the classification results based on them compared. All methods used for building the models are computationally light and therefore applicable for real-time data processing at the edge. According to the results the MiniRocket algorithm is suitable for the task and superior compared to the TSFEL method. The classification accuracies based on the MiniRocket features are at best over 96.5 % and at worst around 84 %, whereas the corresponding accuracies are between 35 and 56 % for TSFEL feature based models. The use of the MiniRocket algorithm in building a model for punch tool monitoring shows very promising results. However, the dataset used was very limited. Therefore, further investigation is required based on an ampler dataset.

66. Siiri Rautio (University of Helsinki): EIT reconstruction using virtual X-rays and machine learning (ML) [click for abstract and full list of authors]

Rautio, Siiri (University of Helsinki); Alsaker, Melody (Gonzaga University); Moura, Fernando (Federal University of ABC); Agnelli, Juan Pablo (National University of Cordoba); Murthy, Rashmi (University of Cambridge); Lassas, Matti (University of Helsinki); Mueller, Jennifer (Colorado State University); Siltanen, Samuli (University of Helsinki)

In electrical impedance tomography (EIT), the aim is to recover the unknown conductivity of a target by injecting currents and measuring boundary voltages through electrodes. We introduce a new reconstruction algorithm for EIT, which provides a connection between EIT and traditional X-ray tomography, based on the idea of "virtual X-rays". We divide the exponentially ill-posed and nonlinear inverse problem of EIT into separate steps. We start by mathematically calculating so-called virtual X-ray projection data from the measurement data. Then we perform explicit algebraic operations and one-dimensional integration, ending up with a blurry and nonlinearly transformed Radon sinogram. We use a neural network to learn the nonlinear deconvolution-like operation. Finally, we can compute a reconstruction of the conductivity using the inverse Radon transform. We demonstrate the method with both simulated and experimental data.

67. Anna Elisabeth Riha (Aalto University): Understanding multiple models in Bayesian workflows with multiverse analysis and iterative filtering (HAI), (ML), (MULT), (XAI), (UAI) [click for abstract and full list of authors]

Riha, Anna Elisabeth (Aalto University); Siccha, Nikolas (Aalto University); Oulasvirta, Antti (Aalto University); Vehtari, Aki (Aalto University)

When using statistical models and data to analyse a phenomenon of interest, different possible modelling choices induce explicit or implicit sets of candidate models. Missing assessment of alternative models or opaque modelling choices can lead to a chosen model that is at risk of being unrepresentative of the true complexity of the underlying processes. Multiverse analysis is a framework for creating multiple models at once based on combinations of sensible modelling choices with the aim of increasing transparency in model building. Models in a multiverse can appear equally plausible without additional assessment of the validity and relevance of models. This can overwhelm and lead to misinterpretation, especially in scenarios with large numbers of models. Starting from multiverse analysis and advances in Bayesian modelling workflows, this work introduces iterative filtering as a procedure to support model building. The presented approach makes it possible to transparently assess the validity of models, compare modelling choices and conclusions across models, and ultimately identify smaller sets of useful models. To further illustrate the suggested procedures, we summarise different scenarios encountered in real-world examples and show how a joint consideration of predictive performance and computational checks can support model building.

68. Severi Rissanen (Aalto University): Improving Discrete Diffusion Models via Structured Preferential Generation (ML), (NLP) [click for abstract and full list of authors]

Rissanen, Severi, Aalto University; Heinonen, Markus, Aalto University; Solin, Arno, Aalto University

In the domains of image and audio, diffusion models have shown impressive performance. However, their application to discrete data types, such as language, has often been suboptimal compared to autoregressive generative models. This paper tackles the challenge of improving discrete diffusion models by introducing a structured forward process that leverages the inherent information hierarchy in discrete categories, such as words in text. Our approach biases the generative process to produce certain categories before others, resulting in a notable improvement in log-likelihood scores. This work paves the way for more advanced discrete diffusion models with potentially significant enhancements in performance.

69. Frankie Robertson (University of Jyväskylä): ComputerAdaptiveTesting.jl: Fast, flexible, extensible CATs in Julia (CI), (HCI), (MULT), (UAI)

70. David Rosson (University of Helsinki): Reception Reader: Exploring Text Reuse in Early Modern British Publications (HCI), (IRF), (MULT), (NLP) [click for abstract and full list of authors]

Rosson, David; Mäkelä, Eetu; Vaara, Ville; Mahadevan, Ananth; Ryan, Yann; Tolonen, Mikko

The Computational History research group applied a modified version of the Basic Local Alignment Search Tool (BLAST) algorithm, a technique originally designed for comparing protein sequences, to the detection of textual overlaps in large corpora of 18th-century documents, and extracted a vast number of connections between authors and books, providing unprecedented insight into the intellectual history of the Enlightenment period. The poster introduces the research infrastructure that helps humanities scholars explore this large dataset.

71. Seppo Ruotsalainen (LUT, DIF, Start-up Foundation): Status of AI Deployment in Enterprises - A Systematic Literature Research (MULT) [click for abstract and full list of authors]

Ruotsalainen, Seppo (LUT); Hokkanen, Päivi (LUT); Porras, Jari (LUT); Kuivalainen, Olli (LUT)

Artificial Intelligence (AI) is gaining growing attention within companies, research, governments, societies, and media. Despite vast and growing investments, only a tiny fraction of organizations report achieving tangible results. Management needs to understand the whole deployment process. In other words, how to set objectives, organize resources, measure results, and learn to gain concrete payback for the time, money, and effort spent. To fill this gap and establish a baseline for companies, we conducted a Systematic Literature Research (SLR) on the status of AI deployment in enterprises. Following the SLR methodology, we searched, screened, analyzed, and synthesized 92 peer-reviewed articles for objectives, approaches, results, and learnings. The present research is, to our knowledge, the first one concerning concrete objectives and results of AI deployment in enterprises. Detailed analysis of the articles confirms that AI is still at the pre-deployment stage in companies. Numerous pre-deployment projects are going on in various industries and company functions with the potential to leverage in the future. Developing the capability to deploy AI in companies is much more challenging and takes more time, concerted effort, and learning than generally thought. Through the results and the developed conceptual framework (the Pre-deployment Circle), this research increases understanding of the deployment status and emphasizes the pre-deployment stages' importance from theoretical and practical viewpoints. Firms, and other organizations, can use the framework, the results, and the information in the appendices to generate ideas, compare their approaches to the reviewed literature, and set realistic objectives to achieve concrete results in their deployment efforts. More real-life case research is needed, existing adoption theories verified, and potential new ones formulated to deploy AI knowledgeably for business and stakeholder benefits.

Keywords: Artificial Intelligence, Strategy, Deployment, Objectives, Results, Learnings

72. Fabrice Saffre (VTT): Cooperative Exploration Strategies for a Swarm of Autonomous Robots with Limited Communication (MAS), (MULT), (PLAN), (ROB) [click for abstract and full list of authors]

Saffre, Fabrice (VTT); Hildmann, Hanno (TNO); Karvonen, Hannu (VTT)

The problem of finding an optimal cooperative search or exploration pattern for a group of mobile robots has been the subject of much research over several decades and is related (but not identical) to the classical multiple traveling salesmen (mTSP) and multiple vehicle routing (mVRP) problems. However, strong assumptions tend to be made about several key parameters (e.g., single point of origin) and availability of accurate information before or during exploration. A centralised command structure is also often taken for granted. Such a well-behaved problem, characterised by exhaustive information about all the relevant parameters and constraints, cannot always be expected (for instance in a disaster relief effort or in a military operation) leaving autonomous local decision-making based on incomplete information as the only available option. We investigate the performance of a swarm of autonomous robots using a simple, bio-inspired combination of biased random walk and individual memory in an exploration scenario and compare it with that of an optimal search pattern (contiguous lawnmowers). We then augment the purely local decision-making function with an ability to share information opportunistically between co-located agents. The performance increase from this limited cooperative ability is investigated for variable population densities (or information percolation rate).
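The combination of random walk and individual memory described above can be illustrated with a toy sketch. This is not the authors' model: the grid world, the function name `explore`, and the rule "prefer unvisited neighbouring cells, otherwise move at random" are all assumptions chosen only to show the idea for a single agent.

```python
import random

def explore(width, height, steps, seed=0):
    """Single agent on a width x height grid (hypothetical setup).

    The agent remembers the cells it has visited and prefers to step
    onto a neighbouring cell it has not seen yet; if every neighbour
    is already known, it falls back to a uniform random step.
    Assumes the grid has at least two cells.
    """
    rng = random.Random(seed)
    pos = (0, 0)
    visited = {pos}  # the agent's individual memory
    for _ in range(steps):
        x, y = pos
        neighbours = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < width and 0 <= y + dy < height
        ]
        fresh = [cell for cell in neighbours if cell not in visited]
        pos = rng.choice(fresh or neighbours)  # bias towards unexplored cells
        visited.add(pos)
    return visited

visited_cells = explore(5, 5, 200)
coverage = len(visited_cells) / 25  # fraction of the grid explored
```

Sharing information between co-located agents could be sketched, under the same assumptions, as taking the union of their `visited` sets whenever two agents occupy the same cell.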

73. Fatemeh Sarhaddi (University of Turku): Maternal Social Loneliness Detection Using Passive Sensing Through Continuous Monitoring in Everyday Settings (MULT) [click for abstract and full list of authors]

Sarhaddi, Fatemeh, University of Turku; Azimi, Iman, University of California, Irvine; Axelin, Anna, University of Turku; Niela-Vilen, Hannakaisa, University of Turku; Liljeberg, Pasi, University of Turku; Rahmani, Amir M., University of California, Irvine

Maternal loneliness is associated with adverse physical and mental health outcomes for both the mother and her child. Detecting maternal loneliness noninvasively through wearable devices and passive sensing provides opportunities to prevent or reduce the impact of loneliness on the health and well-being of the mother and her child. In this study, we aim to use objective health data collected passively by smartwatches to predict maternal (social) loneliness during pregnancy and the postpartum period and identify the important objective physiological parameters in loneliness detection. Our results show the potential benefit and feasibility of using passive sensing with a smartwatch to predict maternal loneliness. Our developed machine learning models achieved a high F1-score for loneliness prediction. We also show that intensity of activity, activity pattern, and resting HR and HRV are good predictors of loneliness. These results indicate the intervention opportunities made available by wearable devices and predictive models to improve maternal well-being through early detection of loneliness.

74. Rafael Savvides (University of Helsinki): Model selection with bootstrap validation (ML) [click for abstract and full list of authors]

Savvides, Rafael (University of Helsinki); Mäkelä, Jarmo (CSC); Puolamäki, Kai (University of Helsinki)

Model selection is one of the most central tasks in supervised learning. Validation set methods are the standard way to accomplish this task: models are trained on training data, and the model with the smallest loss on the validation data is selected. However, it is generally not obvious how much validation data is required to make a reliable selection, which is essential when labeled data are scarce or expensive. We propose a bootstrap-based algorithm, bootstrap validation (BSV), that uses the bootstrap to adjust the validation set size and to find the best-performing model within a tolerance parameter specified by the user. We find that BSV works well in practice and can be used as a drop-in replacement for validation set methods or k-fold cross-validation. The main advantage of BSV is that less validation data is typically needed, so more data can be used to train the model, resulting in better approximations and efficient use of validation data.
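The general idea of bootstrapping a validation set can be sketched as follows. This is not the authors' BSV algorithm, which also adjusts the validation set size and uses a tolerance parameter; the function name `bootstrap_win_rates` and the toy losses are assumptions for illustration. The sketch resamples per-example validation losses with replacement and records how often each candidate model attains the lowest mean loss.

```python
import random

def bootstrap_win_rates(losses, n_boot=1000, seed=0):
    """losses: dict mapping model name -> list of per-example validation
    losses (all lists the same length, same example order).

    Returns a dict mapping model name -> fraction of bootstrap resamples
    in which that model had the lowest mean validation loss.
    """
    rng = random.Random(seed)
    names = list(losses)
    n = len(losses[names[0]])
    wins = {name: 0 for name in names}
    for _ in range(n_boot):
        # Resample validation example indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        means = {name: sum(losses[name][i] for i in idx) / n for name in names}
        best = min(names, key=lambda name: means[name])
        wins[best] += 1
    return {name: wins[name] / n_boot for name in names}

# Toy per-example losses (hypothetical numbers).
toy_losses = {
    "model_a": [0.10, 0.20, 0.15, 0.12, 0.18, 0.11],
    "model_b": [0.50, 0.60, 0.55, 0.52, 0.58, 0.51],
}
rates = bootstrap_win_rates(toy_losses)
```

A win rate close to 1 for one model suggests the current validation set is already large enough to separate the candidates; win rates near 0.5 suggest that more validation data, or a looser tolerance, would be needed.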

75. Nitin Sawhney (Aalto University): Role of Regulatory Sandboxes and MLOps for AI-Enabled Public Sector Services (HAI) [click for abstract and full list of authors]

Gonzalez Torres, Ana Paula

This paper discusses how innovations in public sector AI-based services must comply with the Artificial Intelligence Act (AI Act) regulatory frameworks while enabling experimentation and participation of diverse stakeholders throughout the Artificial Intelligence (AI) lifecycle. The paper examines the implications of the emerging regulation, AI regulatory sandboxes and Machine Learning Operations (MLOps) as tools that facilitate compliance while enabling co-learning and active participation of multiple stakeholders. We propose a framework that fosters experimentation with automation pipelines and continuous monitoring for the deployment of future public sector AI-based services in a regulatory-compliant and technically innovative manner. AI regulatory sandboxes can be beneficial as a space for contained experimentation that goes beyond regulatory considerations to specific experimentation with the implementation of ML frameworks. While the paper presents a framework based on emerging regulations, tools and practices pertaining to the responsible use of AI, this must be validated through pilot experimentation with public and private stakeholders and regulators in different areas of high-risk AI-based services.

76. Aidan Scannell (Aalto University): Neural Networks as Dual Gaussian Processes for Sequential Learning (ML), (UAI) [click for abstract and full list of authors]

Aidan Scannell, Riccardo Mereu, Paul Chang, Ella Tamir, Joni Pajarinen, Arno Solin

Deep neural networks are known to lack uncertainty estimates, struggle to incorporate new data, and suffer from catastrophic forgetting. We present a method that mitigates these issues by converting neural networks from weight-space to function-space, via a dual parameterisation. Importantly, the dual parameterisation enables us to formulate a sparse GP that captures the joint distribution over the entire data set. This offers a compact and principled way of capturing uncertainty and enables us to incorporate new data without retraining whilst retaining predictive performance. We demonstrate the proposed approach for quantifying uncertainty in supervised learning, maintaining an expressive functional representation for continual learning and uncertainty-guided exploration for model-based reinforcement learning.

77. Sumita Sharma (University of Oulu): Age against the machine: Youth AI Ambassadors promoting Critical AI Literacy (HCI) [click for abstract and full list of authors]

Sharma, Sumita; Durall Gazulla, Eva; Hirvonen, Noora; Kinnula, Marianne; Huttunen, Aira; Lao, Yucong; Niaz, Yusra; Norouzi, Behnaz; Nygård, Tuula; Taskinen, Erja; Alamettälä, Tuulikki; Haarjärvi, Tuure; Hartikainen, Heidi; Jylhä, Ville; Ventä-Olkkonen, Leena. University of Oulu, Finland

As we interact with Artificial Intelligence (AI) in our everyday lives, there are growing concerns regarding its potential for discrimination, bias, and harm and the lack of diversity, inclusiveness, and accessibility to mitigate such harms. It is imperative to critically discuss the ethical and societal implications and impact of AI systems, especially with the youth, who grow up with these tools and will enter a workforce rampant with AI-based tools. To this end, we developed and conducted a two-week AI Ambassadors program in Oulu in June 2023 with the aim of creating a sustainable, scalable, and social initiative for imparting critical AI literacy to/with the youth. This was achieved by creating a program which is replicated every summer with city summer workers (15-17 year olds), and which is thus youth-led. They actively shared their learnings and insights from the program on social media (Instagram @aiambassadors). In this program (http://interact.oulu.fi/ai-ambassador), the youth ambassadors were introduced to generative AI, emotional AI, critical data literacy, sustainable AI, and AI in social media, focusing on how AI impacts their everyday lives. They also explored fabrication technologies in the city center's FabLab. Working in small teams, they created content and materials for sharing information on AI and its impact on the youth, including themselves, their friends, and their peers. In the ongoing academic year, the AI Youth Ambassadors will conduct one event, such as a presentation or performance, at their school to share their learning about AI's impact on youth. Feedback on the program was positive, with several ambassadors asking to continue by analysing data, writing papers, and being advisors for the next year's program. Through our work, we demonstrate a sustainable, scalable, and social program for critical AI literacy for youth - one in which the youth is ready to take charge and make an impact on the world as ambassadors for change!

78. Danqing Shi (Aalto University): Simulating Touchscreen Typing Behavior via Computational Rationality (HCI) [click for abstract and full list of authors]

Shi, Danqing (Aalto University); Zhu, Yujun (Aalto University); Jokinen, Jussi (University of Jyväskylä); Acharya, Aditya (University of Birmingham); Putkonen, Aini (Aalto University); Zhai, Shumin (Google); Oulasvirta, Antti (Aalto University)

N/A

79. Nikolas Siccha (Department of Computer Science, Aalto University): A novel warm-up for HMC based samplers (ML), (UAI), (Other) [click for abstract and full list of authors]

Siccha, Nikolas (Department of Computer Science, Aalto University, Finland)

Routinely warming up HMC-based samplers across increasing amounts of data and varying discretizations and monitoring the intermediate posteriors (1) automatically avoids insignificant modes, (2) facilitates automatic nonlinear reparametrizations, (3) enables on-the-fly adaptation of discretizations, (4) rejects very bad models quickly, (5) plays nicely with other improvements to warm-up and (6) can increase computational efficiency, especially for more challenging posteriors.

80. Janne Siipola (University of Helsinki): Generalization error in the Deep Ritz Method with smooth activation functions (ML), (Other) [click for abstract and full list of authors]

Siipola, Janne (University of Helsinki)

The Deep Ritz method is a deep learning paradigm for solving partial differential equations. In this article we study the generalization error of the Deep Ritz method. We focus on the quintessential problem, which is Poisson's equation. We show that the generalization error of the Deep Ritz method converges to zero with rate $\frac{C}{\sqrt{n}}$, and we give a concrete value for the constant $C$. Results are obtained for shallow and residual neural networks with smooth activation functions.
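For context, the standard variational formulation behind the Deep Ritz method can be written out as below. The homogeneous Dirichlet boundary condition, the uniform sampling of the points $x_i$, and the notation $u_\theta$ for the network are assumptions made for this sketch; the abstract does not state the exact setting used by the author.

```latex
% Poisson's equation on a bounded domain \Omega (assumed boundary condition):
%   -\Delta u = f  in \Omega,   u = 0  on \partial\Omega.
% Its weak solution minimises the Ritz energy over H_0^1(\Omega):
\[
  \mathcal{E}(u) = \int_{\Omega} \left( \tfrac{1}{2}\,\lvert \nabla u(x) \rvert^{2} - f(x)\,u(x) \right) \mathrm{d}x .
\]
% The Deep Ritz method restricts u to a neural network u_\theta and replaces
% the integral by a Monte Carlo average over n points x_1, \dots, x_n drawn
% uniformly from \Omega:
\[
  \mathcal{E}_{n}(u_\theta) = \frac{\lvert \Omega \rvert}{n} \sum_{i=1}^{n}
  \left( \tfrac{1}{2}\,\lvert \nabla u_\theta(x_i) \rvert^{2} - f(x_i)\,u_\theta(x_i) \right).
\]
% The generalization error concerns the gap between \mathcal{E} and \mathcal{E}_n
% over the network class, which is where a rate of order C/\sqrt{n} arises.
```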

81. Mikko Tahkola (VTT Technical Research Centre of Finland Ltd): Active learning-based surrogate modeling for accelerated alloy composition screening (MULT) [click for abstract and full list of authors]

Tahkola, Mikko (VTT); Linnala, Lassi (VTT); Pinomaa, Tatu (VTT); Savukoski, Simon (VTT); Vajragupta, Napat (VTT); Laukkanen, Anssi (VTT)

The design of new alloy compositions is a time-consuming process when the number of design variables is high, the associated physics-based models are computationally heavy, and alloys are evaluated against multiple criteria.

To accelerate the exploration of the large candidate space, which can consist of up to millions of compositions, we first employed active learning to build an ML-based CALPHAD surrogate model in a data-efficient way. Although the development of the surrogate model takes time, the trained model is very efficient to use.

A surrogate model-based alloy screening tool is then set up to quickly evaluate the performance of composition candidates against criteria that direct the search towards alloys suitable for high-temperature applications. The most promising candidates are validated with the physics-based model, followed by further ranking based on computing local sensitivity and properties that are linked to the performance of the alloy.

1200 compositions out of 660 000 candidates were identified with the surrogate model that was 180 000 times faster than its physics-based counterpart. Validation with CALPHAD and filtering based on sensitivity reduced the number of compositions to 247, enabling further evaluation with more detailed and computationally expensive first principles methods before experimental validation.

82. Ella Tamir (Aalto University): Transport with Support: Data-Conditional Diffusion Bridges (ML), (UAI) [click for abstract and full list of authors]

Tamir, Ella (Aalto University); Trapp, Martin (Aalto University); Solin, Arno (Aalto University)

N/A

83. Jyrki Tervo (VTT Technical Research Centre of Finland): Journal bearing status identification with acoustic emission measurements and data clustering (ML), (MULT), (XAI), (Other) [click for abstract and full list of authors]

Tervo, Jyrki (VTT); Junttila, Jukka (VTT); Ronkainen, Helena (VTT)

The lubrication regimes of a laboratory-scale journal bearing were analysed with wide-band acoustic emission (AE) measurements. Data analysis was supported by data-based clustering of AE data. The approach can be effectively used to reveal fundamental lubrication modes, i.e., hydrodynamic (HL), mixed (ML) and boundary (BL) lubrication as a function of Hersey number. Besides AE, the other parameters monitored were friction torque, bearing temperature, loading, sliding velocity and oil pressure. The materials used in the experiments were case-hardened 18CrNiMo7-6 steel and nitrided 42CrMo7 steel. The tests were lubricated with synthetic extreme-pressure gear oil (SGN 320) and the bearing temperature was kept constant during the tests. The bearing pressure and sliding velocity were varied over a wide range during the tests, resulting in different lubrication situations. The power and frequency content of the acoustic emission signals were analysed, and essential features were extracted for data clustering. For identifying lubrication regime changes, parameters such as signal RMS and coefficient of variation (CV) proved to be important, while signal kurtosis proved to be the most sensitive in discovering anomalies. This sensitivity requires data filtering to remove erroneous peaks. It is also interesting to note the changes in AE frequency due to different lubrication situations. In the literature, different clustering and classification methods have been proposed and applied for journal bearing status identification. Here the selected unsupervised clustering method was mean-shift clustering, due to the fact that the lubrication regimes in the Stribeck curve form an inseparable continuum. The algorithm does not require specifying the number of clusters in advance, i.e., the clusters are determined by the algorithm with respect to the data.
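The three signal features named above can be computed per signal window as in the following sketch. The exact definitions used in the study are not given in the abstract, so the choices here are assumptions: population statistics, CV as standard deviation divided by the absolute mean, and non-excess (Pearson) kurtosis. The function name `signal_features` is hypothetical.

```python
import math

def signal_features(window):
    """Return RMS, coefficient of variation and kurtosis of one window.

    Assumed definitions (not confirmed by the source):
      rms      = sqrt(mean(x^2))
      cv       = population std / |mean|
      kurtosis = fourth central moment / variance^2 (non-excess)
    """
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    cv = math.sqrt(var) / abs(mean) if mean != 0 else float("inf")
    if var > 0:
        kurtosis = (sum((x - mean) ** 4 for x in window) / n) / var ** 2
    else:
        kurtosis = float("nan")  # undefined for a constant window
    return {"rms": rms, "cv": cv, "kurtosis": kurtosis}

smooth = signal_features([1.0, 2.0, 3.0, 4.0, 5.0])
spiky = signal_features([0.0] * 99 + [10.0])  # one isolated peak
```

A single isolated peak raises the kurtosis far more than it raises the RMS, which is consistent with the abstract's remark that kurtosis is the most sensitive feature for anomalies and that erroneous peaks need to be filtered out first.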

84. Jaakko Tervonen (VTT Technical Research Centre of Finland): Cold-Start Model Adaptation: Evaluation of Short Baseline Calibration (MULT) [click for abstract and full list of authors]

Tervonen, Jaakko (VTT Technical Research Centre of Finland); Nath, Rajdeep Kumar (VTT Technical Research Centre of Finland); Petterson, Kati (VTT Technical Research Centre of Finland); Närväinen, Johanna (VTT Technical Research Centre of Finland); Mäntyjärvi, Jani (VTT Technical Research Centre of Finland)

In adaptive distributed systems, a cold-start occurs when a new client joins the network and model performance degrades in the early phase of the data stream. Handling such a situation carefully is especially important when clients are sensing devices measuring humans. Like their experiences, behavior, and nature, human physiology and reactions to external stimuli also differ between individuals. Researchers have developed strategies to adapt to these differences, but the adaptations generally require a large amount of data from each individual beforehand, which is not available in a cold-start. To address this, the current study proposes user calibration, which uses short segments of easily obtainable baseline data to adapt to new individuals. Experiments were conducted on two public stress and affect detection datasets, WESAD and SWELL-KW, to assess the effectiveness of the proposed calibration method and to determine a suitable duration for the baseline measurement. Results showed that user calibration always beat the non-personalized model, and segments of 3-8 minutes seemed the most promising to consider for future use.
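One simple way to use a short baseline segment for personalisation is to standardise a new user's feature stream with statistics from that user's own baseline. The abstract does not specify how the calibration is performed, so this is only an assumed illustration; the function name `calibrate` and the heart-rate numbers are hypothetical.

```python
import statistics

def calibrate(baseline, stream):
    """Standardise `stream` using the mean and population standard
    deviation of the user's short `baseline` segment.

    This z-scoring step is an assumption for illustration, not the
    calibration method evaluated in the study.
    """
    mu = statistics.fmean(baseline)
    sigma = statistics.pstdev(baseline)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for a flat baseline
    return [(x - mu) / sigma for x in stream]

# Hypothetical resting heart-rate baseline and later measurements.
baseline_hr = [60.0, 62.0, 64.0]
calibrated = calibrate(baseline_hr, [62.0, 70.0])
```

After such a step, a shared non-personalised model would see each user's signals relative to that user's own resting level rather than in absolute units.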

85. Shaoxiong Ji (University of Helsinki): HPLT: High Performance Language Technologies (NLP) [click for abstract and full list of authors]

Aulamo, Mikko (University of Helsinki); Bogoychev, Nikolay (University of Edinburgh); Heafield, Kenneth (University of Edinburgh); Ji, Shaoxiong (University of Helsinki); Kutuzov, Andrey (University of Oslo); Nail, Graeme (University of Edinburgh); Ramírez-Sánchez, Gema; Pyysalo, Sampo; Tiedemann, Jörg (University of Helsinki); van der Linde, Jelmer (University of Edinburgh); Zaragoza, Jaume (Prompsit)

High Performance Language Technologies (HPLT) is a 3-year EU-funded project (https://hplt-project.org/), building a space combining petabytes of natural language data with large-scale language model training. In our initial data release, we derived a collection of language data containing ca. 21.7 trillion tokens in 75 languages from 1.7 PB of web archive and Common Crawl packages. This dataset, together with aligned parallel data coming from the same source, will be used to build efficient and solid translation and multi-purpose large language models. HPLT relies on scalable workflows on HPC clusters such as LUMI and aims at free, sustainable and reusable resources, models and software packages for efficient training and inference of high-performance foundation models. The project combines an international team of academic, commercial and IT infrastructure partners, and also fosters collaborations with external partners and initiatives such as HuggingFace, SiloGen, OpenGPT-X and the OSCAR project.

86. Abdullah Tokmak (Aalto University): Safe learning-based control (CSO), (ML), (ROB) [click for abstract and full list of authors]

Tokmak, Abdullah (Aalto University); Fiedler, Christian (RWTH Aachen); Zeilinger, Melanie N. (ETH Zurich); Trimpe, Sebastian (RWTH Aachen); Köhler, Johannes (ETH Zurich); Schön, Thomas B. (Uppsala University); Baumann, Dominik (Aalto University)

In this poster, we present two research projects about safe learning-based control. The first project deals with approximating a model predictive controller (MPC). Although MPC has suitable properties for controlling complex systems, evaluating an MPC is computationally expensive. Thus, applying MPC under real-time requirements is challenging for fast applications like robotics. Hence, we aim to automatically compute a function that approximates the MPC while preserving closed-loop guarantees. To this end, we develop ALKIA-X, a novel algorithm based on kernel interpolation. In a numerical experiment, ALKIA-X approximates MPCs faster and results in a faster-to-evaluate approximate controller compared to the current state-of-the-art. In the second project, we work on sequential decision-making in robotics to find the optimal policy while guaranteeing safety. SafeOpt is a state-of-the-art algorithm that achieves this but requires a tight upper bound on the norm of the function we want to optimize in a reproducing kernel Hilbert space (RKHS). As this is unrealistic, we propose an algorithm that learns the RKHS norm from interacting with the environment and thus also works with misspecified RKHS norms. In a numerical experiment, we show that our method outperforms SafeOpt in terms of safety, convergence rate, and exploring the environment.

87. Nghiep Lucy Truong (Aalto University): Enhancing Conversations in Migrant Counseling Services: Designing for Trustworthy Human-AI Collaboration (HCI) [click for abstract and full list of authors]

Truong, Nghiep Lucy (Aalto University)

Migrants arriving or residing in Finland require a variety of counseling and support services, in municipal contexts such as the City of Espoo, to successfully integrate into Finnish society. This thesis conducts qualitative and user experience research using a series of participatory design methods to understand the challenges migrants face on arrival and then explores novel, theory-driven design concepts for improving communication between these users and municipal service advisors. The three main objectives of this thesis are: 1) understanding the challenges faced by service advisors and their customers in the context of migration to Finland, 2) designing and prototyping improved user experience concepts using participatory research methods, and 3) evaluating a proof-of-concept design approach in an appropriate use case scenario. The key findings are twofold. First, our empirical findings suggest that migrant customers often lack knowledge of how municipal services are organized, and service advisors guide them in navigating complex information across different digital services. Second, our design concept highlights the importance of supporting the relevant intentions and expectations when evaluating the adequacy of AI-generated visual summaries that can enhance conversations between service advisors and migrant customers. The goal of the design research is not to replace service advisors, but to enhance their ability to support migrant customers in real-time interactions. Finally, we offer theoretical implications. By reifying a conversation into a manipulable object, we can augment knowledge sharing. This digital boundary object can act as a memory aid when reused by customers and service advisors at a later date. Future research should focus on how AI-augmented service counseling can affect trust in computer-supported collaborative settings.

88. Lac Truong (Aalto University): Unveiling the Veiled Threat: An In-depth Analysis of the Impact of Bot Accounts on Vaccine Hesitancy and Misinformation During the COVID-19 Crisis (NLP), (Other) [click for abstract and full list of authors]

Ali, Unlu (THL); Truong, Lac (Aalto University); Sounny Slitine, Fatima (Aalto University); Sawhney, Nitin (Aalto University); Tammi, Tuukka (THL)

This article presents the findings of a study that investigated the impact of bots on the spread of COVID-19 misinformation and vaccine hesitancy on Twitter over a three-year span. The study utilized three approaches: big data analysis, examining public perception, and investigating network-related dimensions of the phenomenon. Text classification was used for both misinformation detection and vaccine stance detection, and the Turku University FinBERT pretrained embeddings model was employed. Botometer software was used to differentiate bot-like Twitter accounts from human-like ones. The study aimed to distinguish between malicious and non-malicious bot accounts using additional features. The research reveals that malicious bots possess unique features and functions that allow them to manipulate public discussions effectively, and they displayed a significant surge in activity within COVID-19-related discussions. Topic modeling analysis indicates that malicious bots concentrated on safety, political/conspiracy theories, and choice categories. The study emphasizes the need for developing effective strategies to detect and counter the influence of malicious bots, including using clear and concise language in health communication and establishing strategic partnerships. The findings highlight the critical role of bots in spreading COVID-19 and vaccine-related misinformation and the importance of identifying malicious bots for effective intervention strategies.

89. Ali Unlu (Terveyden ja Hyvinvoinnin Laitos): Setting the Misinformation Agenda: Modeling COVID-19 Narratives in Twitter Communities (ML), (NLP) [click for abstract and full list of authors]

Ali Unlu, Sophie Truong, Nitin Sawhney and Tuukka Tammi

This research investigates the dynamics of COVID-19 misinformation spread on Twitter within the unique context of Finland. Employing cutting-edge methodologies including text classification, topic modeling, social network analysis, and correspondence analysis, the study analyzes 1.6 million Finnish tweets from December 2019 to October 2022. Misinformation tweets are identified through text classification and grouped into topics using BERTopic modeling. Applying the Leiden algorithm, the analysis uncovers retweet and mention networks, delineating distinct communities within each. Correspondence analysis determines these communities' topical focuses, revealing how various groups prioritized different misinformation narratives throughout the pandemic. The findings demonstrate that influential, diverse communities introduce new misinformation which then spreads to niche groups. This agenda-setting effect is amplified by social media algorithms optimized for engagement. The results provide valuable insights into how online communities shape public discourse during crises through the strategic dissemination of misinformation.

90. Valter Uotila (University of Helsinki): SQL2Circuits: Estimating Metrics for SQL Queries with A Quantum Natural Language Processing Method (ML), (MULT), (NLP), (Other) [click for abstract and full list of authors]

Uotila, Valter (University of Helsinki); Lu, Jiaheng (University of Helsinki)

Quantum computing has developed significantly in recent years. Developing algorithms to estimate various metrics for SQL queries has been an important research question in database research since the estimations affect query optimization and database performance. This work presents a quantum natural language processing (QNLP)-inspired approach for constructing a quantum machine learning model that can classify SQL queries with respect to their execution times and cardinalities. From the quantum machine learning perspective, we compare our model and results to the previous research in QNLP and conclude that our model reaches accuracy similar to that of the QNLP model in the classification tasks. This indicates that the QNLP model is a promising method even when applied to problems that are not in QNLP. We study the developed quantum machine learning model by calculating its expressibility and entangling capability histograms. The results show that the model has favorable properties: it is expressible but not too complex to be executed on quantum hardware.

Srivastava, Akshat (Agilo Research Pvt. Ltd); Varanasi, Uttishta Sreerama (Aalto University); Sushchenko, Oleksandra (Aalto University); Teresa, Charlene Mae Sta (Raha International School); Bidhan, Ravinder Singh (Agilo Research Pvt. Ltd.)

In today's digital age, alongside the rising significance of technological literacy, algorithmic literacy holds great importance as artificial intelligence (AI) and machine learning (ML) emerge as transformative technologies. This research paper explores the trends, challenges, and implications of AI and ML education in schools, aiming to cultivate algorithmic literacy among students. By conducting a survey, reviewing the literature, and examining classroom observations, this study provides insights into the integration of AI and ML into the K-12 curriculum and offers recommendations for promoting effective instruction and preparing students for the digital future. This research paper provides an introduction to the rising importance of AI and ML, highlighting the need for students to develop relevant skills and critical thinking. The literature review explores government initiatives aimed at promoting AI education in schools and examines the current trends in AI and ML instruction. Additionally, it delves into the challenges educators face when integrating AI and ML into the curriculum, including the selection of technical platforms, accessing adequate resources, and identifying comprehensive teaching materials. Core concepts such as algorithmic literacy and ethical AI require careful consideration in the education space as they form children's ability to critically examine, interrogate, propose solutions for, contest, and agree with digital services that have become increasingly omnipresent.

92. Raúl Vázquez (University of Helsinki): Releasing the MAMMOTH - a framework for modular multilingual translation models (ML), (MULT), (NLP) [click for abstract and full list of authors]

Vázquez, Raúl (University of Helsinki); Mickus, Timothee (University of Helsinki); Raganato, Alessandro (University of Milano-Bicocca); Loppi, Niki A. (NVIDIA); Grönroos, Stig-Arne (University of Helsinki, Silo AI); Tiedemann, Jörg (University of Helsinki)

N/A

93. Alexander Vedernikov (University of Oulu): Analyzing participants' engagement during online meetings using unsupervised remote photoplethysmography with behavioral features (CV) [click for abstract and full list of authors]

Vedernikov, Alexander (University of Oulu); Sun, Zhaodong (University of Oulu); Kykyri, Virpi-Liisa (University of Jyväskylä); Pohjola, Mikko (University of Jyväskylä); Nokia, Miriam (University of Jyväskylä); Li, Xiaobai (University of Oulu)

Engagement measurement, an important and challenging task, finds application in healthcare, education, advertisement, and services. The use of physiological and behavioral features is viable, but traditional physiological measurement is impractical because it requires contact sensors. This paper introduces a novel approach using unsupervised remote photoplethysmography (rPPG) to calculate physiological features for assessing engagement in online group meetings. The paper presents four main contributions. Firstly, a unique TSR dataset of online interactions among social workers is collected with granular engagement labels, offering insight into virtual meeting dynamics. Secondly, a pre-trained rPPG model is customized to reconstruct accurate rPPG signals from video meetings in an unsupervised manner, enabling the calculation of heart rate variability (HRV) features. Thirdly, the study demonstrates the feasibility of estimating engagement from HRV features using short observation windows, with a notable enhancement when using longer observation windows of two to four minutes. Fourthly, the effectiveness of behavioral cues is evaluated and fused with physiological data, which further enhances engagement estimation performance. An accuracy of 94% is achieved when only HRV features are used, eliminating the need for contact sensors or ground truth signals. The incorporation of behavioral cues raises the accuracy to 96%. Accurate engagement measurement through facial video analysis provides a convenient solution beneficial for future applications, as evidenced by the study's diverse online meetings.

94. Prakhar Verma (Aalto University): Variational Gaussian Process Diffusion Processes (ML) [click for abstract and full list of authors]

Verma, Prakhar; Adam, Vincent; Solin, Arno

Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models that arise naturally in dynamic modelling tasks. Probabilistic inference and learning under generative models with latent processes endowed with a non-linear diffusion process prior are intractable problems. We build upon work within variational inference approximating the posterior process as a linear diffusion process, point out pathologies in the approach, and propose an alternative parameterization of the Gaussian variational process using a continuous exponential family description. This allows us to trade a slow inference algorithm with fixed-point iterations for a fast algorithm for convex optimization akin to natural gradient descent, which also provides a better objective for the learning of model parameters.

95. Katja Voskoboinik (Aalto University): Automated Assessment of Task Completion in Spontaneous Speech for Finnish and Finland Swedish Language Learners (NLP) [click for abstract and full list of authors]

Ekaterina Voskoboinik; Yaroslav Getman; Ragheb Al-Ghezi; Mikko Kurimo; Tamás Grósz

This study investigates the feasibility of automated content scoring for spontaneous spoken responses from Finnish and Finland Swedish learners. Our experiments reveal that pretrained Transformer-based models outperform the tf-idf baseline in automatic task completion grading. Furthermore, we demonstrate that pre-fine-tuning these models to differentiate between responses to distinct prompts enhances subsequent task completion fine-tuning. We observe that task completion classifiers exhibit accelerated learning and produce predictions with stronger correlations to human grading when accounting for task differences. Additionally, we find that employing similarity learning, as opposed to conventional classification fine-tuning, further improves the results. It is especially helpful to learn not just the similarities between the responses in one score bin, but the exact differences between the average human scores that the responses received. Lastly, we demonstrate that models applied to both manual and ASR transcripts yield comparable correlations to human grading.

96. Haishan Wang (Aalto University): Heterophilous Triple Flows on Graphs (ML) [click for abstract and full list of authors]

Wang, Haishan (Aalto University); Solin, Arno (Aalto University); Garg, Vikas (Aalto University)

Generating molecules with desirable properties is key to domains like material design and drug discovery. The predominant approach is to encode molecular graphs using graph neural networks or their continuous-depth analogues. However, these methods often implicitly assume strong homophily (i.e., affinity) between neighbours, overlooking repulsions between dissimilar atoms and making them vulnerable to oversmoothing. To address this, we introduce HTFlows. It uses multiple interactive flows to capture heterophily patterns in the molecular space and harnesses these (dis-)similarities in generation, consistently showing good performance on chemoinformatics benchmarks.

97. Roman Yangarber (University of Helsinki): Is implicit assessment of language learning during practice as accurate as assessment through testing? (HCI), (MULT), (NLP) [click for abstract and full list of authors]

Hou, Jue (UH); Katinskaia, Anisia (UH); Vu, Anh-Duc (UH); Yangarber, Roman (UH)

Assessment of proficiency of the learner is an essential part of Intelligent Tutoring Systems (ITS). We use Item Response Theory (IRT) in computer-aided language learning for assessment of student ability in two contexts: in test sessions, and in exercises during practice sessions.

Exhaustive testing across a wide range of skills can provide a detailed picture of proficiency, but may be undesirable for a number of reasons. Therefore, we first aim to replace exhaustive tests with efficient but accurate adaptive tests. We use learner data collected from exhaustive tests under imperfect conditions, to train an IRT model to guide adaptive tests. Simulations and experiments with real learner data confirm that this approach is efficient and accurate.

Second, we explore whether we can accurately estimate learner ability directly from the context of practice with exercises, without testing. We transform learner data collected from exercise sessions into a form that can be used for IRT modeling. This is done by linking the exercises to linguistic constructs; the constructs are then treated as "items" within IRT.

We present results from large-scale studies with thousands of learners. Using teacher assessments of student ability as "ground truth", we compare the estimates obtained from tests vs. those from exercises. The experiments confirm that the IRT models can produce accurate ability estimation based on exercises.
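
The ability estimation described above can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) IRT model and a simple grid-search maximum-likelihood estimate; the abstract does not state which IRT variant or estimator the authors use, and the item parameters and function names below are hypothetical. Each "item" may stand for a test question or, as in the exercise setting, a linguistic construct.

```python
import math

def p_correct(theta, difficulty, discrimination=1.0):
    """2PL IRT: probability that a learner with ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def log_likelihood(theta, responses):
    """responses: list of (difficulty, discrimination, correct) tuples."""
    total = 0.0
    for difficulty, discrimination, correct in responses:
        p = p_correct(theta, difficulty, discrimination)
        total += math.log(p if correct else 1.0 - p)
    return total

def estimate_ability(responses, lo=-4.0, hi=4.0, steps=801):
    """Maximum-likelihood ability estimate by grid search over [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses))

# Hypothetical items as (difficulty, discrimination); values are illustrative only.
items = [(-1.0, 1.0), (0.0, 1.2), (1.0, 0.8), (2.0, 1.0)]

# One learner answers the three easiest items correctly, another only the easiest.
strong = [(b, a, i < 3) for i, (b, a) in enumerate(items)]
weak = [(b, a, i < 1) for i, (b, a) in enumerate(items)]

theta_strong = estimate_ability(strong)
theta_weak = estimate_ability(weak)
```

In this sketch the learner with more correct responses receives the higher ability estimate; comparing such estimates against teacher assessments is the kind of evaluation the abstract describes.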

98. Victor Manuel Yeom Song (University of Helsinki): Learning How Humans Play Board Games with GPT-4IAR (GAME), (HAI), (ML), (MULT), (PLAN) [click for abstract and full list of authors]

Yeom-Song, Victor; Lin, Daisy; Kuperwajs, Ionatan; Schütt, Heiko; Ma, Wei Ji; Acerbi, Luigi

Transformer neural networks have excelled in various sequential data tasks, dominating Natural Language Processing and performing competitively in fields like Time Series forecasting and Reinforcement Learning. In this study, we explore whether this success extends to cognitive science, specifically in capturing the complex structure of human planning and decision-making while playing board games. We introduce GPT-4IAR, a GPT architecture tailored for predicting human gameplay and statistics in 4-in-a-row, an intermediate-complexity game. Prior research addressed this task with an interpretable cognitive model and a fully-connected neural network. Our contributions include using a transformer architecture for sequence-based analysis, affording a comparative study with the fully-connected model. We also extend prediction to various human statistics, including move duration. Preliminary results on synthetic gameplay data indicate our architecture's potential, achieving a cross-entropy loss of 1.69 and a move prediction accuracy of 46.3%. Tentatively, this shows the potential to surpass the fully-connected network trained on human data, which achieved a loss of 1.866 and an accuracy of 41.71%. We also show promise in predicting discretized reaction times with 21.3% accuracy and an average RMS error of 3.23 bins out of 20, opening avenues for further extensions like Elo score conditioning or score inference based on move sequences, enhancing the utility of our method in cognitive science research. The next step of our approach is to train our model on a large-scale dataset of human games, composed of approximately 10 million moves gathered from the mobile app Peak, in a joint effort with the Wei Ji Ma Lab at New York University, with our final goal being to accurately simulate how humans learn and plan in this context. By leveraging our trained model, cognitive scientists will be able to systematically explore key questions related to human learning and planning.

99. Anssi Yli-Jyrä (Tampere University): Pruning Redundancy in Answer Set Optimization Applied to Preventive Maintenance Scheduling (KRR), (MULT), (PLAN), (XAI) [click for abstract and full list of authors]

Yli-Jyrä, Anssi; Feyzbakhsh Rankooh, Masood; Janhunen, Tomi

Multi-component machines deployed, e.g., in paper and steel industries, have complex physical and functional dependencies between their components. This profoundly affects how they are maintained and motivates the use of logic-based optimization methods for scheduling preventive maintenance actions. Recently, an abstraction of maintenance costs, called miscoverage, has been proposed as an objective function for the preventive maintenance scheduling (PMS) of such machines. Since the minimization of miscoverage has turned out to be a computationally demanding task, the current paper studies ways to improve its efficiency. Given different answer set optimization encodings of the PMS problem, we motivate constraints that prune away some sub-optimal and otherwise redundant or invalid schedules from the search space. Our experimental results show that these constraints may enable up to ten-fold speed-ups in scheduling times, thus pushing the frontier of practically solvable PMS problem instances to longer timelines and larger machines.

100. Nikita Zeulin (Tampere University): Resource-Efficient Federated Hyperdimensional Computing (ML), (MULT) [click for abstract and full list of authors]

Zeulin, Nikita (Tampere University); Galinina, Olga (Tampere University & Tampere Institute for Advanced Study); Himayat, Nageen (Intel Corporation); Andreev, Sergey (Tampere University)

Hyperdimensional computing (HDC) is an energy- and compute-efficient machine learning (ML) paradigm that promises to become an alternative to neural networks for resource-constrained devices. As with every ML algorithm, the predictive performance of HDC models benefits from a wider diversity of the training data. Federated learning promotes this diversity by collaboratively training a common ML model over multiple user devices without explicitly disclosing personal training data of the participants. However, having a large number of participants creates a significant load on the communication network (often a wireless network), which can be alleviated by reducing the size of the trained HDC model at the cost of its predictive performance. In this work, we propose a resource-efficient federated hyperdimensional computing (RE-FHDC) framework reducing computational, communication, and energy resources required for the federated HDC model training without sacrificing its predictive performance. We achieve this by training multiple smaller independent HDC sub-models and refining the concatenated HDC model using the proposed dropout-inspired procedure. Our numerical comparison demonstrates that the proposed framework achieves a comparable or higher predictive performance while consuming less computational and wireless resources than the baseline federated HDC implementation.
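
For readers unfamiliar with HDC, the following is a minimal sketch of a plain federated HDC classifier, not the RE-FHDC framework itself. It assumes random bipolar hypervectors, binding by element-wise multiplication, bundling by summation, and server-side aggregation by summing the clients' class prototypes; the dimensionality, the toy data, and all function names are hypothetical, and the sub-model concatenation and dropout-inspired refinement from the abstract are not shown.

```python
import math
import random

DIM = 2048  # hypervector dimensionality (assumed for illustration)

def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Binding: element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]

def bundle(hvs):
    """Bundling: element-wise sum."""
    return [sum(col) for col in zip(*hvs)]

def encode(sample, position_hvs, level_hvs):
    """Encode a list of discrete feature levels as one hypervector."""
    return bundle([bind(position_hvs[i], level_hvs[v]) for i, v in enumerate(sample)])

def merge(protos, label, hv):
    """Add hv to the prototype stored under label."""
    protos[label] = hv if label not in protos else [a + b for a, b in zip(protos[label], hv)]

def train(samples, labels, position_hvs, level_hvs):
    """Local training: each class prototype is the sum of its encoded samples."""
    protos = {}
    for sample, label in zip(samples, labels):
        merge(protos, label, encode(sample, position_hvs, level_hvs))
    return protos

def aggregate(client_protos):
    """Assumed federated step: sum class prototypes across clients."""
    merged = {}
    for protos in client_protos:
        for label, hv in protos.items():
            merge(merged, label, hv)
    return merged

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def classify(sample, protos, position_hvs, level_hvs):
    """Predict the class whose prototype is most similar to the encoded sample."""
    hv = encode(sample, position_hvs, level_hvs)
    return max(protos, key=lambda label: cosine(hv, protos[label]))

rng = random.Random(0)
position_hvs = [random_hv(rng) for _ in range(4)]  # one per feature position
level_hvs = [random_hv(rng) for _ in range(4)]     # one per discrete feature level

# Toy data held by two clients; only prototypes, not raw samples, are shared.
client_a = train([[0, 0, 1, 0], [3, 3, 2, 3]], ["low", "high"], position_hvs, level_hvs)
client_b = train([[0, 1, 0, 0], [3, 2, 3, 3]], ["low", "high"], position_hvs, level_hvs)
global_model = aggregate([client_a, client_b])

pred_low = classify([0, 0, 0, 1], global_model, position_hvs, level_hvs)
pred_high = classify([3, 3, 3, 2], global_model, position_hvs, level_hvs)
```

Because the model is a set of class hypervectors, its communication cost scales with DIM, which is why shrinking the model reduces network load, as the abstract notes.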