Jingwen Cai

I’m a Ph.D. student in the Foundations of Language Processing group at Umeå University.

My research combines reinforcement learning with online advertising and finite-state string transductions, with a focus on developing effective and well-optimised solutions.

My work also involves extensive user studies and human–computer interaction, aiming to bridge the gap between theory and practice in real-world scenarios.

Contact: jingwen.cai@umu.se

Education

  • Umeå University, Sweden · Doctoral Student, 2021 – present
  • Durham University, UK · Master of Science in Scientific Computing and Data Analysis
  • Northwestern Polytechnical University, China · Bachelor of Engineering in Automation

Publications

  • Developing a Multilingual Corpus of Wikipedia Biographies

    RANLP 2023

    PDF

    For many languages, Wikipedia is the most accessible source of biographical information. Studying how Wikipedia describes the lives of people can provide insights into societal biases, as well as cultural differences more generally. We present a method for extracting datasets of Wikipedia biographies. The accompanying codebase is adapted to English, Swedish, Russian, Chinese, and Farsi, and is extendable to other languages. We present an exploratory analysis of biographical topics and gendered patterns in four languages using topic modelling and embedding clustering. We find similarities across languages in the types of categories present, with the distribution of biographies concentrated in the language’s core regions. Masculine terms are over-represented and spread out over a wide variety of topics. Feminine terms are less frequent and linked to more constrained topics. Non-binary terms are nearly non-represented.

  • Optimizing Contextual Advertising Through Real-Time Bidding With Budget Constraints

    SURE@RecSys 2024

    PDF

    Online advertising opportunities are bought and sold in automated auctions driven by real-time bidding. In the case of contextual advertising, the size of a bid is informed by the media context in which the ad will be displayed. In contrast to personalised advertising, contextual advertising is better aligned with privacy acts such as GDPR and CCPA. We investigate how reinforcement learning with human feedback can help optimise contextual advertising under budget constraints. We propose a dynamic epsilon-greedy algorithm that considers the rate of budget consumption during a finite transaction time. The goal is to maximise long-term rewards in a sustainable manner. Our comparative evaluation of fundamental reinforcement learning algorithms on real data suggests that the approach is feasible and effective.
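    To make the idea concrete, here is a heavily simplified sketch of a budget-paced epsilon-greedy bandit. The class name, the pacing rule, and all parameters are illustrative assumptions, not the algorithm published in the paper: the only idea carried over is that exploration shrinks when budget consumption runs ahead of the elapsed fraction of the transaction window.

    ```python
    import random

    class DynamicEpsilonGreedy:
        """Illustrative sketch: epsilon is scaled down when spend
        outpaces the fraction of the time horizon already elapsed."""

        def __init__(self, n_arms, budget, horizon, base_eps=0.1):
            self.values = [0.0] * n_arms   # running mean reward per arm
            self.counts = [0] * n_arms     # pulls per arm
            self.budget = budget
            self.spent = 0.0
            self.horizon = horizon
            self.t = 0
            self.base_eps = base_eps

        def epsilon(self):
            # If spend_frac exceeds time_frac, pacing drops towards 0
            # and the agent exploits instead of exploring.
            time_frac = (self.t + 1) / self.horizon
            spend_frac = self.spent / self.budget
            pacing = max(0.0, 1.0 - spend_frac / max(time_frac, 1e-9))
            return self.base_eps * pacing

        def select(self):
            if random.random() < self.epsilon():
                return random.randrange(len(self.values))
            return max(range(len(self.values)), key=lambda a: self.values[a])

        def update(self, arm, reward, cost):
            self.t += 1
            self.spent += cost
            self.counts[arm] += 1
            # incremental mean update
            self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
    ```

    A real bidding agent would of course act on contextual features and auction feedback; this toy version only shows how a budget-consumption rate can modulate the exploration probability.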

  • From Precision to Perception: User Surveys in the Evaluation of Keyword Extraction Algorithms

    HILDA 2025

    PDF

    Stricter regulations on personal data are causing a shift towards contextual advertising, where keywords are used to predict the topical congruence between ads and their surrounding media contexts — an alignment shown to enhance advertising effectiveness. Recent advances in AI, particularly large language models, have improved keyword extraction capabilities but also introduced concerns about computational cost. This study conducts a comparative, survey-based evaluation experiment of three prominent keyword extraction approaches, emphasising user-perceived accuracy and efficiency. Based on responses from 552 participants, the embedding-based approach emerges as the preferred method. The findings underscore the importance of human-in-the-loop evaluation in real-world settings.

  • From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising

    Link

    Keyword extraction is a foundational task in natural language processing, underpinning countless real-world applications. A salient example is contextual advertising, where keywords help predict the topical congruence between ads and their surrounding media contexts to enhance advertising effectiveness. Recent advances in artificial intelligence, particularly large language models, have improved keyword extraction capabilities but also introduced concerns about computational cost. Moreover, although the end-user experience is of vital importance, human evaluation of keyword extraction performances remains under-explored. This study provides a comparative evaluation of three prevalent keyword extraction algorithms that vary in complexity: TF-IDF, KeyBERT, and Llama 2. To evaluate their effectiveness, a mixed-methods approach is employed, combining quantitative benchmarking with qualitative assessments from 552 participants through three survey-based experiments. Findings indicate a slight user preference for KeyBERT, which offers a favourable balance between performance and computational efficiency compared to the other two algorithms. Despite a strong overall preference for gold-standard keywords, differences between the algorithmic outputs are not statistically significant, highlighting a long-overlooked gap between traditional precision-focused metrics and user-perceived algorithm efficiency. The study highlights the importance of human-in-the-loop evaluation methodologies and proposes analytical tools to support their implementation.
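    For readers unfamiliar with the baseline, here is a minimal, self-contained TF-IDF keyword scorer, a toy illustration of the simplest of the three compared approaches (the function name and smoothing are illustrative; real pipelines add tokenisation, stemming, and stop-word removal):

    ```python
    import math
    from collections import Counter

    def tfidf_keywords(doc, corpus, top_k=3):
        """Rank terms of `doc` by term frequency weighted by inverse
        document frequency over a reference corpus."""
        tokens = doc.lower().split()
        tf = Counter(tokens)
        n_docs = len(corpus)

        def idf(term):
            # smoothed inverse document frequency
            df = sum(1 for d in corpus if term in d.lower().split())
            return math.log((1 + n_docs) / (1 + df)) + 1

        scores = {t: (c / len(tokens)) * idf(t) for t, c in tf.items()}
        return sorted(scores, key=scores.get, reverse=True)[:top_k]
    ```

    KeyBERT instead ranks candidate phrases by embedding similarity to the document, and an LLM such as Llama 2 generates keywords directly; the trade-off studied in the paper is between such extraction quality and the very different computational costs involved.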

  • Reinforcement Learning of Finite-State String Transductions

    Journal of Automata, Languages and Combinatorics

    PDF

    Finite state transducers (FSTs) are a valuable tool in data processing systems, where they are used to realise string-to-string transductions. We consider the problem of inferring transductions representable by FSTs through reinforcement learning. In this machine-learning paradigm, a learning algorithm repeatedly interacts with an environment by performing one out of a fixed set of candidate actions. Each action taken yields a reward, the size of which depends on the environment's current state, and causes the environment to change into a new state. The algorithm's objective is to maximise the accumulated reward. In the setting explored here, the environment consists of the next symbol in the input string to be rewritten and a transducer state. An action consists in choosing the symbol to output next, and the transducer state to shift into, thus causing a change in the environment. We propose a learning algorithm that starts out from a singleton set of states, and every time the learning rate stagnates, splits a state into two. For the split, it chooses a state that has been visited often, but despite this provides little information about how to maximise the reward. We evaluate the algorithm through empirical experiments, and the results suggest that it is robust enough to handle situations where the target transduction changes during the learning process.
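    The state-splitting step can be illustrated roughly as follows. The scoring rule and function names here are illustrative assumptions, not the paper's definitions: a state is a split candidate when it is visited often yet its action values are nearly uniform, i.e. it carries little information about how to maximise the reward.

    ```python
    import copy

    def value_spread(qs):
        # crude measure of how informative a state's Q-values are:
        # a small spread means all actions look equally good
        return max(qs) - min(qs)

    def pick_state_to_split(visits, q_table):
        """visits: {state: visit count}; q_table: {state: [Q per action]}.
        Prefer states visited often but with near-uniform Q-values."""
        def score(s):
            return visits[s] / (1e-9 + value_spread(q_table[s]))
        return max(q_table, key=score)

    def split_state(q_table, visits, s, new_state):
        # the new state starts as a copy of the old one and then
        # specialises as learning continues
        q_table[new_state] = copy.copy(q_table[s])
        visits[new_state] = 0
    ```

    In the paper's setting the split is triggered when the learning rate stagnates; the sketch above only shows the candidate-selection heuristic in isolation.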

  • Reinforcement Learning in Online Advertising: Challenges, Prospects, and Trust

    ReliableAI@ACML 2024

    PDF

    The central decision-making processes involved in online advertising are often supported by Reinforcement Learning (RL), which serves to optimise long-term accumulative rewards through interactions with evolving environments. While RL's potential in various real-world applications has been reviewed in extant survey works, the specific ways RL algorithms address online advertising challenges remain uncharted. Therefore, this paper reviews RL applications in this practice area, identifying core challenges and key issues including trust concerns. We categorize reviewed work based on problem domains and propose potential directions for future research. Our goal is to bridge the cross-disciplinary gap in this field, offering perspectives and guidance for researchers and practitioners.

Teaching

Computational Complexity (HT24)

More About Me

I enjoy building Lego sets and 3D metal models!

I love riding scooters and exploring trails, though to be honest, it’s more walking than hiking, since steep climbs aren’t really my thing :p

My fridge is basically a gallery. I’m obsessed with collecting magnets!

I can play the zither and hold a Level 9 Certificate of Qualification. Want to hear what it sounds like? Click here!