Master Thesis Project - 2026
We usually respond within two weeks
1. Reinforcement Learning for Large Language Models (LLMs) Thesis Project
Background & Description
Modulai is offering a master’s thesis opportunity focused on applying Reinforcement Learning (RL) to improve the capabilities of large language models (LLMs). Reinforcement learning has been pivotal in aligning LLMs with human preferences. Recent work shows that its potential extends further, enabling models to acquire advanced problem-solving strategies and adapt to complex tasks.
Recent advancements highlight the transformative role of RL in LLM post-training:
DeepSeekMath explored how reinforcement learning can enable models to handle multi-step mathematical reasoning. It also introduced a novel RL method, Group Relative Policy Optimization (GRPO).
Tulu 3 introduced a family of fully open post-trained models, leveraging Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and a novel technique dubbed Reinforcement Learning with Verifiable Rewards (RLVR).
ReTool introduced reinforcement learning for tool use, showing how LLMs can learn to combine text-based reasoning and code interpreters for complex tasks.
This project aims to investigate RL approaches for improving LLMs in specialized domains (such as reasoning and tool use). You will explore open-weight models, implement RL methods inspired by the latest research, and evaluate how reinforcement learning impacts model capabilities. Through this work, you will contribute to the growing understanding of how RL can shape the next generation of LLMs.
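To give a flavor of the methods mentioned above, here is a minimal sketch of the group-relative idea behind GRPO combined with a simple verifiable reward: several completions are sampled for the same prompt, each is scored by a rule-based check, and each score is normalized against the group's mean and standard deviation. The function names, the last-token answer check, and the use of the population standard deviation with a small epsilon are illustrative assumptions, not the exact setup of any of the papers. The sketch also leaves out the policy update itself (clipping, KL penalty, and so on).

```python
from statistics import mean, pstdev


def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Hypothetical rule-based reward: 1.0 if the final whitespace-separated
    token of the completion equals the reference answer, else 0.0."""
    tokens = completion.strip().split()
    return 1.0 if tokens and tokens[-1] == reference_answer else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own sampling group.

    Assumption: population standard deviation plus a small epsilon,
    so that a group with identical rewards yields all-zero advantages.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of completions sampled for one prompt (made-up strings).
completions = [
    "... so the answer is 12",
    "... the answer is 15",
    "I think 12",
    "cannot solve",
]
rewards = [verifiable_reward(c, "12") for c in completions]
advantages = group_relative_advantages(rewards)

for completion, reward, advantage in zip(completions, rewards, advantages):
    print(f"{reward:.1f}  {advantage:+.3f}  {completion}")
```

In this toy group, correct completions receive positive advantages and incorrect ones negative advantages, without any learned value model.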
ML Techniques and Tools
- Open-weight LLMs
- Reinforcement learning for LLMs
- Python, PyTorch, Git, Hugging Face
References
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models: https://arxiv.org/abs/2402.03300
Tulu 3: Pushing Frontiers in Open Language Model Post-Training: https://arxiv.org/abs/2411.15124
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs: https://arxiv.org/abs/2504.11536
2. Vision-Language-Action Models for Intelligent Robotics Control (Stockholm)
Background & Description
We also offer a master's thesis project in the emerging field of Vision-Language-Action (VLA) models for robotics. VLA models unify computer vision, natural language processing, and robotic control into end-to-end systems, enabling robots to understand visual scenes, interpret human instructions, and execute tasks without manual programming.
Recent research (e.g., Liang et al., 2024) shows that VLA models can perform complex tasks such as “pick up the red mug from the cluttered table.” This thesis invites students to explore and advance these models, contributing to one of the most actively researched directions in AI-powered robotics.
The project scope will be flexible and tailored to the student’s interests and research findings. Students will work with state-of-the-art robotic hardware and GPU clusters, and will receive guidance from experts in AI and robotics.
ML Techniques and Tools
- Python, PyTorch, Git, Hugging Face
- Vision-language-action models (multi-modal AI)
- Computer vision and natural language processing methods
- Real-time control systems and robotic integration
References
Liang et al., Vision-Language-Action Models for Robotics, 2024.
https://arxiv.org/abs/2406.09246
3. Open Application within Applied Machine Learning
Applied Machine Learning projects encompass a wide range of domains, including healthcare, finance, natural language processing, computer vision, and more. This open application invites students to propose projects aligned with their interests and career goals. Do you have an idea? Describe it in your application.
Required Skills
You are completing a master's degree in machine learning, or a master's degree in another field that includes courses in machine learning and programming.
Please include the following in your application:
- Link to relevant GitHub account if available.
- Grades from your bachelor's and master's studies.
- Updated CV or an updated LinkedIn profile.
*Suitable candidates will be called to one interview before a final decision is made.
The last date for application is the 31st of October, but the process will close earlier if suitable candidates apply.
About Modulai
Modulai’s clients range from startups to multinational companies. What they have in common is that machine learning is central to how they operate, compete, and create value.
Our services range from advisory projects and feasibility studies to end-to-end development and refinement of machine learning systems and products.
We use state-of-the-art techniques, always focusing on maximizing business impact, delivering solutions in areas such as credit risk, fraud detection, dynamic pricing, recommendation systems, computer vision, natural language processing, opportunity spotting, logistics optimization, up-selling, cross-selling, smart building optimization, predictive maintenance, and route planning.
Other
When doing a master's thesis project at Modulai, you are invited to all team activities, such as daily stand-ups, weekly learning breakfasts, and monthly after-work gatherings (AWs). We look forward to having you as part of our team!
- Teams: ML engineering
- Role: Master Thesis Student
- Locations: Stockholm, Gothenburg
- Remote status: Hybrid

Bringing fun to work
We enjoy traveling together, visiting innovative companies and organizations, new cities, and memorable places.
We spend a large part of our lives at work, so we believe that bringing fun into work makes us more successful as a team and as a business. Daily life at work involves breaks for video games, dog cuddles, and just hanging out. We are a group of friends and make sure to experience life together both inside and outside of work. A yearly skiing trip, visits to AI conferences, and company kickoffs have become appreciated traditions.