KAUST Rising Stars in AI Symposium 2026


February 9–11, 2026

8:00 AM – 5:00 PM 

Building 19

KAUST - Thuwal

Saudi Arabia


This event is sponsored by:

Office of Research Funding and Services 

ABOUT THE SYMPOSIUM

Artificial intelligence is reimagining the modern world. It is reshaping science, industry, governance, and daily life. It holds the power to transform challenges into opportunities for the Kingdom and humanity.

King Abdullah University of Science and Technology (KAUST) and the KAUST Center of Excellence for Generative AI continue to proudly support this transition, hosting the fifth annual Rising Stars in AI Symposium.

Tailored for Ph.D. students, postdoctoral researchers, and early-career faculty advancing the field of AI, this three-day event enables the exchange of advanced concepts and emerging discoveries from around the world.

Join a dynamic exploration of AI. Apply to be a speaker and inspire global colleagues at the in-person symposium. Together, these “Rising Stars” will help shape the most transformative technological change in history.

Call for speakers:

The symposium committee will select speakers based on outstanding research achievements. Presenters will have the opportunity to share work recently recognized at world-class AI conferences such as NeurIPS, ICML, ICLR, CVPR, ECCV, ICCV, EMNLP, ACL, and COLT.

KAUST and the Center of Excellence for Generative AI invite applications from early-career researchers with notable contributions in machine learning — spanning areas such as computer vision, optimization, reinforcement learning, natural language processing, applied AI, and beyond.

Travel and accommodation expenses will be covered for all selected speakers.


Applications are closed.

Deadline to apply: November 15, 2025

Notification date: December 1, 2025

Call for poster presentations:

We invite KAUST researchers, as well as students, postdocs, and faculty from institutions across Saudi Arabia, to showcase their recently published AI work in the poster sessions during the Symposium. Please note that external presenters will be responsible for covering their own travel expenses. Selected applicants will be invited to present their posters during the designated poster session.

If you are interested in presenting your work, please complete the application form:


Deadline for poster application: January 22, 2026

Notification of poster acceptance: January 29, 2026


Attending the Symposium

The Rising Stars in AI Symposium will be limited to in-person attendance.

Please register using the link.

AGENDA

Day 1: Feb 9, 2026 

Session 1-1

08:30 – 09:00

Coffee & Registration

09:00 – 09:20

Welcome Remarks

Dr. Arwa Al-Aama (KAUST)

09:20 – 09:40

Composing Behaviors, Exposing Risks: Security Lessons from Task Arithmetic and Semantic Embeddings

Yu-Lin Tsai (University of California, Berkeley)

Abstract:

Modern foundation models exhibit an intriguing linearity in their latent spaces: tasks, styles, and even harmful behaviors can be added, subtracted, and recombined through simple vector arithmetic. This phenomenon—known as task arithmetic—offers a powerful tool for model editing and transfer learning, but it also exposes a critical security vulnerability: malicious or biased behaviors can be silently composed and inherited by downstream models.
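For readers unfamiliar with task arithmetic, the sketch below illustrates the vector operations the abstract refers to: a task is represented as the weight difference between a fine-tuned model and its base, and behaviors are composed by adding scaled task vectors back to the base. This is a minimal illustration on toy arrays; the function names and scaling scheme are ours, not the speaker's code.

```python
# Minimal task-arithmetic sketch: models as flat parameter dictionaries.
import numpy as np

def task_vector(theta_base, theta_ft):
    # A "task" is the weight delta produced by fine-tuning.
    return {k: theta_ft[k] - theta_base[k] for k in theta_base}

def compose(theta_base, taus, alphas):
    # Add scaled task vectors to the base weights. Negative alphas
    # subtract a behavior; the same arithmetic can silently splice a
    # malicious behavior into a downstream model.
    theta = {k: v.copy() for k, v in theta_base.items()}
    for tau, alpha in zip(taus, alphas):
        for k in theta:
            theta[k] += alpha * tau[k]
    return theta

# Toy example with a single 2x2 "layer".
base = {"w": np.zeros((2, 2))}
ft_a = {"w": np.ones((2, 2))}         # fine-tuned on task A
ft_b = {"w": -0.5 * np.ones((2, 2))}  # fine-tuned on task B
merged = compose(base, [task_vector(base, ft_a), task_vector(base, ft_b)],
                 alphas=[1.0, 1.0])
print(merged["w"])  # base + tau_A + tau_B
```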

In this presentation, I examine how semantic embeddings and low-rank adaptation interact to shape the compositional behavior of large models, building on my work Ring-A-Bell (ICLR 2024) and recent follow-ups under review at USENIX, where we also present SafeLoRA, a method that bridges the gap between empirical fine-tuning and safety-alignment guarantees.

Together, these insights highlight a broader principle: compositionality is both a blessing and a liability. This work aims to make model composition not just controllable, but provably safe—laying the foundation for trustworthy, interoperable AI systems that can be adapted and deployed with confidence.


09:40–10:00

Human-Like Artificial Intelligence

Badr AlKhamissi (EPFL)

Abstract:

This talk explores how machine learning, neuroscience, and cognitive science can be integrated to advance the development of human-like artificial intelligence. It will highlight recent progress in representational alignment, where internal representations in state-of-the-art models increasingly reflect the activity patterns of human brain networks, and behavioral alignment, where model performance begins to mirror human reasoning, decision-making, and broader cognitive abilities.

By combining large-scale neural modeling with cognitive and neuroimaging data, the talk examines not only what modern models can accomplish, but how closely their underlying computations parallel those of the human mind. This cross-disciplinary approach benefits AI and neuroscience alike: models serve as computational hypotheses about brain function, offering precise, testable predictions that help scientists better understand the organization and mechanisms of human cognition.

The presentation also outlines a long-term vision for building digital twins of the human brain: individualized computational models grounded in real neural and behavioral data. Such models could enable large-scale in-silico experiments, accelerating cognitive neuroscience research and offering powerful tools for simulating interventions, probing cognitive deficits, and informing clinical and medical applications that are otherwise difficult or impossible to test directly.

Together, these directions point toward a future where artificial systems deepen our scientific understanding of the brain, support translational neuroscience, and contribute to the development of interpretable, human-centric AI.

10:00–11:00

Coffee break & Poster Presentation

Session 1-2

11:00–11:20

Optimization under Heavy-Tailed Noise: Optimal Sample Complexities and the Role of Adaptivity

Florian Hübler (ETH Zurich)

Abstract:

Recent empirical evidence indicates that gradient noise in large-scale machine learning is often heavy-tailed, violating the standard bounded-variance assumption. Until recently, the literature concluded that vanilla stochastic gradient descent (SGD) “diverges” in such regimes and that gradient clipping is required. Yet existing guarantees do not match the lower bound and assume knowledge of problem-dependent parameters. In this presentation, I will discuss two recent advances.

First, replacing clipping with normalization yields optimal nonconvex sample-complexity guarantees that match the lower bound when the parameters are known. In the parameter-free setting, we show that convergence is still possible, although with worse sample complexity. Second, I revisit the necessity of adaptivity: we show that SGD attains minimax-optimal rates in convex and strongly convex problems under an appropriate convergence criterion, and we give tight sample-complexity bounds for Hölder-smooth nonconvex objectives.
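As a rough illustration of the mechanisms compared above, the sketch below contrasts a normalized SGD step with a clipped one: normalization uses only the gradient direction, whereas clipping requires a threshold, a problem-dependent parameter. Variable names and defaults here are illustrative, not taken from the paper.

```python
# Illustrative SGD step under heavy-tailed gradient noise.
import numpy as np

def sgd_step(x, g, lr=0.01, mode="normalize", clip=1.0, eps=1e-12):
    if mode == "normalize":
        # Normalized SGD: step along g/||g||, insensitive to rare huge gradients.
        return x - lr * g / (np.linalg.norm(g) + eps)
    if mode == "clip":
        # Clipped SGD: rescale g whenever ||g|| exceeds the threshold `clip`.
        scale = min(1.0, clip / (np.linalg.norm(g) + eps))
        return x - lr * scale * g
    return x - lr * g  # vanilla SGD
```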

In sum, adaptivity enables optimal and parameter-free performance, but it isn’t mandatory and SGD remains a rigorous baseline under heavy tails.


11:20–11:40

From Pretraining Exploitation to Discovery: Towards Self-Evolving LLMs

Yang Yue (Tsinghua University)

Abstract:

In this talk, building on our analysis in “Limit-of-RLVR: Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?” (NeurIPS 2025 Best Paper Runner-up Award), we identify a fundamental limitation of current RLVR methods. Through comprehensive evaluations across pass@k, perplexity, and accuracy distributions on open-weights models, we demonstrate that RLVR primarily amplifies patterns from pretraining (exploitation) rather than discovering novel strategies (exploration). Even if such discoveries do occasionally occur, we argue that this process is highly inefficient, where substantial computational costs yield disproportionately marginal gains in novelty.
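For context, pass@k in this kind of analysis is commonly computed with the unbiased estimator below (standard in the code-generation evaluation literature); this is a sketch with our own function name, and the talk's exact evaluation pipeline may differ. Given n sampled answers per problem, c of which are correct:

```python
# Unbiased pass@k estimator: probability that at least one of k answers
# drawn (without replacement) from n samples is correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:  # fewer than k incorrect samples: success guaranteed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 4 correct answers out of 100 samples.
print(pass_at_k(100, 4, 1))   # 0.04
print(pass_at_k(100, 4, 50))  # ~0.94 -- large k rewards coverage, not greedy accuracy
```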


We hypothesize that a key cause is the inability to effectively explore the vast action space of LLMs when learning is driven by outcome-based feedback. To bridge this gap, I propose a research agenda: designing new scalable RL paradigms that enable LLMs to discover new capabilities more efficiently.


Specifically, I will outline a roadmap centered on deliberate data scaling and advanced exploration mechanisms. For data scaling, we begin with AbsoluteZero, our preliminary work on deliberate RL data/env evolution through self-play. To advance exploration in RL, we then present early results from our ongoing work on self-evolution–based exploration mechanisms, which demonstrate substantially higher exploration efficiency and a higher performance ceiling on challenging, open-ended mathematical problems compared to standard i.i.d. sampling for RLVR.


11:40–12:00

The Challenge of Test Awareness in AI Evaluation

Prof. Sahar Abdelnabi (ELLIS Institute Tübingen and the Max Planck Institute for Intelligent Systems)

Abstract:

Reasoning-focused LLMs sometimes alter their behavior when they detect that they are being evaluated—which can lead them to optimize for test-passing performance or to comply more readily with harmful prompts if real-world consequences appear absent. In this talk, I will present the first quantitative study of how such "test awareness" impacts model behavior, particularly its performance on safety-related tasks.

In our recent paper, we introduced a white-box probing framework that (i) linearly identifies awareness-related activations and (ii) steers models toward or away from test awareness while monitoring downstream performance. We applied our method to several state-of-the-art open-weight reasoning LLMs across both realistic and hypothetical tasks (denoting tests or simulations).

Our results demonstrate that test awareness significantly impacts safety alignment (such as compliance with harmful requests and conforming to stereotypes), with effects varying in both magnitude and direction across models: a model may refuse a harmful question when it recognizes a safety test, or comply when it perceives the task as hypothetical with no real-world consequences. This inconsistency undermines the validity of safety assessments, as evaluation behavior diverges from deployment behavior. By providing control over this latent effect, our work offers a stress-test mechanism and aims to increase trust in how we perform safety evaluations.
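To make the probe-and-steer recipe concrete, here is a toy sketch of the two ingredients, using synthetic activations and a simple difference-of-means direction; the paper's framework operates on real LLM activations and its exact probe may differ.

```python
# Toy version of (i) finding a linear "test awareness" direction and
# (ii) steering hidden states along it at inference time.
import numpy as np

rng = np.random.default_rng(0)
d = 64
h_test = rng.normal(+0.5, 1.0, size=(200, d))  # activations on test-like prompts
h_real = rng.normal(-0.5, 1.0, size=(200, d))  # activations on realistic prompts

# (i) Difference-of-means probe: direction separating the two regimes.
w = h_test.mean(axis=0) - h_real.mean(axis=0)
w /= np.linalg.norm(w)

def steer(h, alpha):
    # (ii) Shift a hidden state toward (+alpha) or away from (-alpha)
    # the awareness direction, then monitor downstream behavior.
    return h + alpha * w

h = rng.normal(size=d)
h_more_aware = steer(h, +4.0)
h_less_aware = steer(h, -4.0)
```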


12:00–12:15

Group Photo

12:15–14:00

Lunch break

Session 1-3

14:00–14:20

Research Overview at KAUST

Prof. Bernard Ghanem (KAUST)

14:20–14:40

Quantifying the Cost of Trust in Machine Learning

Youssef Allouah (Stanford University)

Abstract:

As machine learning models become core societal infrastructure, their trustworthiness is a primary scientific challenge. Moving beyond accuracy, my research provides formal guarantees on three critical properties: privacy, to protect data confidentiality; robustness, to ensure integrity against malicious data; and unlearning, to provide meaningful user control.

This presentation introduces a unified framework to analyze these pillars through their quantifiable costs—in model utility, computation, and system assumptions. This "cost of trust" lens uncovers deep interactions between these guarantees.

We show they can be antagonistic, as when differential privacy mechanisms unfortunately hinder model robustness to adversarial attacks. Conversely, they can be synergistic, as we demonstrate how robust training provides a foundation for developing highly efficient and certified machine unlearning algorithms. Finally, our framework reveals fundamental separations. We show that unlearning, while often conflated with privacy, is a distinct goal. We prove it can be achieved at a significantly lower cost in model utility, enabling practical "right to be forgotten" guarantees without the steep performance trade-offs of differential privacy.

This work, published at ICML, NeurIPS, and ICLR, provides a foundational map of these interactions, contributing to a principled science of trustworthy machine learning.

14:40–15:00

Demystifying the Role of Data in Video Understanding

Muhammad Maaz (Mohamed Bin Zayed University of AI)

Abstract:

This talk will explore key ingredients for research on foundation models for short-form video understanding, with a particular emphasis on the role of data. During the initial decade of the deep learning revolution, beginning with AlexNet, the visual recognition field primarily focused on creating innovative architectures while relying on fixed datasets like ImageNet for training. In recent years, in pursuit of better benchmark performance, the community has begun to aggregate diverse training sources, including distillation from proprietary black-box models. Nevertheless, the role of data is still rarely discussed with the same importance as model design or training algorithms. This talk will demystify the impact of large-scale data on multimodal models trained via text supervision, the effect of synthetic data, and the design principles behind data engines for training robust video understanding models.

15:00–16:00

Coffee Break and Poster Presentation

Session 1-4

16:00–16:20

Discovering Knowledge with Open-Ended Agentic Systems

Shengran Hu (University of British Columbia)

Abstract:

This talk will explore how language, when harnessed through Foundation Models (FMs), accelerates knowledge discovery. Unlike traditional numerical vector representations, language as a unified representation allows FMs to operate within a more expressive space, facilitating faster discovery, knowledge transfer, and accumulation of valuable stepping-stones that are interpretable to humans. Specifically, agentic systems powered by FMs can discover knowledge expressed in natural language, which may be scientifically valuable in its own right, or that can improve the agent itself, enabling recursive self-improvement. Shengran will discuss the evolving research in Open-endedness and highlight a paradigm shift driven by FMs, using his work on autonomous scientific discovery and self-improving systems as examples.



16:20–16:40

Towards language technologies that serve everyone

Prof. Antonios Anastasopoulos (George Mason University)

Abstract:

Language technologies, despite all the incredible recent progress, do not yet robustly work for _everyone_. In this talk I will first summarize some of my group's recent work on addressing challenges we are still facing in the real world, such as handling language varieties (dialects), minority languages from bilingual communities, and catering to the needs of indigenous and endangered language communities. I will discuss solutions around data curation, modeling, and models' cultural adaptation, as well as some exciting preliminary results on incorporating human knowledge into LLMs in a way that reduces data requirements.

16:40–17:00

Accelerating Chemistry Discovery with Artificial Intelligence: From Representation Learning to Autonomous Agents

Yanqiao Zhu (University of California, Los Angeles)

Abstract:

This presentation explores the potential of artificial intelligence in accelerating chemistry discovery, spanning from foundational representation learning to autonomous research agents. I will present a comprehensive journey through interconnected research contributions that collectively advance AI-driven chemical discovery.

First, I introduce MARCEL, a comprehensive benchmark that evaluates molecular representation learning models on conformer ensembles. Building on its insights, I present SPiCE, which addresses a fundamental limitation in molecular representation learning. I will also introduce our recent work in autonomous chemical discovery with large language model agents. I will demonstrate its potential in chemical reaction optimization that combines the reasoning capabilities of large language models with structured optimization techniques while maintaining interpretability through literature-grounded decision-making.

Together, these contributions demonstrate how AI can bridge computational efficiency and chemical intuition, ultimately accelerating scientific discovery and transforming chemical research methodologies.

Day 2: Feb 10, 2026

Session 2-1

08:30 – 09:00

Coffee & Registration

09:00 – 09:40

Keynote speaker

Dr. Yaser Al-Onaizan (Deputy CEO & AI Products President, Humain)

09:40 – 10:00

Building Trustworthy AI-driven Systems: From Abstraction to Safety to Reliability

Yajie Zhou (University of Maryland, College Park)

Abstract:

As AI increasingly powers system design, a central challenge emerges: how can we ensure that AI decisions are trustworthy—that they align with operator intent, remain safe, and generalize to new environments? My research tackles this challenge from three complementary directions: abstraction, safety, and reliability.

First, NetPress introduces a unified abstraction for AI-driven network system evaluation. It dynamically generates millions of realistic, emulator-backed benchmarks that capture correctness, safety, and latency. This abstraction provides the foundation for scalable and realistic testing of LLM agents beyond static datasets.

Building on this, MeshAgent brings safety to the forefront. It extracts domain-specific invariants from natural-language queries and encodes them as constraints guiding LLM reasoning and validation. By constraining generation with learned rules, MeshAgent improves both precision and abstention behavior, achieving over 95% accuracy and ensuring safe adaptation across network tasks.

Finally, Genet advances reliability through adaptive training. By identifying “difficult” network environments via rule-based baselines, it applies curriculum learning to produce RL policies that generalize better across workloads and outperform traditional heuristics.

Together, these systems form a coherent vision: trustworthy AI-driven systems built on principled abstraction, safety-aligned reasoning, and reliability-oriented learning. This approach moves us closer to autonomous systems that we can understand, verify, and trust to operate critical real-world tasks.


10:00–11:00

Coffee break & Poster Presentation

Session 2-2

11:00–11:20

Towards Robust Anatomical Segmentation of Medical Images: From Dense Labels to Graph Structures

Prof. Enzo Ferrante (University of Buenos Aires / CONICET)

Abstract:

The evolution of deep segmentation networks has empowered the enhancement of extensive medical imaging datasets with automatically generated anatomical segmentation masks. In this talk we will discuss recent methods we proposed to improve anatomical plausibility in deep segmentation networks. By improving anatomical plausibility we mean to ensure that the segmentation masks produced by our network are constrained to the actual shape and appearance of organs. We will briefly discuss some of our studies which use autoencoders to learn low dimensional embeddings of anatomical structures and propose different ways in which they can be incorporated into deep learning models for segmentation and registration, and used to discover novel phenotype-genotype associations.

The complexity is further intensified by recent studies indicating potential biases in AI-based medical imaging models related to gender, age, and ethnicity. Here we will share insights from our journey in developing the CheXMask large-scale database of x-ray anatomical segmentations. We will delve into the strategies we implemented for automatic quality control and the methods we formulated for unsupervised bias discovery in the absence of ground-truth annotations.


11:20–11:40

Implicit Bias of SGD

Prof. Cong Fang (Peking University)

Abstract:

Stochastic Gradient Descent (SGD) is a widely used algorithm for solving machine learning problems. In high-dimensional learning, SGD often requires fewer iterations than the number of model parameters, and its implicit regularization effect is key to achieving strong generalization. This talk explores the generalization performance of SGD applied to simple models across different learning scenarios, with a focus on quantitative comparisons. We will analyze the algorithm's efficiency under varying learning scales—such as different sample size–dimension relationships—as well as under covariate shift conditions. Our goal is to understand SGD's adaptability and the conditions for emergent learning behaviors. The theoretical insights gained will inspire the design of memory-efficient training algorithms for large models, leading to improved performance on standard benchmarks with models such as GPT-2.

11:40–12:00

Maximally Informative, Minimally Demanding: Learning from Human Feedback

Prof. Erdem Bıyık (University of Southern California)

Abstract:

The robot learning community has increasingly turned to large models with the hope of getting good generalization across tasks and environments. Yet, even the most capable zero-shot models benefit substantially from in-domain fine-tuning with human feedback. For broad deployment of robots, we must therefore develop methods that extract the most information from the least amount of human feedback: maximally informative, minimally demanding. In this talk, I will present approaches to modeling diverse feedback signals, such as demonstrations, preference comparisons, comparative language, interventions, and eye gaze, that make robot learning algorithms more data-efficient without placing extra burden on users. In addition, I will show how active learning/querying approaches can be integrated into these human-in-the-loop learning algorithms to further improve sample-efficiency. Ultimately, the goal is to leverage all available human inputs in a principled, efficient way, enabling pretrained large models to adapt to new environments and tasks through intuitive interaction with non-expert users.


12:00–14:00

Lunch break

Session 2-3

14:00–14:20

Research Overview at KAUST

Prof. Francesco Orabona (KAUST)

14:20–14:40

New Frontiers of Edge Physical Intelligence: Extreme Quantization, Limitless Memory, and Rapid Evolution

Zicong Hong (EPFL)

Abstract:

Physical intelligence—intelligence embodied in edge systems such as robots and autonomous vehicles—requires AI models that can not only understand and describe the world, but also act within it. Achieving this demands models that are both computationally practical and behaviorally grounded. We explore a unified approach that integrates extreme quantization to dramatically reduce computational and deployment cost, limitless memory mechanisms to support ultra-long contextual reasoning for sustained decision-making, and world-model-driven rapid evolution that enables agents to learn and evolve efficiently through interaction across simulated and real environments. These capabilities form a reinforcing cycle: efficient models make continuous learning feasible on-device; long-context memory preserves task and environmental history; and fast evolution accelerates the acquisition of physical skills. Together, they move AI from being merely expressive toward becoming embodied, adaptive, and capable of competent action in the physical world.


14:40–15:00

From Compression to Selection: Better and Longer Video Understanding

Enxin Song (Zhejiang University)

Abstract:

My research aims to advance efficient and scalable video understanding through the design of intelligent compression and sparse attention mechanisms in Video Large Language Models (Video-LLMs). The central challenge in long-video reasoning lies in balancing efficiency and performance, that is, how to process thousands of frames without losing crucial temporal or contextual information.

I began addressing this challenge with MovieChat (CVPR 2024), one of the first frameworks and benchmarks for long-video understanding, where I introduced a memory-based mechanism that has influenced many subsequent works. This exploration evolved through a series of efficiency-oriented studies: MovieChat+ with question-aware compression, AuroraCap (ICLR 2025) with spatial compression for detailed captioning, and AuroraLong with Linear RNNs for long-context reasoning. Together, these projects demonstrate a consistent pursuit of computationally sustainable yet semantically rich video reasoning.

However, benchmarking on complex tasks such as Video-MMLU revealed the inherent limitations of static compression. This realization inspired my latest project, VideoNSA, which redefines efficiency as selective activation rather than information reduction, employing sparse attention to dynamically allocate computation across time.

This presentation will trace my research journey along what I call the “efficiency–performance spiral,” highlighting key designs, empirical insights, and open questions about scaling Video-LLMs. Beyond technical contributions, I aim to promote a transparent and reproducible research culture that emphasizes rigorous evaluation and honest observation, guiding our community toward more adaptive and interpretable models for video understanding.

15:00–16:00

Coffee Break and Poster Presentation

Session 2-4

16:00–16:20

Breaking the Guardrails: Stealthy Bit-Flip and Jailbreak Attacks on Large Language Models

Prof. Yu Li (Zhejiang University)

Abstract:

Large language models (LLMs) are being rapidly deployed in safety-critical applications, yet their security vulnerabilities are still poorly understood. In this work, I present two recent studies that reveal fundamental weaknesses in state-of-the-art LLM defenses. The first, SilentStriker (NeurIPS 2025), introduces a new class of stealthy bit-flip attacks that manipulate model weights at the hardware level. These attacks remain difficult to detect while significantly degrading performance, highlighting fragility in the underlying infrastructure of LLMs. The second, One Model Transfer to All (ICLR 2025), develops an automated framework for generating jailbreak prompts that generalize across diverse LLMs, showing how alignment-only safeguards can be bypassed through transferable adversarial inputs. Together, these works demonstrate that both low-level faults and high-level adversarial prompting can undermine current safeguards, raising urgent questions about the robustness of modern AI systems. I conclude with directions toward resilient architectures and adaptive defenses that aim to ensure the reliability and trustworthiness of next-generation AI.


16:20–16:40

Reliable Prediction Sets for Every Class: Conformal Prediction for Many-Class and Imbalanced Settings

Tiffany Ding (University of California, Berkeley)

Abstract

Abstract:

AI has developed to the point where predictive models are good enough to use but still imperfect, motivating the need for uncertainty quantification methods that enable humans to better incorporate model predictions into their decision-making. This is especially true for modern-day classification tasks, where the number of classes is large and the distribution can be highly skewed. We tackle the problem of creating useful prediction sets in such settings. We distill this to producing prediction sets that are not too large but have a high probability of containing the true label, conditioned on the true label—a property known as class-conditional coverage. First, we propose a method for achieving approximate class-conditional coverage in the many-class setting, provided that the class distribution is relatively balanced [1]. Second, we propose a method that achieves "as much class-conditional coverage as possible" at a given set size, which works well even with extremely imbalanced class distributions that leave some classes with very few (possibly zero) calibration examples [2]. Together, these methods enable practical yet theoretically grounded uncertainty quantification for a range of real-world classification tasks.
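As background for the methods in [1] and [2], the sketch below implements vanilla class-conditional split conformal prediction, the baseline they improve on: one score quantile per class, so coverage holds conditioned on the true label. This is a minimal illustration with our own function names; the talk's methods modify exactly the steps that break down for rare classes.

```python
# Class-conditional split conformal prediction (vanilla baseline).
import numpy as np

def classwise_thresholds(probs_cal, labels_cal, alpha=0.1):
    # Nonconformity score = 1 - model probability of the true class.
    # For each class y, take the ceil((n_y + 1)(1 - alpha))-th smallest score.
    n_classes = probs_cal.shape[1]
    qhat = np.ones(n_classes)  # vacuous threshold if too little data: always include
    for y in range(n_classes):
        scores = np.sort(1.0 - probs_cal[labels_cal == y, y])
        rank = int(np.ceil((len(scores) + 1) * (1 - alpha)))
        if 1 <= rank <= len(scores):
            qhat[y] = scores[rank - 1]
    return qhat

def prediction_sets(probs_test, qhat):
    # Include every class whose score is below that class's own threshold.
    return [np.where(1.0 - p <= qhat)[0] for p in probs_test]
```

Rare classes expose the weakness: with n_y calibration points, the required rank exceeds n_y whenever n_y < (1 - alpha)/alpha, so the threshold stays vacuous and that class is included in every set, inflating set sizes in exactly the imbalanced regimes the second method targets.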

16:40–17:00

Generative Modeling for Scientific Discovery

Minkai Xu (Stanford University)

Abstract

Abstract:

With the rapid progress of computational power and large-scale datasets, generative AI has emerged as a promising direction for automatic content creation and scientific discovery. However, fundamental challenges remain in modeling complex distributions with intricate structures and interdependencies. In this talk, I will introduce my representative research on innovative generative models with structured Markov processes. Specifically, I will cover 3D geometric diffusion models with applications in scientific discovery, and large discrete diffusion models for language modeling with efficient parallel sampling acceleration. Finally, I will discuss future directions in my research line, such as developing principled post-training and guidance techniques for structured generative processes.

Day 3: Feb 11, 2026

Session 3-1

08:30 – 09:00

Coffee & Registration

09:00 – 09:40

Keynote speaker

Prof. Juergen Schmidhuber (Co-Chair, Center of Excellence for Generative AI, KAUST)

09:40 – 10:00

The Control Crisis of Multimodal Large Language Models

Jindong Gu (University of Oxford / Google)

Abstract

Abstract:

When the vision modality is integrated into LLMs, hallucination (undermining factual integrity) and jailbreaking (compromising safety alignment) pose threats to the trustworthiness of MLLMs. Addressing them demands a holistic defense strategy integrating both proactive control and reactive remediation. My research addresses this by first developing techniques for Inference Steering, which provides a proactive, real-time control mechanism to guide MLLM generation away from ungrounded content and suppress unsafe responses by adjusting latent representations during the forward pass. Complementing this, we are pioneering advanced methods for Multimodal Unlearning, a reactive, surgical solution that permanently removes the root causes of systemic errors—such as specific hallucinatory data patterns or exploited adversarial vulnerabilities—by modifying the model's parameters post-deployment.


10:00–11:00

Coffee break & Poster Presentation

Session 3-2

11:00–11:20

How Human Are Today’s AI Systems? A Multidimensional Cognitive Assessment of AI Models

Jen-Tse Huang (Johns Hopkins University)

Abstract:

Large language models (LLMs) are increasingly treated as social, cognitive, and decision-making agents, but to what extent do they actually resemble minds? In this talk, I will synthesize five lines of empirical evidence probing LLMs and multimodal LLMs (MLLMs) through the lens of psychology and cognitive science.

First, using PsychoBench, we profile models across clinically inspired dimensions such as personality, motivation, affect, and interpersonal style, to ask: "Who is ChatGPT?" Second, with EmotionBench, we test whether models exhibit human-like empathy and emotion appraisal across everyday situations, revealing partial alignment but systematic gaps in emotional coherence and generalization.

Third, we examine vision-language models with VisFactor, a cognitive battery of fundamental visual abilities. Despite strong benchmark reputations, leading MLLMs fail at basic spatial reasoning and perceptual organization that humans find trivial. Fourth, we evaluate strategic decision-making through GAMA-Bench, a multi-agent game-theoretic framework, showing that some models can act robustly but struggle to generalize strategies across settings.

Finally, we argue that LLMs lack a core ingredient of human cognition: working memory. Across tasks requiring active maintenance and manipulation of internal state, models display inconsistency and self-contradiction. Together, these results challenge the assumption that scale alone yields human-like cognition, and instead map where current AI systems approximate, and where they fundamentally diverge from, human psychological function.

11:20–11:40

Building Open Foundations for 3D Medical Vision–Language Models

Ibrahim Ethem Hamamci (University of Zurich / ETH AI Center)

Abstract:

Progress in medical AI depends not only on algorithms but also on open, high-quality datasets and shared benchmarks that enable transparent, reproducible, and clinically relevant research. My work focuses on building such open foundations for 3D medical imaging, spanning data creation, model development, and community benchmarking.

I led the development of CT-RATE, a large-scale multimodal chest CT dataset comprising 25,000 volumes (11 TB) paired with radiology reports, now downloaded by over 3,000 researchers and cited 150+ times in 2025. CT-RATE has served as the basis for several open-source models: GenerateCT (ECCV 2024) for text-conditional 3D image generation, CT2Rep (MICCAI 2024) for report generation, and CT-CLIP and CT-CHAT (Nature Biomedical Engineering) for 3D vision-language understanding, followed by BTB3D (NeurIPS 2025) for cross-body pretraining.

To strengthen community benchmarking, I co-organized the VLM3D Challenge at MICCAI 2025 and ICCV 2025, attracting 450 participants and nearly 100 model submissions across four tasks—illustrating the power of open collaboration in accelerating medical imaging research.

Currently, I am extending CT-CHAT into a generalist 3D AI assistant through improved slice-level tokenization, multi-task learning (generation, segmentation, and reporting), and expansion to a multi-anatomy dataset of over 300,000 CT and 200,000 MRI volumes.

Through this work, I aim to emphasize that the future of medical AI will be built on open datasets, transparent models, and shared benchmarks, the true foundations for trustworthy and equitable clinical innovation.

11:40–12:00

Efficient Visual Generation with Diffusion Models and Acceleration

Junsong Chen (The University of Hong Kong)

Abstract:

This talk will present my research on advancing generative diffusion models to unprecedented levels of efficiency, deployment capability, and consumer-facing readiness (AIPC). Although I am a first-year Ph.D. student, this work has attracted 2k+ citations and 10k+ total GitHub stars, demonstrating a rare combination of academic rigor and open-source impact.

I will introduce the systematic evolution of the SANA family of models, designed to achieve SOTA performance while fundamentally addressing the bottlenecks of high latency and resource cost. The journey begins with the initial SANA and SANA 1.5 models, which pioneered methods for extremely high-compression VAEs, enabling the initial possibility of real-time generation.

The breakthrough efficiency is fully realized in SANA-Sprint for text-to-image (T2I) generation, which achieved 1-step inference and 0.1-second latency for 1024×1024 images, delivering a 10× speedup over competitors.

For long video generation, SANA-Video solves the scalability challenge. By inventing the Linear DiT architecture and a novel Constant-Memory KV Cache, SANA-Video enables the efficient synthesis of minute-long, 720p videos. This innovation results in a 16× speedup over leading small diffusion models.

Ultimately, this presentation details how these architectural innovations collectively make high-fidelity generative AI not just powerful, but practical, ready for deployment in real-time, consumer applications.

12:00–14:00

Lunch break

Session 3-3

14:00–14:20

Research Overview at KAUST

Prof. Maurizio Filippone (KAUST)

14:20–14:40

Efficient Systems and Compilers for Generative AI

Prof. Xupeng Miao (Purdue University)

Abstract:

In the rapidly evolving landscape of generative artificial intelligence (AI), the efficiency of underlying systems and compilers plays a crucial role in enabling scalable, sustainable, and accessible AI technologies. This talk will provide participants with a comprehensive understanding of the state-of-the-art techniques in the design and implementation of systems and compilers that optimize the performance of generative AI models, especially for large language models (LLMs).


14:40–15:00

Unifying Vision and Language for Robotic Intelligence: From Robotic Manipulation to Surgical Assistance

Prof. Baoru Huang (University of Liverpool)

Abstract:

In this talk, I will present my research at the intersection of computer vision, medical AI, and robotics, focusing on how intelligent visual guidance systems can bridge the gap between human expertise and robotic perception. My overarching goal is to develop AI systems that not only perceive and interpret complex environments but also reason and act with clinical or physical precision.

The first part of the talk will focus on surgical vision for image-guided intervention, where I integrate preoperative imaging, intraoperative sensing, and advanced 3D vision algorithms to achieve real-time tumor localization and navigation in minimally invasive cancer surgery. This work introduces one of the first systems integrating radioactive tracers with AI-driven visual feedback, allowing surgeons to “see the unseen” and make safer, more precise intraoperative decisions—advancing the field toward autonomous robotic-assisted surgery.

Bridging surgical robotics and everyday robotics, in the second part, I will expand beyond the operating room and discuss how Large Vision-Language Models (LVLMs) can enable general-purpose robots to learn perception and action through multimodal reasoning and linguistic instruction. My team developed one of the largest language-driven robotic manipulation datasets to date (over 1M samples), enabling scalable training and evaluation of LVLMs for language-conditioned grasping and manipulation.

The talk concludes with a forward-looking discussion on trustworthy, embodied AI, exploring how insights from surgical and general-purpose robotics can together inspire foundation models for robotic intelligence. These efforts aim to unify perception, reasoning, and action across domains—building the next generation of intelligent agents capable of transforming healthcare, industry, and human–robot collaboration.

15:00–15:20

Closing Remarks

Prof. Omar Knio (KAUST)

SPEAKERS & ORGANIZERS

Organizers

MAURIZIO FILIPPONE

Associate Professor, Statistics

KAUST


FRANCESCO ORABONA

Associate Professor, Computer Science

KAUST

SILVIO GIANCOLA

Research Scientist

KAUST

KAREN SANCHEZ

Postdoctoral Research Fellow

KAUST


DAVID PUGH

Instructional Assistant Professor, Computer Science

KAUST

LINDA REMIL

KAUST

LILIANA RIVERA

KAUST

Scientific Committee

Alessandro Cornacchia

Chadi Helwe

Fida Thoker

Firas Laakom

Guozhong Li

Sameh Abdullah

Shuqi Li

Yimeng Chen

Carlos Hinojosa

Tengwei Song

Emanuele Ricco

Kaja Gruntkowska

Yulian Wu

Jianghui Wang  

Speakers

Bernard Ghanem

Professor, Electrical and Computer Engineering, KAUST

Juergen Schmidhuber

Professor, Computer Science, KAUST

Maurizio Filippone

Associate Professor, Statistics, KAUST


Francesco Orabona

Associate Professor, Computer Science, KAUST

Yaser Al-Onaizan

Deputy CEO & AI Products President, Humain

Antonios Anastasopoulos

George Mason University

Badr AlKhamissi

EPFL

Baoru Huang

University of Liverpool

Cong Fang

Peking University

Enxin Song

Zhejiang University

Enzo Ferrante

University of Buenos Aires

Erdem Bıyık

University of Southern California, Department of Computer Science

Florian Hübler

ETH Zurich

Ibrahim Ethem Hamamci

University of Zurich

Jen-Tse Huang

Johns Hopkins University

Jindong Gu

University of Oxford

Junsong Chen

The University of Hong Kong

Minkai Xu

Stanford University

Muhammad Maaz

Mohamed Bin Zayed University of AI

Sahar Abdelnabi

ELLIS Institute Tübingen and the Max Planck Institute for Intelligent Systems

Shengran Hu

University of British Columbia

Tiffany Ding

University of California, Berkeley

Xupeng Miao

Purdue University

Yajie Zhou

University of Maryland, College Park

Yanqiao Zhu

University of California, Los Angeles

Youssef Allouah

Stanford University

Yu Li

Zhejiang University

Yu-Lin Tsai

University of California, Berkeley

Zicong Hong

École Polytechnique Fédérale de Lausanne (EPFL)

Yang Yue

Tsinghua University

KAUST Centers of Excellence

KAUST Launches Four Pioneering Centers of Excellence to Address Key National and International Priorities

Generative AI

Renewable Energy and Storage Technologies

Smart Health

Sustainable Food Security

KAUST CORE LABS


KAUST hosts a wide range of sophisticated instruments and world-class facilities that students can access, including the Prototyping and Product Development Core Lab, and laboratories involving robotics and embedded systems, sensors, intelligent autonomous systems and biotechnology. Specific labs will be identified based on the curriculum and individual projects.


A NEW ERA FOR KAUST

Our unrelenting commitment to research, innovation and talent has seen KAUST establish itself as one of the leading research universities in the world, ranking #1 for citations per faculty globally, with a reputation for impact-driven research that contributes to the betterment of the world. This new era of KAUST builds on our many successes, achievements and strong foundations, and our new strategy represents an evolution that brings us closer to the interests of the Kingdom.


CONTACT US

King Abdullah University of Science and Technology (KAUST)

4700 King Abdullah University of Science and Technology

Thuwal 23955-6900

Kingdom of Saudi Arabia
