Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach

Vittoriano Muttillo; Claudio Di Sipio; Riccardo Rubei; Luca Berardinelli; MohammadHadi Dehghani

doi:10.1145/3691620.3695058

What is it about?

This work introduces a new method to automatically create synthetic data for Model-Driven Engineering (MDE). When engineers design complex systems, their modeling actions are usually recorded and used to train intelligent assistants that can suggest improvements or next steps. However, collecting such real data is difficult due to time, privacy, and industrial constraints. Our study shows how Large Language Models (LLMs), such as GPT-4 or LLaMA3, can generate artificial but realistic “modeling traces” that simulate human behavior. These synthetic traces can then be used to train intelligent modeling assistants like MORGAN, improving their performance even when real data are missing. The results demonstrate that LLMs can produce high-quality synthetic traces that closely resemble those created by human designers.

Why is it important?

This research addresses a key limitation in both academic and industrial contexts: the lack of high-quality modeling data for training intelligent assistants. The proposed approach is timely and innovative because it leverages generative AI to produce reliable data while avoiding privacy and intellectual property issues. It significantly reduces the time and cost of data collection and enables model-based tools to learn even when real datasets are unavailable. By bridging the gap between human and machine-generated modeling operations, this work paves the way for faster, safer, and more scalable automation in software engineering.

Perspectives

From my personal perspective, this study represents an important step toward merging traditional modeling practices with generative AI. Working with different LLMs revealed how effectively they can replicate human reasoning in modeling contexts when properly guided. I believe this line of research can inspire the MDE community to explore AI-driven data synthesis and integrate generative techniques into next-generation modeling environments. Ultimately, this approach could make modeling tools more autonomous, adaptive, and accessible to both experts and newcomers.
Vittoriano Muttillo
Università degli Studi di Teramo

This page is a summary of: Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach, October 2024, ACM (Association for Computing Machinery), DOI: 10.1145/3691620.3695058.

Contributors

The following have contributed to this page

Vittoriano Muttillo
Università degli Studi di Teramo
