What is it about?
This work introduces a new method to automatically create synthetic data for Model-Driven Engineering (MDE). When engineers design complex systems, their modeling actions can be recorded and used to train intelligent assistants that suggest improvements or next steps. However, collecting such real data is difficult because of time, privacy, and industrial constraints. Our study shows how Large Language Models (LLMs), such as GPT-4 or LLaMA3, can generate artificial but realistic “modeling traces” that simulate the behavior of human designers. These synthetic traces can then be used to train intelligent modeling assistants like MORGAN, improving their performance even when real data are missing. The results demonstrate that LLMs can produce high-quality synthetic traces that closely resemble those created by human designers.
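As a rough illustration of the in-context learning idea, the sketch below assembles a few-shot prompt from example traces. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the Operation format, the prompt wording, and the build_few_shot_prompt helper are all hypothetical, and no LLM is actually called.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """One recorded modeling action (hypothetical format, not taken from the paper)."""
    action: str   # e.g. "ADD", "SET", "DELETE"
    element: str  # e.g. "Class", "Attribute"
    name: str

    def render(self) -> str:
        # Serialize the operation as a single line of text for the prompt.
        return f"{self.action} {self.element} {self.name}"


def build_few_shot_prompt(examples: list[list[Operation]], task: str) -> str:
    """Assemble an in-context learning prompt: example traces first, then the request."""
    parts = ["You generate realistic traces of modeling operations."]
    for i, trace in enumerate(examples, start=1):
        parts.append(f"Example trace {i}:")
        parts.extend(op.render() for op in trace)
    parts.append(f"Now generate a new trace for: {task}")
    return "\n".join(parts)


# A single illustrative example trace (invented for this sketch).
example = [
    Operation("ADD", "Class", "Library"),
    Operation("ADD", "Attribute", "Library.name"),
]
prompt = build_few_shot_prompt([example], "a model of a bookstore")
```

The resulting prompt string would then be sent to the chosen LLM, and the returned text parsed back into operations before being used as training data.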
Why is it important?
This research addresses a key limitation in both academic and industrial contexts: the lack of high-quality modeling data for training intelligent assistants. The proposed approach is timely and innovative because it leverages generative AI to produce reliable data while avoiding privacy and intellectual property issues. It significantly reduces the time and cost of data collection and enables model-based tools to learn even when real datasets are unavailable. By bridging the gap between human and machine-generated modeling operations, this work paves the way for faster, safer, and more scalable automation in software engineering.
Perspectives
From my personal perspective, this study represents an important step toward merging traditional modeling practices with generative AI. Working with different LLMs revealed how effectively they can replicate human reasoning in modeling contexts when properly guided. I believe this line of research can inspire the MDE community to explore AI-driven data synthesis and integrate generative techniques into next-generation modeling environments. Ultimately, this approach could make modeling tools more autonomous, adaptive, and accessible to both experts and newcomers.
Vittoriano Muttillo
Università degli Studi di Teramo
Read the Original
This page is a summary of: Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach, October 2024, ACM (Association for Computing Machinery), DOI: 10.1145/3691620.3695058.