What is it about?
Websites are becoming increasingly complex, making it difficult for current tools to fully understand their structure. Our research introduces WebClasSeg-25, a new dataset that breaks webpages into meaningful sections. It looks at both the layout and the text content, and classifies each section by its function (like header, navigation, or main content) and its level of digital maturity (from simple to advanced).
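To make the dual classification concrete, here is a minimal sketch of how one labelled section might be represented. It is illustrative only and is not the dataset's actual schema: the `Segment` class, its field names, and the label sets are all assumptions. The text above names only header, navigation, and main content as functions, and describes maturity only as ranging from simple to advanced.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical label sets for illustration; the real dataset may define
# more labels, or different ones.
FUNCTIONAL_LABELS = {"header", "navigation", "main_content"}
MATURITY_LABELS = {"simple", "advanced"}


@dataclass(frozen=True)
class Segment:
    """One webpage section carrying both a layout region and its text."""
    bbox: tuple          # assumed layout box: (x, y, width, height) in pixels
    text: str            # text content found inside the section
    function: str        # functional label, e.g. "header"
    maturity: str        # digital-maturity label, e.g. "simple"

    def __post_init__(self):
        # Reject labels outside the assumed label sets.
        if self.function not in FUNCTIONAL_LABELS:
            raise ValueError(f"unknown functional label: {self.function}")
        if self.maturity not in MATURITY_LABELS:
            raise ValueError(f"unknown maturity label: {self.maturity}")


def label_counts(segments):
    """Count how often each (function, maturity) pair occurs."""
    return Counter((s.function, s.maturity) for s in segments)


page = [
    Segment((0, 0, 1280, 80), "Site logo", "header", "simple"),
    Segment((0, 80, 1280, 40), "Home | About | Contact", "navigation", "advanced"),
    Segment((0, 120, 1280, 900), "Article body", "main_content", "advanced"),
]
counts = label_counts(page)
```

Keeping the two labels as separate fields means a tool can filter or count by function, by maturity, or by both at once.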
Why is it important?
Understanding website structure is essential for improving search, user experience, and web-based AI tools. Existing datasets often miss key design aspects of modern websites. WebClasSeg-25 provides a more complete, up-to-date view of today’s web, making it easier for researchers and developers to create smarter tools for analyzing and interacting with websites.
Read the Original
This page is a summary of: WebClasSeg-25: A Dual-Classified Webpage Segmentation Dataset - Integrating Functional and Maturity-Based Analysis, July 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3726302.3730309.