The Data-Highway Project is an initiative designed to enhance tourism and cultural heritage by leveraging Linked Open Data, NGSI-LD, and AI-driven technologies. The project aims to create a scalable, interoperable, and intelligent data infrastructure that enables seamless access to, sharing of, and reuse of Open Data.
To optimize data extraction and management, the project explores two techniques. The first is Retrieval-Augmented Generation (RAG) with Knowledge Graphs, which structures data as vectorized text and graph-based entity relationships to improve retrieval and accuracy. The second is customized prompting, which uses tailored LLM prompts to extract structured data in NGSI-LD format and thereby enhances interoperability.
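As an illustration of the retrieval step described above, the following minimal sketch combines a vector-similarity lookup with a one-hop expansion over graph edges and then assembles an extraction prompt. It is not the project's implementation: the toy corpus, the `GRAPH` edges, the bag-of-words `embed` function (a stand-in for a real embedding model), and the prompt wording are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical toy corpus: text chunks keyed by an id.
CHUNKS = {
    "c1": "The Colosseum is an ancient amphitheatre in Rome.",
    "c2": "The Roman Forum lies next to the Colosseum.",
    "c3": "Venice is known for its canals and gondolas.",
}
# Hypothetical knowledge-graph edges: chunk id -> related chunk ids.
GRAPH = {"c1": ["c2"], "c2": ["c1"], "c3": []}


def embed(text):
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().replace(".", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=1):
    """Return the top-k chunks by similarity, plus their graph neighbours."""
    q = embed(query)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(CHUNKS[c])), reverse=True)
    hits = ranked[:k]
    for chunk_id in list(hits):
        for neighbour in GRAPH[chunk_id]:
            if neighbour not in hits:
                hits.append(neighbour)
    return hits


def build_prompt(query):
    """Assemble an extraction prompt from the retrieved context (wording is illustrative)."""
    context = "\n".join(CHUNKS[c] for c in retrieve(query))
    return (
        "Return one NGSI-LD entity as JSON with id, type and @context.\n"
        f"Context:\n{context}\n"
        f"Request: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("amphitheatre in Rome"))
```

The graph expansion is what distinguishes this from plain vector search: the Roman Forum chunk shares no words with the query, yet it reaches the prompt because it is linked to the best-matching chunk.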
The study evaluates multiple AI models, including GPT, DeepSeek, Llama, and Phi, analyzing their effectiveness in structured data generation and compliance with NGSI-LD standards.
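Compliance of a model's output with NGSI-LD can be checked mechanically. The sketch below is a deliberately simplified check, not the evaluation procedure used in the study: it only verifies that the output parses as JSON, carries `id`, `type`, and `@context`, and that every other member is a Property, GeoProperty, or Relationship with the expected `value` or `object` key. Requiring the `urn:ngsi-ld:` prefix is a stricter convention assumed here (the specification only requires a URI), and the function name and sample entities are hypothetical.

```python
import json

# Key that must carry the payload for each NGSI-LD attribute kind.
ATTRIBUTE_VALUE_KEY = {
    "Property": "value",
    "GeoProperty": "value",
    "Relationship": "object",
}


def ngsi_ld_problems(raw):
    """Return a list of problems found in a model's raw output (empty if none).

    Simplified: ignores multi-typed entities, system attributes, and
    @context supplied through an HTTP Link header.
    """
    try:
        entity = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(entity, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    # Assumed convention: ids use the urn:ngsi-ld: namespace.
    if not str(entity.get("id", "")).startswith("urn:ngsi-ld:"):
        problems.append("id is missing or not a urn:ngsi-ld: URI")
    if not isinstance(entity.get("type"), str):
        problems.append("type is missing or not a string")
    if "@context" not in entity:
        problems.append("@context is missing")

    for name, attr in entity.items():
        if name in ("id", "type", "@context"):
            continue
        kind = attr.get("type") if isinstance(attr, dict) else None
        key = ATTRIBUTE_VALUE_KEY.get(kind)
        if key is None:
            problems.append(f"{name}: not a Property, GeoProperty or Relationship")
        elif key not in attr:
            problems.append(f"{name}: {kind} without '{key}'")
    return problems


if __name__ == "__main__":
    good = json.dumps({
        "id": "urn:ngsi-ld:Museum:001",  # hypothetical entity
        "type": "Museum",
        "name": {"type": "Property", "value": "Uffizi"},
        "@context": ["https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"],
    })
    bad = '{"id": "museum-1", "type": "Museum", "name": "Uffizi"}'
    print(ngsi_ld_problems(good))
    print(ngsi_ld_problems(bad))
```

Counting the problems returned per response gives a rough, comparable score across models, although a full assessment would need a JSON-LD processor or an NGSI-LD context broker.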
The issues encountered include poor-quality, disorganized structured datasets and unreliable extraction from unstructured natural-language sources. Both situations pose challenges for AI models. To address them, the system prompt was refined, the context window was increased, and low temperature settings were maintained to minimize hallucinations and prevent the generation of inaccurate data.
By integrating AI and semantic data structuring, the Data-Highway Project lays the foundation for a smarter, more connected digital ecosystem, driving innovation in cultural and tourism experiences.