Motivation
Translating the legal mandates of the EU AI Act into technical workflows requires a shift from manual paperwork to digital governance. Mapping documentation requirements to formal semantic concepts gives organizations a structured, machine-interpretable framework in which legal text is represented as metadata artefacts. This enables automation (such as continuous compliance tracking, algorithmic auditing, and real-time policy enforcement across the AI lifecycle) as well as interoperability (through a unified, vendor-agnostic vocabulary that lets diverse development pipelines, enterprise tools, and regulatory platforms exchange data without losing context).
Research Questions
Contributions
Previous Work
Scope
AI Act
Documentation Formats
Methodology
Vocabulary Requirements Specification
Requirements regarding documentation of high-risk AI systems are extracted from the AI Act.
Gap Analysis
Extracted requirements are compared against existing machine-readable and non-machine-readable documentation resources.
Concept Creation
Concepts are created in an iterative process to cover the identified gaps broadly while ensuring an adequate fit into existing DPV structures.
Vocabulary Publication
Concepts are proposed to the DPVCG for publication.
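As an illustration of what a machine-readable concept proposal might look like, the sketch below renders one of the proposed concepts as Turtle using only the Python standard library. The concept name, its parent, and the legal reference are taken from the list of proposed concepts; the namespace IRIs, the use of `skos:broader` for the parent relation, and `dct:source` for the legal reference are assumptions made for illustration, and the helper name `concept_to_turtle` is hypothetical.

```python
from dataclasses import dataclass

# Assumed prefix declarations; the IRIs are illustrative and should be checked
# against the published vocabularies before use.
PREFIXES = {
    "dpv": "https://w3id.org/dpv#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dct": "http://purl.org/dc/terms/",
}


@dataclass(frozen=True)
class ProposedConcept:
    name: str       # prefixed name, e.g. "dpv:DataCompletenessStatus"
    parent: str     # prefixed name of the parent concept
    legal_ref: str  # AI Act provision motivating the concept


def concept_to_turtle(concept: ProposedConcept) -> str:
    """Render one concept as a Turtle snippet (hypothetical helper)."""
    header = "\n".join(f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items())
    body = (
        f"{concept.name} a skos:Concept ;\n"
        f"    skos:broader {concept.parent} ;\n"
        f'    dct:source "{concept.legal_ref}" .'
    )
    return f"{header}\n\n{body}\n"


example = ProposedConcept(
    name="dpv:DataCompletenessStatus",
    parent="dpv:DataQualityStatus",
    legal_ref="AI Act, Art. 10(3)",
)
turtle = concept_to_turtle(example)
print(turtle)
```

Keeping the legal reference next to the concept definition is one way to preserve traceability from each term back to the provision that motivated it.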
Documentation Requirements
requirements identified
Documentation requirements are grouped by whether they are technical or organisational in nature, and by their level according to the taxonomy of transparency in AI in ISO 12792.
Coverage by Documentation Format
Documentation requirements have been analysed regarding their coverage by different existing documentation formats.
For each documentation requirement and each documentation format, the coverage has been assessed qualitatively on a discrete scale from 0 to 2, where each score represents the following:
The following table shows how many documentation requirements received a score of 0, 1 and 2 for each documentation format:
| Format | Score 0 | Score 1 | Score 2 | Avg. Score |
|---|---|---|---|---|
| Datasheets | 225 | 12 | 48 | |
| Model Cards | 162 | 100 | 23 | |
| ISO 42001 | 94 | 79 | 112 | |
| Croissant | 248 | 10 | 27 | |
| MLDCAT-AP | 208 | 63 | 14 | |
| DPV | 101 | 107 | 77 | |
This makes clear that none of the existing documentation formats fully covers all the documentation required by the AI Act. ISO 42001 comes closest, but it is not a machine-readable resource. Among the semantic formats (MLDCAT-AP, DPV, Croissant), DPV scores highest and is therefore selected as the framework for developing concepts to address this gap.
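The ranking can be checked from the counts in the table. The sketch below assumes that "Avg. Score" denotes the arithmetic mean of the per-requirement scores; the source does not define the column, so this is an assumption, and the function name `average_score` is illustrative.

```python
# Counts of requirements scored 0, 1 and 2, copied from the table above.
COUNTS = {
    "Datasheets": (225, 12, 48),
    "Model Cards": (162, 100, 23),
    "ISO 42001": (94, 79, 112),
    "Croissant": (248, 10, 27),
    "MLDCAT-AP": (208, 63, 14),
    "DPV": (101, 107, 77),
}

# Formats described as semantic in the text.
SEMANTIC_FORMATS = ("MLDCAT-AP", "DPV", "Croissant")


def average_score(counts):
    """Arithmetic mean of scores, given how many requirements scored 0, 1, 2.

    Assumption: "Avg. Score" is the plain mean; the source does not say so.
    """
    total = sum(counts)
    weighted = sum(score * n for score, n in enumerate(counts))
    return weighted / total


averages = {name: average_score(c) for name, c in COUNTS.items()}
best_overall = max(averages, key=averages.get)
best_semantic = max(SEMANTIC_FORMATS, key=averages.get)

# Print formats from highest to lowest mean score.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {avg:.2f}")
```

Under this assumption, ISO 42001 ranks first overall and DPV ranks first among the semantic formats, which is consistent with the selection of DPV described above.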
New Concepts Proposed to DPV
proposed
dpv: Data Privacy Vocabulary core · 15 classes
| Concept | Parent | Property | AI Act reference |
|---|---|---|---|
| dpv:DataQualityStatus | dpv:Status | hasDataQualityStatus | |
| dpv:DataAvailabilityAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(2)(e) |
| dpv:DataAvailabilityStatus | dpv:DataQualityStatus | | Art. 10(2)(e) |
| dpv:DataQuantityAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(2)(e) |
| dpv:DataQuantityStatus | dpv:DataQualityStatus | | Art. 10(2)(e) |
| dpv:DataSuitabilityAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(2)(e) |
| dpv:DataSuitabilityStatus | dpv:DataQualityStatus | | Art. 10(2)(e) |
| dpv:DataRelevanceAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(3) |
| dpv:DataRelevanceStatus | dpv:DataQualityStatus | | Art. 10(3) |
| dpv:DataContextualSuitabilityAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(4) |
| dpv:DataContextualSuitabilityStatus | dpv:DataQualityStatus | | Art. 10(4) |
| dpv:DataCorrectnessAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(3) |
| dpv:DataCorrectnessStatus | dpv:DataQualityStatus | | Art. 10(3) |
| dpv:DataCompletenessAssessment | dpv:DataQualityAssessment | dpv:hasAssessment | Art. 10(3) |
| dpv:DataCompletenessStatus | dpv:DataQualityStatus | | Art. 10(3) |
eu-aiact: EU AI Act extension · 8 classes
| Concept | Parent | Property | AI Act reference |
|---|---|---|---|
| eu-aiact:TrainingForDeployer | dpv:OrganisationalMeasure | dpv:hasOrganisationalMeasure | Art. 9(5)(c) |
| eu-aiact:EnsuringDeployerSuitability | dpv:OrganisationalMeasure | | Art. 9(5) |
| eu-aiact:DataGapsPreventingCompliance | risk:LegalComplianceRisk | | Art. 10(2)(h) |
| eu-aiact:CommunicationWithAuthority | dpv:GovernanceProcedures | | Art. 17(1)(j) |
| eu-aiact:HumanOversightManagement | dpv:OrganisationalMeasure | | Art. 14(4) |
| eu-aiact:A10-5 | dpv:LegalBasis | | Art. 10(5) |
| eu-aiact:SpecialCategoryBiasExemptionAssessment | dpv:Assessment | | Art. 10(5) |
| eu-aiact:SpecialCategoryBiasExemptionStatus | | | Art. 10(5) |
ai: AI extension · 5 classes · 1 property
| Concept | Parent | Property | AI Act reference |
|---|---|---|---|
| ai:ModelTesting | dpv:TechnicalMeasure | dpv:hasTechnicalMeasure | |
| ai:AdversarialModelTesting | ai:ModelTesting | dpv:hasTechnicalMeasure | Annex XI §2(2) |
| ai:ParameterCount | | ai:hasParameterCount | Annex XI §1(1)(d); Annex XII(1)(f) |
| ai:DataCuration | dpv:DataGovernance | dpv:hasTechnicalOrganisationalMeasure | Annex XI §1(2)(c) |
| ai:DataSelection | ai:DataOperation | | Annex XI §1(2)(c) |
| Property | Domain / range (as listed) | AI Act reference |
|---|---|---|
| ai:hasParameterCount | ai:Model, ai:ParameterCount | |
tech: Technology extension · 4 classes · 9 properties
| Concept | Parent | Property | AI Act reference |
|---|---|---|---|
| tech:Firmware | dpv:Technology | tech:hasFirmware | Annex IV(1)(c) |
| tech:UpdateRequirements | tech:Instructions | tech:hasUpdateRequirements | Annex IV(1)(c) |
| tech:UserInterface | | tech:hasUserInterface | Annex IV(1)(g) |
| tech:EnergyConsumption | eu-aiact:ComputationalResource | | Annex XI §1(2)(e) |
| Property | Domain / range (as listed) | AI Act reference |
|---|---|---|
| tech:hasFirmware | tech:Firmware | Annex IV(1)(c) |
| tech:hasVersion | Literal | |
| tech:hasUpdateRequirements | tech:UpdateRequirements | Annex IV(1)(c) |
| tech:hasUserInterface | tech:UserInterface | Annex IV(1)(g) |
| tech:hasKnownEnergyConsumption | tech:EnergyConsumption | Annex XI §1(2)(e) |
| tech:hasEstimatedEnergyConsumption | tech:EnergyConsumption | Annex XI §1(2)(e) |
| tech:hasModality | tech:Content | |
| tech:hasOperatingInteraction | dpv:Technology | Annex IV(1)(b) |
| tech:hasExpectedLifetime | dpv:Technology, dpv:Duration | Art. 13(3)(e) |
justifications: Justifications extension · 1 class
| Concept | Parent | Property | AI Act reference |
|---|---|---|---|
| justifications:Assumption | dpv:Justification | | Art. 10(2)(d) |
References
- Documenting High-Risk AI: A European Regulatory Perspective. Computer, 56(5):18–27. IEEE, 2023. 10.1109/MC.2023.3235712
- Datasheets for Datasets. Communications of the ACM, 64(12):86–92, 2021. 10.1145/3458723
- Model Cards for Model Reporting. In: Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). ACM, 2019. 10.1145/3287560.3287596
- ISO/IEC 42001:2023 — Information Technology: Artificial Intelligence Management System. International Organization for Standardization, Geneva, 2023. iso.org/standard/42001
- Croissant: A Metadata Format for ML-Ready Datasets. In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Datasets and Benchmarks Track, 2024. arXiv:2403.19546
- MLDCAT-AP: Machine Learning DCAT Application Profile, v3.0.0. European Commission Semantic Interoperability Community (SEMIC), 2025. semiceu.github.io/MLDCAT-AP
- Data Privacy Vocabulary (DPV) — Version 2.0. In: The Semantic Web — ISWC 2024. Lecture Notes in Computer Science, vol 15233. Springer, 2024. 10.1007/978-3-031-77847-6_10
- Regulation (EU) 2024/1689 of 13 June 2024 laying down harmonised rules on Artificial Intelligence (AI Act). Official Journal of the European Union, L 2024/1689, 2024. EUR-Lex