Contributors (ordered alphabetically): Daniel Doherty (Trinity College Dublin),
Delaram Golpayegani (ADAPT Centre, Trinity College Dublin),
Georg P. Krog (Signatu AS),
Harshvardhan J. Pandit (ADAPT Centre, Dublin City University),
Julian Flake (University of Koblenz),
Scott Kellum (Typetura).
NOTE: The affiliations are informative, do not represent formal endorsements, and may be outdated as this list is generated automatically from existing data.
The AI extension extends the [[[DPV]]] and its [[[TECH]]] extension to represent AI techniques, applications, risks, and mitigations. The namespace for terms in the AI extension is https://w3id.org/dpv/ai#. The suggested prefix for the namespace is ai. The AI vocabulary and its documentation are available on GitHub.
DPV Specifications: The [[DPV]] is the core specification within the DPV family, with the following extensions: Personal Data [[PD]], Locations [[LOC]], Risk Management [[RISK]], Technology [[TECH]] and [[AI]], [[JUSTIFICATIONS]], [[SECTOR]] specific extensions, and [[LEGAL]] extensions modelling specific jurisdictions and regulations. A [[PRIMER]] introduces the concepts and modelling of DPV specifications, and [[GUIDES]] describe application of DPV for specific applications and use-cases. The Search Index page provides a searchable hierarchy of all concepts. The Data Privacy Vocabularies and Controls Community Group (DPVCG) develops and manages these specifications through GitHub. For meetings, see the DPVCG calendar.
Contributing: The DPVCG welcomes participation to improve the DPV and associated resources, including expansion or refinement of concepts, requesting information and applications, and addressing open issues. See contributing guide for further information.
Core Concepts
Overview of AI extension
The [[[AI]]] extension further extends the [[TECH]] extension to represent concepts specifically associated with development, use, and operation of AI, and provides:
Techniques such as machine learning and natural language processing
Capabilities such as image recognition and text generation
AI Systems and Models such as expert systems, or general purpose AI models (GPAI)
Data such as for training, testing, and validation
Development Phases such as training
Risks such as data poisoning, statistical noise and bias, etc.
Risk Measures to address the AI specific risks
Lifecycle such as data collection, training, fine-tuning, etc.
Documentation such as Data Sheets and Model Cards
Actors such as AI Developer and AI Deployer
Status associated with AI development
The AI extension is created based on established sources, such as ISO/IEC standards and regulatory definitions, which are referenced in the relevant concept definitions.
Artificial Intelligence (AI) is a category of Technology that exhibits or satisfies specific behaviours. While the exact definition of what constitutes 'AI' continues to be a subject of debate and regulation, we focus on the generally understood use of 'AI technologies' and provide this extension to represent information about their development and use, as well as other relevant information about AI such as the specific risks involved, relevant mitigations and measures, documentation, the data involved, and a description of the underlying technology itself in terms of its specific operations and functions. As there is no consistent vocabulary or standard used uniformly within this domain, the concepts provided in this extension represent the specific way the DPVCG has chosen to represent information about AI technologies.
The AI extension is based on the modelling of technologies in DPV vocabularies. For this reason, it extends the [[TECH]] extension, and only provides AI-specific concepts in this extension. For example, the entity that is the developer of an AI system is represented by the same concept as the developer of any technology through the `tech:Developer` concept. If and when we identify AI-specific actors and roles, those will be defined in this extension by extending the relevant DPV and TECH entities.
Conceptual Model
Overview of the conceptual model for how AI is described as a technology in the DPV and AI extensions. The notes provide an example showing how the process of unlocking a phone for identity authentication is described using DPV, and the details of how this functions at a technical level are provided through the concepts in the AI extension.
The concept [=AI=] and its corresponding relation [=hasAI=] represent the broad and generic concept of 'AI' and its use in different contexts. For example, AI might be used to refer to a specific technical algorithm (e.g. conventional computer science use of AI), or a way of automating specific tasks (e.g. business process use of AI), or to describe a process where AI is used in part (e.g. marketing use of AI). To explicitly and accurately describe what is involved in 'AI', we provide further granular additional concepts based on a 'three-layer approach' consisting of [=Technique=] and [=Capability=] for describing the technical implementation and goals, and 'Purpose' (represented by `dpv:Purpose`) for describing the broader aim of the process.
[=Technique=] represents the underlying 'technique' or 'algorithm', for example [=MachineLearning=] or its specific forms [=NeuralNetwork=] and [=SupervisedLearning=]. It is a technical detail that does not have a specific goal or purpose in the implementation, and which is applied in different contexts to achieve different outcomes.
[=Capability=] refers to the use of a technique to achieve or perform a (technical) goal or objective. It describes what the technology is 'capable' of doing in terms of a 'technical goal'. For example, [=FaceRecognition=] is a capability for using some underlying [=Technique=] to achieve its goal of recognising faces. However, by itself, we still don't know why facial recognition is being used or developed within the process. This is where `dpv:Purpose` describes the broader goal or aim - not just for the use of AI but also for other contextual information such as data, people, and entities - for example, to state that this is being done for identity verification and the enforcement of security.
The separation of concepts in this manner also allows for an efficient and accurate representation of how AI technologies are developed and applied in practice. For example, _Entity1_ develops an algorithmic framework to ingest data and perform some statistical operations on it - this is represented as a [=Technique=]. This framework is then taken by _Entity2_, who uses it towards generating content - this is represented as a [=Capability=] - and puts it on the market as a product. _Entity3_ then uses this product to provide a service to its customers in terms of recommendations - this is represented as a `dpv:Purpose`. In its knowledge graph, _Entity3_ records that it uses a technology with the relevant AI capability, while the knowledge graph of _Entity2_ represents that it uses the framework produced by _Entity1_.
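The three-layer supply chain described above can be sketched in Turtle. This is a minimal illustration, not normative usage: the `ex:` names are hypothetical placeholders, and the exact relation IRIs should be verified against the released vocabulary.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .

# Entity1's framework: a bare statistical technique with no inherent goal
ex:Entity1Framework a ai:Technique .

# Entity2's product: applies the technique towards the technical goal of
# generating content - i.e. a capability
ex:Entity2Product a ai:AISystem ;
    ai:hasTechnique ex:Entity1Framework ;
    ai:hasCapability ex:ContentGeneration .

# Entity3's service: uses the product for the broader purpose of
# providing recommendations to customers
ex:Entity3Service a dpv:Process ;
    dpv:hasPurpose ex:ProvideRecommendations ;
    dpv:isImplementedUsingTechnology ex:Entity2Product .
```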
Techniques
[=Technique=] represents the underlying technical implementation, and is associated using [=hasTechnique=]. It represents the lowest level of technical details within the conceptual model used in this extension to describe 'AI technology'. By itself, a technique is not sufficient to describe what the AI technology is being used for, but it is useful to express how the AI technology functions.
An implementation of AI technology can be developed only based on a technique - for example as a library or as a framework that can be reused by others. Therefore, a technique can act as a component of a larger AI system where it represents a particular method for implementing something. A technique can also involve the use of other techniques in a composite or combined manner.
ai:GeneticAlgorithm: Algorithm which simulates natural selection by creating and evolving a population of individuals (solutions) for optimization problems
go to full definition
ai:KnowledgeTechnique: Techniques based on the use of knowledge bases
go to full definition
ai:InductiveProgramming: An algorithm or program featuring recursive calls or repetition control structures
go to full definition
ai:KnowledgeRepresentation: Encoding knowledge in a formal language
go to full definition
ai:SymbolicReasoning: Reasoning based on the knowledge encoded in a formal language
go to full definition
ai:MachineLearning: Process of optimizing model parameters through computational techniques, such that the model's behaviour reflects the data or experience
go to full definition
ai:DeepLearning: Approach to creating rich hierarchical representations through the training of neural networks with many hidden layers
go to full definition
ai:NeuralNetwork: Network of one or more layers of neurons connected by weighted links with adjustable weights, which takes input data and produces an output
go to full definition
ai:ConvolutionalNeuralNetwork: Feed forward neural network using convolution in at least one of its layers
go to full definition
ai:FeedForwardNeuralNetwork: Neural network where information is fed from the input layer to the output layer in one direction only
go to full definition
ai:LongShortTermMemory: Type of recurrent neural network that processes sequential data with a satisfactory performance for both long and short span dependencies
go to full definition
ai:RecurrentNeuralNetwork: Neural network in which outputs from both the previous layer and the previous processing step are fed into the current layer
go to full definition
ai:SemiSupervisedLearning: Machine learning that makes use of both labelled and unlabelled data during training
go to full definition
ai:SupervisedLearning: Machine learning that makes use of only labelled data during training
go to full definition
ai:SupportVectorMachine: A machine learning algorithm that finds decision boundaries with maximal margins
go to full definition
ai:UnsupervisedLearning: Machine learning that makes use of only unlabelled data during training
go to full definition
ai:BayesianNetwork: Probabilistic technique that uses Bayesian inference for probability computations using a directed acyclic graph
go to full definition
ai:BayesianOptimisation: Technique that uses Bayesian inference to iteratively optimise an objective function that is expensive to evaluate
go to full definition
ai:DecisionTree: Technique for which inference is encoded as paths from the root to a leaf node in a tree structure
go to full definition
Capabilities
[=Capability=] represents the use of a technique to achieve some technical goal or objective, and is associated using [=hasCapability=]. It represents the middle level of technical details within the conceptual model used in this extension to describe 'AI technology'. By itself, a capability is not useful to describe what the end-goal of using the AI technology is, but it is useful to describe what the AI technology is used for in the context of achieving an end-goal.
An implementation of AI technology can be developed with a capability - for example as a service or as software, which can be used in a stand-alone manner or be integrated into a larger AI system. Therefore, AI capabilities can occur as both components and systems, and be involved in processes directly or indirectly in this manner. A capability can also involve the use of other capabilities in a composite or combined manner.
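As a minimal sketch of how a capability fits between technique and purpose, consider the phone-unlock example: the capability states the technical goal (recognising faces), while the purpose states why it is used. The `ex:` names are hypothetical, and this is an illustrative sketch rather than normative usage.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .

# An AI system whose technical goal is recognising faces, implemented
# using some underlying neural network technique
ex:PhoneUnlockSystem a ai:AISystem ;
    ai:hasCapability ai:FaceRecognition ;
    ai:hasTechnique ai:NeuralNetwork .

# The broader aim - why face recognition is used - is stated as a purpose
ex:PhoneUnlockProcess a dpv:Process ;
    dpv:hasPurpose dpv:IdentityVerification ;
    dpv:isImplementedUsingTechnology ex:PhoneUnlockSystem .
```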
ai:AudioCapability: Capabilities related to the processing and generation of audio
go to full definition
ai:FaceRecognition: Capability involving automatic pattern recognition for comparing stored images of human faces with the image of an actual face, indicating any matching, if it exists, and any data, if they exist, identifying the person to whom the face belongs
go to full definition
ai:GestureRecognition: Capability for recognising human gestures
go to full definition
ai:ImageRecognition: Capability for image classification process that classifies object(s), pattern(s) or concept(s) in an image
go to full definition
ai:HumanOrientedCapability: Capabilities that are inherently about humans or oriented towards human characteristics and activities
go to full definition
ai:BehaviourAnalysis: Capability of a system in analysing people's behaviour
go to full definition
ai:BiometricCapability: Capability involving processing of biometric data or related to biometrics
go to full definition
ai:BiometricCategorisation: Capability involving assigning natural persons to specific categories based on their biometric data
go to full definition
ai:BiometricIdentification: Capability involving automated recognition of physical, physiological and behavioural human features such as the face, eye movement, body shape, voice, prosody, gait, posture, heart rate, blood pressure, odour, or keystroke characteristics, for the purpose of establishing an individual's identity by comparing biometric data of that individual to stored biometric data of individuals in a reference database, irrespective of whether the individual has given their consent or not
go to full definition
ai:LocalBiometricIdentification: Capability involving biometric identification carried out locally
go to full definition
ai:PostTimeBiometricIdentification: Capability involving biometric identification carried out after the fact, i.e. not in real-time or instantaneously
go to full definition
ai:RealTimeBiometricIdentification: Capability involving biometric identification carried out in real-time or instantaneously
go to full definition
ai:RemoteBiometricIdentification: Capability involving biometric identification carried out remotely
go to full definition
ai:FaceRecognition: Capability involving automatic pattern recognition for comparing stored images of human faces with the image of an actual face, indicating any matching, if it exists, and any data, if they exist, identifying the person to whom the face belongs
go to full definition
ai:GestureRecognition: Capability for recognising human gestures
go to full definition
ai:EmotionRecognition: Capability for identifying and categorizing emotions expressed in a piece of text, speech, video or image or combination thereof
go to full definition
ai:BiometricEmotionRecognition: Capability for recognising emotions based on biometric information
go to full definition
ai:LieDetection: Capability to detect lies in the context of human speech, behaviour, information, or activities
go to full definition
ai:PersonalityTraitAnalysis: Capability for determining and analysing people's personality traits
go to full definition
ai:Profiling: Capability where AI is used to construct a profile of an individual (human) or a group of individuals
go to full definition
ai:SentimentAnalysis: Capability for computationally identifying and categorizing opinions expressed in a piece of text, speech or image, to determine a range of feeling such as from positive to negative
go to full definition
ai:SpeakerRecognition: Capability of recognising speaker(s) in audio recordings
go to full definition
ai:SpeechRecognition: Capability of converting a speech signal to a representation of the content of the speech
go to full definition
ai:InformationRetrieval: Capability for retrieving relevant documents or parts of documents from a dataset, typically based on keyword or natural language queries
go to full definition
ai:AutomaticSummarisation: Capability for shortening a portion of content such as text while retaining important semantic information
go to full definition
ai:ContentBasedRetrieval: Capability for retrieval of information using the actual content to identify, select, filter, and provide results
go to full definition
ai:ContextAwareRetrieval: Capability for retrieval of information that takes into account the user's context (e.g. location, time, device, or activity) to provide more relevant results
go to full definition
ai:MultiModalRetrieval: Capability for retrieval of information using multiple modalities such as text, images, audio, and video and supporting cross-modal queries such as taking text as input to search images
go to full definition
ai:MusicInformationRetrieval: Capability for retrieving, analyzing, and categorizing music-related information such as audio files, melodies, or lyrics using audio features, metadata, and user queries
go to full definition
ai:AutomaticSummarisation: Capability for shortening a portion of content such as text while retaining important semantic information
go to full definition
ai:DialogueManagement: Capability for choosing the appropriate next move in a dialogue based on user input, the dialogue history and other contextual knowledge to meet a desired goal
go to full definition
ai:MachineTranslation: Capability for automated translation of text or speech from one natural language to another using a computer system
go to full definition
ai:NamedEntityRecognition: Capability for recognizing and labelling the denotational names of entities and their categories for sequences of words in a stream of text or speech
go to full definition
ai:NaturalLanguageGeneration: Capability for converting data carrying semantics into natural language
go to full definition
ai:PartOfSpeechTagging: Capability for assigning a category (e.g. verb, noun, adjective) to a word based on its grammatical properties
go to full definition
ai:QuestionAnswering: Capability for determining the most appropriate answer to a question provided in natural language
go to full definition
ai:RelationshipExtraction: Capability for identifying relationships among entities mentioned in a text
go to full definition
ai:SentimentAnalysis: Capability for computationally identifying and categorizing opinions expressed in a piece of text, speech or image, to determine a range of feeling such as from positive to negative
go to full definition
AI Systems and Models
[=AISystem=] is defined by ISO/IEC 22989:2022 as "An engineered system that generates outputs such as content, forecasts, recommendations or decisions for a given set of human-defined objectives", and by the OECD as "A machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment." Or simply, it represents a 'system' which uses 'AI technologies'.
The property [=hasAISystem=] associates the use of an AI system in context. It is a specialised form of `dpv:isImplementedUsingTechnology` which indicates that a process is being implemented through the use of the stated technology. The components of an AI system can be described through the use of concepts provided in this ([[AI]]) extension as well as through the [[TECH]] extension.
[=Model=] is defined as "a physical, mathematical or otherwise logical representation of a system, entity, phenomenon, process or data involving the use of AI techniques". Or simply, it represents a 'model' of something using 'AI technologies'. The property [=hasModel=] associates a model with a context, such as to indicate a particular AI system utilises the specified model. To specifically represent General-Purpose AI (GPAI) models, the concept [=GPAIModel=] and relation [=hasGPAIModel=] are provided.
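A hedged sketch of these relations in Turtle (the `ex:` names are hypothetical placeholders for use-case-specific resources):

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .

# A process implemented through an AI system which utilises a GPAI model
ex:CustomerSupportChat a dpv:Process ;
    ai:hasAISystem ex:ChatbotSystem .

ex:ChatbotSystem a ai:AISystem ;
    ai:hasGPAIModel ex:FoundationModel .

ex:FoundationModel a ai:GPAIModel .
```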
The below taxonomy provides additional concepts based on categorisation of [=AISystem=] and [=Model=] in different contexts.
ai:AGI: Type of AI system that addresses a broad range of tasks with a satisfactory level of performance
go to full definition
ai:CognitiveComputing: Category of AI systems that enables people and machines to interact more naturally
go to full definition
ai:ExpertSystem: AI system that accumulates, combines and encapsulates knowledge provided by a human expert or experts in a specific domain to infer solutions to problems
go to full definition
ai:GPAIModel: A model that displays generality in terms of capabilities and potential applications
go to full definition
ai:IntelligentControlSystem: Category of AI systems which implement intelligent control principles for real-world applications by using AI capabilities and techniques
go to full definition
ai:MachineLearningModel: Mathematical construct that generates an inference or prediction based on input data or information
go to full definition
ai:MachineLearningPlatform: Technology platform for developing, deploying, and managing machine learning models and resources
go to full definition
ai:NarrowAI: Type of AI system that is focused on defined tasks to address a specific problem i.e. it addresses a narrow scope of tasks and problems
go to full definition
ai:Robot: An automation system with actuators that performs intended tasks in the physical world, by means of sensing its environment and a software control system
go to full definition
ai:IndustrialRobot: A robot or robotic system for use in industrial automation applications
go to full definition
ai:ServiceRobot: A robot or robotic system in personal use or professional use that performs useful tasks for humans or equipment
go to full definition
ai:SocialRobot: A robot or robotic system with social interaction functions
go to full definition
The DPVCG is interested in the modelling of 'Agents' in the context of AI technologies, including how they interact with systems and other agents, and how they operate through the use of AI technologies. For this, we welcome proposals and participation.
Data
The concept [=Data=] is a broad generic term for describing data involved in the context of AI technology. At the moment, this includes three categories - [=TrainingData=], [=ValidationData=], and [=TestingData=]. The DPVCG welcomes proposals and participation to further enhance this taxonomy.
[=Data=] extends `dpv:Data`, and can be associated with the property [=hasData=] which is a specialised form of `dpv:hasData` to indicate the specified data is involved in context of an AI technology. To specifically indicate the contextual involvement of data within AI development, the properties [=hasTrainingData=], [=hasTestingData=], and [=hasValidationData=] are provided.
To indicate the involvement of personal data, the concept `dpv:PersonalData` should be used along with its relation `dpv:hasPersonalData`. The [[DPV]] taxonomy contains specific concepts to model sensitive data - including data related to confidentiality and intellectual property (IP) - and the [[PD]] extension provides a taxonomy of personal data categories that can be used to indicate their involvement in AI technologies.
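A minimal sketch combining the data role relations with personal data (the `ex:` names are hypothetical):

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .

# A model with distinct training and testing datasets
ex:FaceModel a ai:Model ;
    ai:hasTrainingData ex:ImageDataset ;
    ai:hasTestingData ex:HeldOutImageDataset .

# The training dataset involves personal data (facial images)
ex:ImageDataset a ai:TrainingData ;
    dpv:hasPersonalData ex:FacialImages .

ex:FacialImages a dpv:PersonalData .
```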
ai:Data: Data involved in the development and use of an AI system or model
go to full definition
ai:TestingData: Data involved in the testing of an AI system or model
go to full definition
ai:TrainingData: Data involved in the training of an AI system or model
go to full definition
ai:ValidationData: Data involved in the validation of an AI system or model
go to full definition
Risks
The concept [=RiskConcept=] in this extension extends `dpv:RiskConcept` to represent risk sources, risks, consequences, and impacts specific to the development, use, or operation of AI. As with the [[RISK]] extension, the risk concepts presented here can take on different roles in different use-cases - for example, what is a risk source in one scenario could be the consequence in another. The relations `risk:hasRiskSource`, `dpv:hasRisk`, `dpv:hasConsequence`, and `dpv:hasImpact` are useful to indicate the specific interpretation and role of the AI risk concepts in a scenario.
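As an illustrative sketch of these roles (the `ex:` names are hypothetical), the same AI risk concepts can be assigned different roles through the relations used:

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix ai:   <https://w3id.org/dpv/ai#> .
@prefix risk: <https://w3id.org/dpv/risk#> .
@prefix ex:   <https://example.com/ns#> .

# Here data poisoning acts as the risk source, and model bias is the
# risk of concern arising from it
ex:ChatbotSystem a ai:AISystem ;
    risk:hasRiskSource ex:PoisoningEvent ;
    dpv:hasRisk ex:BiasedModelRisk .

ex:PoisoningEvent a ai:DataPoisoning .
ex:BiasedModelRisk a ai:ModelBias .
```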
The AI Risk Concepts are broadly categorised according to the following:
[=DataRisk=] - Risk associated with data used or produced or otherwise involved in the context of AI
[=SecurityAttack=] - Risks or issues associated with security attacks related to AI technologies, models, and systems
[=ModelRisk=] - Risks associated with AI Models
[=AISystemRisk=] - Risks associated with AI Systems
[=UserRisk=] - Risks associated with Users of AI Systems
[=AIBias=] - Bias associated with development, use, or other activities involving an AI technology or system
Data Risks
[=DataRisk=] represents risks associated with the data involved in AI technologies. To represent these risks in the context of the role the data is playing (training, testing, validation), the same set of data risks is expressed for each of the three data categories to accurately represent both the origin and occurrence of the risk.
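A short sketch of the same underlying issue (bias) expressed per data role, so that both origin and occurrence are captured (the `ex:` names are hypothetical):

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .

# The same kind of issue, recorded separately for each data role
ex:AppraisalModel a ai:Model ;
    dpv:hasRisk ex:BiasInTestingData, ex:BiasInValidationData .

ex:BiasInTestingData a ai:TestingDataBias .
ex:BiasInValidationData a ai:ValidationDataBias .
```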
ai:InputDataRisk: Risks and risk concepts related to input data
go to full definition
ai:InputDataBias: Concept representing input data containing or potentially containing bias
go to full definition
ai:InputDataInaccurate: Concept representing input data being inaccurate
go to full definition
ai:InputDataInappropriate: Concept representing input data being inappropriate
go to full definition
ai:InputDataIncomplete: Concept representing input data being incomplete
go to full definition
ai:InputDataInconsistent: Concept representing input data being inconsistent
go to full definition
ai:InputDataMisclassified: Concept representing input data being misclassified
go to full definition
ai:InputDataMisinterpretation: Concept representing input data being misinterpreted
go to full definition
ai:InputDataNoise: Concept representing input data containing noise
go to full definition
ai:InputDataOutdated: Concept representing input data being outdated
go to full definition
ai:InputDataSelectionError: Concept representing an error in input data selection
go to full definition
ai:InputDataSparse: Concept representing input data being sparse
go to full definition
ai:InputDataUnrepresentative: Concept representing input data being unrepresentative
go to full definition
ai:InputDataUnstructured: Concept representing input data being unstructured
go to full definition
ai:InputDataUnverified: Concept representing input data being unverified
go to full definition
ai:TestingDataRisk: Risks and risk concepts related to testing data
go to full definition
ai:TestingDataBias: Concept representing testing data containing or potentially containing bias
go to full definition
ai:TestingDataInaccurate: Concept representing testing data being inaccurate
go to full definition
ai:TestingDataInappropriate: Concept representing testing data being inappropriate
go to full definition
ai:TestingDataIncomplete: Concept representing testing data being incomplete
go to full definition
ai:TestingDataInconsistent: Concept representing testing data being inconsistent
go to full definition
ai:TestingDataMisclassified: Concept representing testing data being misclassified
go to full definition
ai:TestingDataMisinterpretation: Concept representing testing data being misinterpreted
go to full definition
ai:TestingDataNoise: Concept representing testing data containing noise
go to full definition
ai:TestingDataOutdated: Concept representing testing data being outdated
go to full definition
ai:TestingDataSelectionError: Concept representing an error in testing data selection
go to full definition
ai:TestingDataSparse: Concept representing testing data being sparse
go to full definition
ai:TestingDataUnrepresentative: Concept representing testing data being unrepresentative
go to full definition
ai:TestingDataUnstructured: Concept representing testing data being unstructured
go to full definition
ai:TestingDataUnverified: Concept representing testing data being unverified
go to full definition
ai:ValidationDataRisk: Risks and risk concepts related to validation data
go to full definition
ai:ValidationDataBias: Concept representing validation data containing or potentially containing bias
go to full definition
ai:ValidationDataInaccurate: Concept representing validation data being inaccurate
go to full definition
ai:ValidationDataInappropriate: Concept representing validation data being inappropriate
go to full definition
ai:ValidationDataIncomplete: Concept representing validation data being incomplete
go to full definition
ai:ValidationDataInconsistent: Concept representing validation data being inconsistent
go to full definition
ai:ValidationDataMisclassified: Concept representing validation data being misclassified
go to full definition
ai:ValidationDataMisinterpretation: Concept representing validation data being misinterpreted
go to full definition
ai:ValidationDataNoise: Concept representing validation data containing noise
go to full definition
ai:ValidationDataOutdated: Concept representing validation data being outdated
go to full definition
ai:ValidationDataSelectionError: Concept representing an error in validation data selection
go to full definition
ai:ValidationDataSparse: Concept representing validation data being sparse
go to full definition
ai:ValidationDataUnrepresentative: Concept representing validation data being unrepresentative
go to full definition
ai:ValidationDataUnstructured: Concept representing validation data being unstructured
go to full definition
ai:ValidationDataUnverified: Concept representing validation data being unverified
go to full definition
Bias
The bias concepts represented here are specific to AI; generic bias concepts, as well as discrimination impact concepts, are provided in the [[RISK]] extension. While we are interested in further expanding these concepts, the following external sources should be of interest:
DocBiasO - an ontology-driven approach to support the documentation of bias in data, which has a larger expansive categorisation of bias and provides additional concepts and properties to model specifics such as ethnicities and measurements which are useful in bias measurement and documentation.
ai:AutomationBias: Bias that occurs due to the propensity of humans to favour suggestions from automated decision-making systems and to ignore contradictory information made without automation, even if it is correct
go to full definition
ai:DataBias: Bias that occurs due to unaddressed data properties that lead to AI systems that perform better or worse for different groups
go to full definition
ai:DataAggregationBias: Bias that occurs from aggregating data covering different groups of objects that might have different statistical distributions which introduce bias into the data used to train AI systems
go to full definition
ai:DataLabelsAndLabellingProcessBias: Bias that occurs due to the labelling process itself introducing societal or cognitive biases
go to full definition
ai:DistributedTrainingBias: Bias that occurs due to distributed machine learning having different sources of data that do not have the same distribution of feature space
go to full definition
ai:MissingFeaturesAndLabelsBias: Bias that occurs when features are missing from individual training samples
go to full definition
ai:NonRepresentativeSamplingBias: Bias that occurs if a dataset is not representative of the intended deployment environment, where the model learns biases based on the ways in which the data is non-representative
go to full definition
ai:EngineeringDecisionBias: Bias that occurs due to machine learning model architectures - encompassing all model specifications, parameters and manually designed features
go to full definition
ai:AlgorithmSelectionBias: Bias that occurs from the selection of machine learning algorithms built into the AI system which introduce unwanted bias in predictions made by the system because the type of algorithm used introduces a variation in the performance of the ML model
go to full definition
ai:FeatureEngineeringBias: Bias that occurs from steps such as encoding, data type conversion, dimensionality reduction and feature selection which are subject to choices made by the AI developer and introduce bias in the ML model
go to full definition
ai:HyperparameterTuningBias: Bias that occurs from hyperparameters defining how the model is structured and which cannot be directly trained from the data like model parameters, where hyperparameters affect the model functioning and accuracy of the model
go to full definition
ai:InformativenessBias: Bias that occurs when, for some groups, the mapping between inputs present in the data and outputs is more difficult to learn, and where a model that only has one feature set available can be biased against the group whose relationships are difficult to learn from available data
go to full definition
ai:ModelBias: Bias that occurs when ML uses functions like a maximum likelihood estimator to determine parameters, and there is data skew or under-representation present in the data, where the maximum likelihood estimation tends to amplify any underlying bias in the distribution
go to full definition
ai:ModelInteractionBias: Bias that occurs from the structure of a model to create biased predictions
go to full definition
ai:ModelExpressivenessBias: Bias that occurs from the number and nature of parameters in a model, as well as the neural network topology, which affect the expressiveness of the model, where any feature that affects model expressiveness differently across groups can introduce bias
go to full definition
Security Attacks
ai:AdversarialAttack: Inputs designed to cause the model to make a mistake
go to full definition
ai:DataPoisoning: Attack trying to manipulate the training dataset
go to full definition
ai:ModelEvasion: An input which seems normal to a human but is wrongly classified by ML models
go to full definition
ai:ModelInversion: A type of attack on AI models, in which access to a model is abused to infer information about the training data
go to full definition
Overview of Risk Concepts
The below table provides suggestions for the role each concept can be used for in the context of risk assessment, and how they can be categorised within the conventional 'CIA security model'. For example, [=AdversarialAttack=] can be used as a risk source (i.e. it can cause further issues to arise), a risk (i.e. it is a risk of concern), or as a consequence (i.e. it can occur due to another risk), and it is classified as affecting 'integrity' in the CIA model.
This table is based on a similar table within the [[RISK]] extension which provides a detailed taxonomy of concepts and the potential roles they can take across use-cases.
The concept [=Measure=] represents measures specifically associated with AI technologies to address AI-related risks. While the [[DPV]] and [[RISK]] extension provide the relevance and modelling of measures along with detailed taxonomies, this concept is useful to represent the measures developed and specifically used for AI technologies. The DPVCG welcomes proposals and participation to further expand the taxonomy of measures.
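As a hedged illustration of the risk roles described above, the following Turtle sketch records [=AdversarialAttack=] as a risk of concern for a hypothetical AI system. The `ex:` namespace and instance are illustrative assumptions, and the use of `dpv:hasRisk` here is one possible modelling choice, not normative guidance:

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ai:  <https://www.w3id.org/dpv/ai#> .
@prefix ex:  <https://example.com/ns#> .  # hypothetical namespace

# Illustrative only: an adversarial attack recorded as a risk
# of concern for a hypothetical AI system
ex:MyAISystem dpv:hasRisk ai:AdversarialAttack .
```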
Lifecycle
[=LifecycleStage=] models the lifecycle of AI technologies from its inception to deployment, use, and retirement. While we use the term 'lifecycle' here, these stages are also useful in other similar contexts such as 'AI Value Chain' and 'AI Supply Chain'. The AI-specific lifecycle is extended from the concept `tech:LifecycleStage` defined in the [[TECH]] extension to model lifecycle and stages of technologies in general. It can therefore be used with the existing relation `tech:hasLifecycleStage` to denote its applicability or involvement.
ai:ContinuousValidationStage: The stage in the lifecycle where there is continuous learning within the AI system by incremental training on an ongoing basis while the system is running in production
go to full definition
ai:DeploymentStage: The stage in the lifecycle where the AI system is installed, released or configured for deployment and operation in a target environment
go to full definition
ai:DesignStage: The stage in the lifecycle where designs are created for the AI system
go to full definition
ai:DevelopmentStage: The stage in the lifecycle where the development and creation of the system occurs, signalling upon completion that it is ready for verification and validation
go to full definition
ai:InceptionStage: The stage in the lifecycle where inception regarding AI occurs and one or more stakeholders decide to turn an idea into a tangible system
go to full definition
ai:OperationStage: The stage in the lifecycle where an AI system is running and generally available for operations
go to full definition
ai:IncidentMonitoringStage: The stage in the lifecycle where an AI system is actively being monitored for incidents
go to full definition
ai:RepairStage: The stage in the lifecycle where an AI system is being repaired due to suspected incidents or incidents that have occurred
go to full definition
ai:UpdateStage: The stage in the lifecycle where an AI system is being or has been updated
go to full definition
ai:ReevaluationStage: The stage in the lifecycle where the AI system is reevaluated after the operation and monitoring stage based on the operations of the AI system
go to full definition
ai:RetirementStage: The stage in the lifecycle where the AI system is retired and becomes obsolete
go to full definition
ai:DecomissionStage: The stage in the lifecycle where the AI system is being decommissioned as part of retirement
go to full definition
ai:DiscardStage: The stage in the lifecycle where the AI system is being discarded as part of retirement
go to full definition
ai:ReplaceStage: The stage in the lifecycle where the AI system is being replaced as part of retirement
go to full definition
ai:ValidationStage: The stage in the lifecycle where the AI system is validated for requirements and objectives for an intended use or application
go to full definition
ai:VerificationStage: The stage in the lifecycle where the AI system is being verified to satisfy requirements and meet objectives
go to full definition
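As a minimal sketch of the usage described above, a hypothetical AI system in its operation stage could be annotated with `tech:hasLifecycleStage` as follows. The `ex:` namespace and instance are illustrative, and the `tech:` namespace IRI is assumed from the [[TECH]] extension:

```turtle
@prefix tech: <https://w3id.org/dpv/tech#> .  # assumed TECH namespace
@prefix ai:   <https://www.w3id.org/dpv/ai#> .
@prefix ex:   <https://example.com/ns#> .     # hypothetical namespace

# Illustrative only: a hypothetical AI system currently in operation
# and being actively monitored for incidents
ex:MyAISystem tech:hasLifecycleStage ai:OperationStage ,
                                     ai:IncidentMonitoringStage .
```

A system may be annotated with multiple stages simultaneously, as above, since stages such as operation and incident monitoring can overlap in practice.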
A technical and scientific field devoted to the engineered system that generates outputs such as content, forecasts, recommendations or decisions for a given set of human-defined objectives
An engineered system that generates outputs such as content, forecasts, recommendations or decisions for a given set of human-defined objectives (ISO/IEC 22989:2023 definition); or A machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment (OECD 2024 definition)
Bias that occurs from the selection of machine learning algorithms built into the AI system which introduce unwanted bias in predictions made by the system because the type of algorithm used introduces a variation in the performance of the ML model
Bias that occurs due to the propensity for humans to favour suggestions from automated decision-making systems and to ignore contradictory information made without automation, even if it is correct
Capability involving automated recognition of physical, physiological and behavioural human features such as the face, eye movement, body shape, voice, prosody, gait, posture, heart rate, blood pressure, odour, keystrokes characteristics, for the purpose of establishing an individual’s identity by comparing biometric data of that individual to stored biometric data of individuals in a reference database, irrespective of whether the individual has given its consent or not
Capability or use of AI to achieve a technical goal or objective
Usage Note
This concept refers to the application of an AI technique to achieve a technical goal or function, and is necessary to distinguish the 'algorithm' (ai:Technique) from the 'application' (ai:Capability) and 'goal' (dpv:Purpose)
Capability for retrieval of information that takes into account the user's context such as e.g., location, time, device, or activity to provide more relevant results
The stage in the lifecycle where there is continuous learning within the AI system by incremental training on an ongoing basis while the system is running in production
Bias that occurs from aggregating data covering different groups of objects that might have different statistical distributions which introduce bias into the data used to train AI systems
The stage in the lifecycle where the development and creation of the system occurs, signalling upon completion that it is ready for verification and validation
Capability for choosing the appropriate next move in a dialogue based on user input, the dialogue history and other contextual knowledge to meet a desired goal
AI system that accumulates, combines and encapsulates knowledge provided by a human expert or experts in a specific domain to infer solutions to problems
Capability involving automatic pattern recognition for comparing stored images of human faces with the image of an actual face, indicating any matching, if it exists, and any data, if they exist, identifying the person to whom the face belongs
Bias that occurs from steps such as encoding, data type conversion, dimensionality reduction and feature selection which are subject to choices made by the AI developer and introduce bias in the ML model
Bias that occurs from hyperparameters defining how the model is structured and which cannot be directly trained from the data like model parameters, where hyperparameters affect the model functioning and accuracy of the model
Bias that occurs when, for some groups, the mapping between the inputs present in the data and the outputs is more difficult to learn, where a model that only has one feature set available can be biased against the group whose relationships are difficult to learn from the available data
Usage Note
This can happen when some features are highly informative about one group, while a different set of features is highly informative about another group. If this is the case, then a model that only has one feature set available can be biased against the group whose relationships are difficult to learn from the available data
Bias that occurs when ML uses functions like a maximum likelihood estimator to determine parameters, and there is data skew or under-representation present in the data, where the maximum likelihood estimation tends to amplify any underlying bias in the distribution
Bias that occurs from the number and nature of parameters in a model, as well as the neural network topology, which affect the expressiveness of the model, where any feature that affects model expressiveness differently across groups can introduce bias
A type of attack on AI models, in which access to a model is abused to infer information about the training data
Usage Note
Source: HLEG Assessment List for Trustworthy Artificial Intelligence (ALTAI), https://digital-strategy.ec.europa.eu/en/library/assessment-list-trustworthy-artificial-intelligence-altai-self-assessment
Capability for retrieval of information using multiple modalities such as text, images, audio, and video and supporting cross-modal queries such as taking text as input to search images
Capability for retrieving, analyzing, and categorizing music-related information such as audio files, melodies, or lyrics using audio features, metadata, and user queries
Bias that occurs if a dataset is not representative of the intended deployment environment, where the model learns biases based on the ways in which the data is non-representative
An automation system with actuators that performs intended tasks in the physical world, by means of sensing its environment and a software control system
Capability for computationally identifying and categorizing opinions expressed in a piece of text, speech or image, to determine a range of feeling such as from positive to negative
The underlying technological algorithm, method, or process that forms the technique for using or applying AI
Usage Note
This concept refers to the foundational computational implementation and is necessary to distinguish the 'algorithm' (ai:Technique) from the 'application' (ai:Capability) and 'goal' (dpv:Purpose)
DPV uses the following terms from [[RDF]] and [[RDFS]] with their defined meanings:
rdf:type to denote a concept is an instance of another concept
rdfs:Class to denote a concept is a Class or a category
rdfs:subClassOf to specify the concept is a subclass (subtype, sub-category, subset) of another concept
rdf:Property to denote a concept is a property or a relation
The following external concepts are re-used within DPV:
External
Future Work
Funding Acknowledgements
Funding Sponsors
The DPVCG was established as part of the SPECIAL H2020 Project, which received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 731601 from 2017 to 2019.
Harshvardhan J. Pandit was funded to work on DPV from 2020 to 2022 by the Irish Research Council's Government of Ireland Postdoctoral Fellowship Grant#GOIPD/2020/790.
The ADAPT SFI Centre for Digital Media Technology is funded by Science Foundation Ireland through the SFI Research Centres Programme and is co-funded under the European Regional Development Fund (ERDF) through Grant#13/RC/2106 (2018 to 2020) and Grant#13/RC/2106_P2 (2021 onwards).
Funding Acknowledgements for Contributors
The contributions of Delaram Golpayegani have received funding through the PROTECT ITN Project from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 813497, in particular through the development of the AI Risk Ontology (AIRO) and Vocabulary of AI Risks (VAIR), which have been integrated into this extension.
The contributions of Harshvardhan J. Pandit and Delaram Golpayegani have been made with the financial support of Science Foundation Ireland under Grant Agreement No. 13/RC/2106_P2 at the ADAPT SFI Research Centre.