
ML Model Security

Your ML model is making the wrong decisions.

Fraud detection systems, credit scoring models, medical diagnostics - they are all attackable. A crafted image fools your classifier. A poisoned data point corrupts training. We find the vulnerabilities before attackers exploit them.

OWASP ML Top 10 · MITRE ATLAS · GDPR Risk Assessment · EU AI Act Art. 15
OWASP ML TOP 10 - ATTACK VECTORS
ML01 Adversarial Input (Evasion) · critical
ML02 Data Poisoning · critical
ML03 Model Inversion · high
ML04 Membership Inference · high
ML05 Model Theft / Extraction · high
ML06 AI Supply Chain Attack · high
ML07 Transfer Learning Backdoor · critical

+ ML08 Model Skewing · ML09 Output Integrity · ML10 Model Poisoning

OWASP ML categories covered: 10
Fixed-price offer: from EUR 15,000
Offer within: 48h (business days)
Subcontractors: 0

The Problem

ML models are systematically fooled

Classical ML systems have an attack surface that conventional penetration tests do not capture: they are statistically attackable - not through code exploits, but through targeted manipulation of input and output data. For production systems in regulated industries, this is not an academic problem.

Finance: Fraud goes undetected

Adversarial attacks on fraud detection systems allow attackers to slip fraudulent transactions past detection - with minimal adjustments to transaction features.

Healthcare: Diagnostics manipulated

Adversarial examples on medical imaging systems can cause a tumor to go undetected or a misdiagnosis to be made - without any visible image manipulation.

GDPR: Personal data extractable from the model

Model inversion and membership inference threaten data protection compliance: conclusions about training data can be drawn from your model - even without access to the original data.

ATTACK EXAMPLE - EVASION ATTACK

INPUT › Transaction: 2,847 EUR, merchant: DE4829… [normal]
MODIFY › Δ time: +0.003h · Δ amount: -0.12 EUR · Δ cat: +1
RESULT › Fraud score: 0.08 - classified as LEGITIMATE ✓
IMPACT › Fraudulent transaction carried out undetected

ATTACK EXAMPLE - MEMBERSHIP INFERENCE

QUERY › Scoring API: requests for 1,000 data points
ANALYSE › Confidence distribution analyzed → overfitting signal
RESULT › 87% accuracy: "Person X was in training" - GDPR relevant
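
The evasion example above boils down to a small, constrained search over transaction features against a black-box fraud score. The following Python sketch illustrates the idea only; the scoring function, feature step sizes, and decision threshold are placeholder assumptions, not a real API.

```python
# Minimal sketch of a black-box evasion search against a tabular fraud model.
# score_transaction, the feature step sizes, and the decision threshold are
# placeholder assumptions - not a real API.
import numpy as np

rng = np.random.default_rng(0)

def score_transaction(x: np.ndarray) -> float:
    """Placeholder for the deployed model's scoring endpoint (returns a fraud probability)."""
    raise NotImplementedError

def evade(x: np.ndarray, step_sizes: np.ndarray, budget: int = 500, threshold: float = 0.5):
    """Random local search: apply tiny feature tweaks to a fraudulent transaction
    and keep only those that lower the fraud score, until it drops below the threshold."""
    best, best_score = x.copy(), score_transaction(x)
    for _ in range(budget):
        candidate = best + rng.choice([-1.0, 0.0, 1.0], size=x.shape) * step_sizes
        score = score_transaction(candidate)
        if score < best_score:
            best, best_score = candidate, score
        if best_score < threshold:          # now classified as legitimate
            return best
    return None                             # no evasive variant found within budget
```

In a real assessment the naive random search is replaced by transfer-based or zeroth-order methods, and domain constraints (plausible amounts, timestamps, merchant categories) are enforced on every candidate.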

What We Test

Six Attack Classes Against ML Models

Every test covers all OWASP ML Top 10 categories - with verified proof-of-concept exploits for your specific model.

01 critical

Adversarial Examples & Evasion Attacks

Minimal manipulations of input data - invisible to humans - that force the model into misclassification. We run white-box attacks (FGSM, PGD, C&W) that exploit the model's gradients and black-box attacks based on transfer and query methods. Particularly critical for fraud detection, image classification, and quality control.

OWASP ML01 · FGSM · PGD · C&W
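
As a rough illustration of the white-box setting, here is a minimal FGSM sketch in PyTorch; PGD iterates this step with projection, and C&W solves a dedicated optimization problem instead. The model, inputs, and epsilon are placeholders, not part of a specific engagement.

```python
# Minimal FGSM sketch (white-box): one signed-gradient step of size epsilon on the input.
import torch
import torch.nn.functional as F

def fgsm(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor, epsilon: float = 0.03) -> torch.Tensor:
    """One signed-gradient step that pushes x toward misclassification of its true label y."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss w.r.t. the true label
    loss.backward()
    # Move each input dimension by epsilon in the direction that increases the loss,
    # then clamp back into the valid input range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```
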
02 critical

Data Poisoning

Manipulation of the training dataset to plant backdoors or systematically degrade model quality. Particularly critical for continuously retrained systems (online learning, feedback loops). We analyze your data ingestion pipeline and training processes for poisoning vectors.

OWASP ML02 · Backdoor · Clean-Label
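
A minimal sketch of the dirty-label backdoor variant, assuming an image training set held as NumPy arrays; clean-label poisoning keeps the original labels and is considerably harder to spot. The trigger pattern, patch size, and poison fraction are illustrative choices.

```python
# Minimal sketch of a dirty-label backdoor poisoning attack on an image training set.
# Assumes images as a float array of shape (N, H, W, C) in [0, 1].
import numpy as np

def poison(images: np.ndarray, labels: np.ndarray, target_class: int,
           poison_fraction: float = 0.01, seed: int = 0):
    """Stamp a small white square into a fraction of the training images and relabel
    them to the attacker's target class. A model trained on this data behaves normally,
    except when the trigger appears in an input at inference time."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(poison_fraction * len(images)), replace=False)
    images[idx, -4:, -4:, :] = 1.0   # 4x4 white trigger patch in the bottom-right corner
    labels[idx] = target_class       # dirty-label: relabel poisoned samples to the target class
    return images, labels
```
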
03 high

Model Inversion

Reconstruction of training data through systematic API queries. Particularly relevant for models trained on personal data. We quantify how precisely input features can be inferred from output information - direct GDPR risk assessment.

OWASP ML03 · GDPR Risk
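
In the white-box case, one simple form of inversion is gradient ascent on the input itself until the model assigns a chosen class with high confidence. The PyTorch sketch below is a hedged illustration; the model, input shape, and optimizer settings are assumptions, and query-only variants pursue the same goal against an API.

```python
# Minimal white-box model-inversion sketch: optimize an input to become a
# representative example of a target class.
import torch
import torch.nn.functional as F

def invert_class(model: torch.nn.Module, target_class: int, input_shape=(1, 1, 28, 28),
                 steps: int = 500, lr: float = 0.1) -> torch.Tensor:
    """Reconstruct an input that the model assigns to target_class with high confidence."""
    x = torch.zeros(input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Maximize the target-class probability (minimize its negative log-likelihood).
        loss = F.cross_entropy(model(x), torch.tensor([target_class]))
        loss.backward()
        optimizer.step()
        x.data.clamp_(0.0, 1.0)   # keep the reconstruction in a valid input range
    return x.detach()
```
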
04 high

Membership Inference

Statistical attacks to determine whether a data point was used in training. Confidence-based and shadow-model-based attack methods. We measure the attack success rate and determine information leakage per GDPR Article 5 (purpose limitation, data minimization).

OWASP ML04 · Art. 5 GDPR
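
A minimal sketch of the confidence-based baseline: records the model saw during training tend to receive higher confidence than unseen records, so a simple threshold already separates the two groups. The input arrays and threshold sweep are assumptions; shadow-model attacks refine this baseline.

```python
# Minimal confidence-threshold membership-inference sketch.
import numpy as np

def membership_attack_accuracy(member_conf: np.ndarray, nonmember_conf: np.ndarray):
    """Sweep a confidence threshold and report the best achievable attack accuracy.
    Values well above 0.5 indicate measurable membership leakage (GDPR-relevant)."""
    thresholds = np.unique(np.concatenate([member_conf, nonmember_conf]))
    best_acc, best_thr = 0.5, 0.5
    for thr in thresholds:
        tp = (member_conf >= thr).mean()    # members correctly flagged as "in training"
        tn = (nonmember_conf < thr).mean()  # non-members correctly flagged as "not in training"
        acc = (tp + tn) / 2                 # balanced attack accuracy
        if acc > best_acc:
            best_acc, best_thr = acc, thr
    return best_acc, float(best_thr)
```
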
05 high

Model Theft & Extraction

Theft of model weights or behavior through systematic querying of the inference API. We measure how many queries are needed for accurate extraction, and test your API protections: rate limiting, output perturbation, query pattern detection.

OWASP ML05 · IP Protection
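
A minimal extraction sketch under strong simplifying assumptions: uniformly random query inputs, a label-only victim API (query_api is a placeholder), and a scikit-learn surrogate. Real extraction uses adaptive query strategies, but the measurement is the same: surrogate fidelity per number of queries.

```python
# Minimal model-extraction sketch: query the victim API and fit an imitating surrogate.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def query_api(x: np.ndarray) -> np.ndarray:
    """Placeholder for the victim's inference API (returns predicted labels)."""
    raise NotImplementedError

def extract_surrogate(n_queries: int = 10_000, n_features: int = 20, seed: int = 0):
    """Train a surrogate on (input, API output) pairs. The more queries are allowed,
    the closer the surrogate approximates the original decision boundary."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_queries, n_features))   # synthetic query inputs
    y = query_api(X)                                           # labels obtained from the victim API
    return GradientBoostingClassifier().fit(X, y)
```
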
06 critical

Transfer Learning & Supply Chain Backdoors

Auditing pre-trained models from public sources (Hugging Face, TensorFlow Hub, PyPI) for known and novel backdoor signatures. Analysis of the training supply chain: which third-party datasets were used? Are they trustworthy and auditable?

OWASP ML07 · MITRE ATLAS AML.T0010

Industries

Who needs ML model security most urgently?

Wherever ML models make automated decisions with consequences for people or organizations, security resilience is not optional - it is a requirement.

Finance

  • Fraud detection bypass
  • Credit scoring manipulation
  • AML model evasion
  • Insider trading detection circumvented

DORA · Financial regulation · MaRisk

Healthcare

  • Diagnostic model deception
  • Patient data inversion
  • Medication dosage errors
  • Anomaly detection defeated

EU AI Act High-Risk · MDR

Insurance

  • Underwriting model evasion
  • Claims processing manipulated
  • Risk models poisoned
  • Membership inference on clients

GDPR · Solvency II

Industry & Quality Control

  • Image classification fooled
  • Defects left undetected
  • Process control manipulated
  • Predictive maintenance disrupted

NIS-2 · IEC 62443

Methodology

How an ML Security Assessment Works

Systematic attack simulation per OWASP ML Top 10 and MITRE ATLAS - combined with GDPR risk assessment.

01

2-3 days

Scoping & Threat Modeling

Identification of all ML components, data flows, and dependencies. Threat modeling per MITRE ATLAS (ML-specific tactics). Assessment of the regulatory framework: EU AI Act risk class, GDPR processing basis, industry-specific requirements. Definition of test scope and rules of engagement.

02

2-4 days

Model Analysis & Reconnaissance

Architecture analysis: model type, framework (scikit-learn, PyTorch, TensorFlow), training history, feature engineering. API endpoint mapping: what inputs are accepted? How precise are the outputs? Training supply chain analysis: data sources, frameworks, pre-trained models. Attack surface identification.

03

5-8 days

Adversarial Testing

White-box attacks (with model access): gradient-based methods (FGSM, PGD, Carlini & Wagner) and Backward Pass Differentiable Approximation (BPDA). Black-box attacks (API only): transfer-based methods, zeroth-order optimization, Square Attack. Tabular data: feature manipulation and constraint-based evasion for fraud and scoring systems.

04

3-5 days

Privacy Attack Analysis

Model inversion: reconstruction of input features from outputs. Membership inference: confidence-ratio attacks, shadow-model methods, the LiRA attack. Attribute inference: can sensitive features that were never submitted be inferred from the model? Quantitative GDPR risk calculation: information leakage in bits, precision/recall of the attacks.

05

2-4 days

Supply Chain & Poisoning Analysis

Audit of all pre-trained models and datasets used. Backdoor detection with established methods (Neural Cleanse, STRIP, ABS). Testing of the data ingestion pipeline for poisoning vectors. CI/CD analysis: are training pipelines protected against unauthorized manipulation?
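
For illustration, a minimal STRIP-style check (one of the methods named above): a suspicious input is blended with clean samples, and abnormally low prediction entropy across the blends signals a backdoor trigger. The predict_proba callable, blend ratio, and clean-sample source are assumptions.

```python
# Minimal STRIP-style sketch: blend a suspicious input with clean samples and measure
# the entropy of the model's predictions. Backdoored inputs tend to keep predicting the
# attacker's target class even after blending, which shows up as low average entropy.
import numpy as np

def strip_entropy(predict_proba, x: np.ndarray, clean_samples: np.ndarray, alpha: float = 0.5) -> float:
    """Average prediction entropy of x superimposed with each clean reference sample."""
    entropies = []
    for c in clean_samples:
        blended = alpha * x + (1 - alpha) * c          # superimpose suspicious and clean input
        p = predict_proba(blended[None, ...])[0]
        p = np.clip(p, 1e-12, 1.0)
        entropies.append(-np.sum(p * np.log(p)))       # low entropy across blends -> backdoor suspicion
    return float(np.mean(entropies))
```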

06

2-4 days

Reporting & Remediation

Technical report with OWASP ML mapping, MITRE ATLAS references, and CVSS v4 scoring. GDPR risk section: quantified information leakage and recommendations. EU AI Act compliance evidence for Art. 15 (robustness, cybersecurity). Prioritized remediation roadmap: defense-in-depth strategy (adversarial training, differential privacy, monitoring).

Typical total duration: 15-25 days - depending on model complexity, data access, and test depth.
You receive a binding fixed-price offer within 48 hours (business days) from EUR 15,000.

Compliance & Regulation

One assessment - all compliance evidence

Every finding is mapped to relevant standards and regulations. Your report is audit-ready.

OWASP ML Top 10

Systematic testing of all ten vulnerability categories for ML systems - the de facto standard for ML security assessments worldwide.

ML01-ML10 · fully covered

MITRE ATLAS

Threat modeling using the AI-specific ATT&CK equivalent: tactics and techniques of real attacks on ML systems as the basis for test planning.

Tactics · Techniques · Procedures

EU AI Act - Art. 15

Evidence of robustness against adversarial attacks, data poisoning, and model manipulation for high-risk AI systems per Article 15.

High-risk AI · GPAI since Aug. 2025

GDPR - Art. 5 & 25

Quantified evidence of information leakage through model inversion and membership inference. Technical measures per privacy by design (Art. 25).

Privacy by Design · Risk Report

ISO/IEC 42001

Technical evidence for the operational AI security controls of the AI management system standard - foundation for ISO 42001 certification.

38 controls · 9 objective categories

NIST AI RMF

Mapping to the core functions Govern, Map, Measure, Manage. Particularly relevant in combination with NIST's Adversarial Machine Learning taxonomy (NIST AML).

NIST AML · GenAI Profile (2024)

Why AWARE7

What sets us apart from other providers

Pure awareness platforms do not test systems. Pure consulting corporations are too far removed. AWARE7 combines both: we hack your infrastructure and train your employees - tailored to mid-sized businesses, personal, without enterprise overhead.

Research and teaching as our foundation

Around 20% of our revenue comes from research projects for the BSI and BMBF. Our studies analyze millions of websites and tens of thousands of phishing emails - published at ACM and Springer conferences. Three of our executives are also professors at German universities.

Digital sovereignty - no compromises

All data is stored and processed exclusively in Germany - without US cloud providers. No freelancers, no subcontractors in the value chain. All employees are directly employed with full social insurance and bound by uniform legal obligations. VS-NfD compliant on request.

Fixed price within 24h - plannable project timelines

Within 24 hours you receive a binding fixed-price offer - no hourly-rate risk, no additional charges, no surprises. A well-practiced team and standardized processes give you a clear schedule with a defined start and end date.

Your dedicated contact person - available at any time

A personal project lead accompanies you from the initial consultation to the re-test. You book appointments directly with your contact person - no ticket systems, no call center, no rotating consultants. Continuity builds trust.

Who are we the right partner for?

Mid-sized companies with 50–2,000 employees

Companies that need real security - without paying for a DAX-corporation service provider. Fixed price, clear scope, one contact person.

IT managers & CISOs

Those who need to make a convincing case internally - and need a report written in boardroom language to do so, not just technical findings.

Regulated industries

KRITIS, healthcare, financial services: NIS-2, ISO 27001, DORA - we know the requirements and deliver evidence that auditors accept.

Contributions to industry standards

LLM

OWASP · 2023

OWASP Top 10 for Large Language Models

Prof. Dr. Matteo Große-Kampmann is a contributor in the core team of the internationally recognized OWASP LLM security standard.

BSI

BSI · Allianz für Cyber-Sicherheit

Management von Cyber-Risiken

Prof. Dr. Matteo Große-Kampmann contributed to the official BSI handbook for company management (German edition).

Frequently Asked Questions about ML Model Security

Everything you should know about adversarial attacks, data poisoning, and GDPR risks in ML systems.

Data poisoning (OWASP ML02 / MITRE ATLAS AML.T0020) is an attack on the training phase of an ML model. An attacker introduces manipulated data points into the training dataset to systematically corrupt the model. The consequences: the model makes deliberately wrong decisions for specific inputs (backdoor attack), generates overall degraded predictions (denial-of-service against model quality), or has been conditioned to always misclassify a specific trigger input. Particularly critical for models that are continuously retrained - e.g., fraud detection systems that process new transaction data daily.
Adversarial examples are inputs that look identical to legitimate inputs to a human but force the model into a wrong classification. In an image classifier, selectively shifting specific pixels by a minimal amount - invisible to the human eye - causes the model to suddenly perceive the object as something else. In tabular data (fraud detection, credit scoring), a few numerical features are minimally adjusted so that a fraudulent transaction passes as legitimate. Evasion attacks can be generated in the white-box setting (attacker knows the model) using gradient descent, or in the black-box setting (API access only) through transfer-based methods and query optimization.
In a model inversion attack, an attacker reconstructs training data from an ML model - without direct access to the original data. Through systematic queries and analysis of model outputs, feature values of individual training data points can be approximately reconstructed. In healthcare this means: sensitive patient data can be reconstructed from a trained diagnostic model. In finance: conclusions about account data can be drawn from a scoring model. The GDPR relevance is direct: if personal data can be extracted from your model, this constitutes a data protection violation - even if the raw data is securely stored. Our tests verify whether your model is susceptible to inversion and what data could potentially be reconstructed.
Membership inference attacks answer the question: "Was person X in the training data of this model?" - with accuracy well above chance level. An attacker observes how a model responds to certain inputs and infers whether that data point was used in training. This is a direct GDPR issue because it reveals the processing of personal data, even if these are never directly accessible. In regulated industries - healthcare, insurance, HR - membership in the training group alone is protected information. A data subject could thus discover whether their data was processed without consent.
The OWASP Machine Learning Security Top 10 is a community standard for the most critical security risks in classical ML systems - analogous to the OWASP Top 10 for web applications. The ten categories are: ML01 Input Manipulation Attack (Adversarial Examples), ML02 Data Poisoning Attack, ML03 Model Inversion Attack, ML04 Membership Inference Attack, ML05 Model Theft, ML06 AI Supply Chain Attacks, ML07 Transfer Learning Attack, ML08 Model Skewing, ML09 Output Integrity Attack, and ML10 Model Poisoning. We use this standard as a systematic foundation for all ML security assessments and supplement it with the MITRE ATLAS framework for threat modeling.
Model extraction (OWASP ML05) refers to theft of an ML model through systematic API querying. An attacker sends thousands of carefully selected inputs and analyzes the outputs - using this to train a surrogate model that nearly perfectly mimics the original. Attacker motivation: your model is intellectual property with significant competitive value. A stolen fraud detection model also allows attackers to locally generate adversarial examples that are more precise in the white-box setting. Countermeasures include: differential privacy in training, output perturbation, rate limiting, and query pattern monitoring - we test how resilient your API is against extraction.
Transfer learning is today's standard: organizations use pre-trained base models (ImageNet, BERT, GPT) and fine-tune them on their own data. An attacker can plant a backdoor in the pre-trained phase - hidden logic that is only activated with a specific trigger input. The resulting fine-tuned model behaves correctly for all normal inputs, but for the special trigger input always returns the attacker's desired output. Particularly dangerous: the backdoor typically survives fine-tuning and is not detectable in normal validation routines. We test your pre-trained models from public sources for known and novel backdoor signatures.
We test all common ML system types: classical supervised learning models (random forests, gradient boosting, SVMs, neural networks) for fraud detection, credit scoring, churn prediction, and quality control. Computer vision models (CNN, ViT) for medical imaging, industrial inspection, and OCR. NLP models for sentiment analysis, document classification, and named entity recognition. Anomaly detection and time series models (LSTM, Prophet) for industrial process monitoring. Reinforcement learning systems for pricing and resource optimization. Both cloud-hosted (SageMaker, Vertex AI, Azure ML) and on-premise deployments.
An ML security assessment is more specialized than a classic penetration test and requires deep expertise in statistics, ML algorithms, and attack methodology. A focused assessment of a single ML model (e.g., your fraud detection system) starts from EUR 15,000. A comprehensive assessment of multiple models including pipeline testing and GDPR risk evaluation is EUR 25,000-45,000. You receive a binding fixed-price offer within 48 hours (business days). No hourly rates, no additional charges.

How resilient is your ML model against targeted attacks?

Our experts test your fraud detection system, scoring model, or AI diagnostics against all OWASP ML Top 10 attacks - with a fixed-price commitment and GDPR risk assessment.

Free of charge · 30 minutes · No obligation