Data Classification - Foundation of Every Data Protection Strategy
Data classification categorizes information by its protection requirements: public, internal, confidential, strictly confidential (or subject to secrecy requirements). The classification scheme forms the basis for DLP, access rights, encryption, and retention periods. Microsoft Purview Information Protection uses sensitivity labels and automatic classification (trainable classifiers, pattern matching). BSI IT-Grundschutz distinguishes the protection needs 'normal', 'high', and 'very high'. ISO 27001: Annex A.8.2.
Data classification is the first step in any serious data protection strategy: only by knowing which data is sensitive and to what degree can it be adequately protected.
Classification Levels
Typical corporate classification levels:
Level 1: Public:
→ Freely accessible information
→ Examples: marketing materials, press releases, public website
→ Damage in case of loss: none (already public)
→ Measures: none required
Level 2: Internal:
→ For employees only
→ Examples: internal memos, general process documentation
→ Damage in case of loss: low (gives outsiders no competitive advantage)
→ Measures: no public upload, must remain internal
Level 3: Confidential:
→ Restricted group of individuals
→ Examples: Customer contracts, financial reports, HR data
→ Damage in case of loss: significant (GDPR, competition, reputation)
→ Measures: Encryption, access log, no transmission without TLS
Level 4: Strictly Confidential:
→ Smallest possible group
→ Examples: M&A documents, financial statements prior to publication,
C-suite personnel files, security audit reports
→ Damage in case of loss: threatens business survival or is criminally relevant
→ Measures: MFA access, DLP, no cloud storage, physical backup
BSI protection needs assessment (three categories):
Normal: Damage is limited and manageable (standard measures suffice)
High: Damage can be considerable (security measures must be stepped up)
Very High: Damage can reach existentially threatening, catastrophic proportions (maximum protection)
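The four corporate levels and their measures can be captured as a small lookup table. The following is a minimal Python sketch, not a reference implementation: the names `Level`, `REQUIRED_MEASURES`, and `missing_measures`, as well as the measure identifiers, are illustrative assumptions derived from the list above. Measures are taken per level exactly as listed; whether higher levels should also inherit the measures of lower levels is a policy decision the text leaves open.

```python
from enum import IntEnum


class Level(IntEnum):
    """Corporate classification levels, ordered by sensitivity."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    STRICTLY_CONFIDENTIAL = 4


# Measures per level as listed in the text (identifiers are hypothetical).
REQUIRED_MEASURES = {
    Level.PUBLIC: set(),
    Level.INTERNAL: {"no_public_upload"},
    Level.CONFIDENTIAL: {"encryption", "access_log", "tls_only"},
    Level.STRICTLY_CONFIDENTIAL: {"mfa", "dlp", "no_cloud_storage", "physical_backup"},
}


def missing_measures(level: Level, implemented: set) -> set:
    """Return the measures required for `level` that are not yet implemented."""
    return REQUIRED_MEASURES[level] - implemented


print(sorted(missing_measures(Level.CONFIDENTIAL, {"encryption"})))
# prints ['access_log', 'tls_only']
```

Using an `IntEnum` keeps the levels comparable, so a check such as `Level.CONFIDENTIAL < Level.STRICTLY_CONFIDENTIAL` works without extra code.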
Classification Schemes
Various industry standards:
Government classification (NATO/German authorities):
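Unclassified / VS-NUR FÜR DEN DIENSTGEBRAUCH (VS-NfD, for official use only) / VS-VERTRAULICH (confidential) / GEHEIM (secret) / STRENG GEHEIM (top secret)
→ Legally regulated (in Germany by the Classified Information Directive, Verschlusssachenanweisung)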
→ Relevant for KRITIS operators (cooperation with authorities)
GDPR relevance:
→ Not directly a classification system, but implied:
→ Personal data: Confidential (min.)
→ Special categories (Art. 9 GDPR): Strictly Confidential
(Health, biometrics, religion, political opinion, etc.)
→ Anonymized data: Public or Internal
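The GDPR mapping above works as a floor: a creator may choose a higher level, but never a lower one. A minimal sketch of that rule, assuming a hypothetical `DataCategory` enum and `GDPR_FLOOR` table (neither is defined by the GDPR itself):

```python
from enum import Enum, IntEnum


class Level(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    STRICTLY_CONFIDENTIAL = 4


class DataCategory(Enum):
    ANONYMIZED = "anonymized"
    PERSONAL = "personal"
    SPECIAL_CATEGORY = "special_category"  # Art. 9 GDPR


# Minimum level per category, following the mapping in the text.
GDPR_FLOOR = {
    DataCategory.ANONYMIZED: Level.PUBLIC,
    DataCategory.PERSONAL: Level.CONFIDENTIAL,
    DataCategory.SPECIAL_CATEGORY: Level.STRICTLY_CONFIDENTIAL,
}


def effective_level(chosen: Level, category: DataCategory) -> Level:
    """Raise the chosen level to the GDPR floor if it is too low."""
    return max(chosen, GDPR_FLOOR[category])


print(effective_level(Level.INTERNAL, DataCategory.PERSONAL).name)
# prints CONFIDENTIAL
```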
PCI DSS (Payment Cards):
→ Cardholder Data: highest level of protection (always!)
→ Sensitive Authentication Data: must not be stored after authorization (even if encrypted)
ISO 27001 Annex A.8.2 (2013 edition; A.5.12 and A.5.13 in the 2022 edition):
→ "Information Classification" as a requirement
→ Document classification policy in the ISMS
→ All assets classified (in asset inventory!)
→ Labeling: How are documents marked?
Microsoft Purview Information Protection
Automatic classification with labels:
Sensitivity Labels:
→ Configurable in M365 (Microsoft Purview compliance portal)
→ Labels: Public / Internal / Confidential / Strictly Confidential
→ Visible: in Office apps (Word, Excel, Teams, Outlook)
Label Actions:
Encryption:
→ "Confidential" label → AES-256 encrypted (AIP)
→ Only authorized users can open
→ Even if the file is shared: encryption remains!
Watermark:
→ "CONFIDENTIAL" as a watermark in header/footer
→ Acts as a deterrent and stays visible in printouts and screenshots
DLP trigger:
→ "Strictly Confidential" label → DLP policy active
→ External sending blocked
Expiration date:
→ M&A documents: automatically downgraded from "Strictly Confidential" to "Internal" after 90 days
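How labels map to actions can be modeled as plain data. The sketch below is not the Purview API; it is a self-contained Python model of the behavior described above. `LabelActions`, `LABEL_ACTIONS`, `may_send`, and the domain `example.com` are hypothetical, and the assumption that "Strictly Confidential" also encrypts and watermarks goes beyond what the text states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabelActions:
    encrypt: bool = False
    watermark: Optional[str] = None
    block_external: bool = False  # DLP rule: no sending outside the organization


LABEL_ACTIONS = {
    "Public": LabelActions(),
    "Internal": LabelActions(),
    "Confidential": LabelActions(encrypt=True, watermark="CONFIDENTIAL"),
    # Assumption: the strictest label inherits encryption and a watermark.
    "Strictly Confidential": LabelActions(
        encrypt=True, watermark="STRICTLY CONFIDENTIAL", block_external=True
    ),
}


def may_send(label: str, recipient: str, internal_domain: str = "example.com") -> bool:
    """Return False when the label's DLP rule blocks an external recipient."""
    is_external = not recipient.lower().endswith("@" + internal_domain)
    return not (LABEL_ACTIONS[label].block_external and is_external)


print(may_send("Strictly Confidential", "partner@other.org"))  # prints False
print(may_send("Strictly Confidential", "cfo@example.com"))    # prints True
print(may_send("Confidential", "partner@other.org"))           # prints True
```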
Automatic Classification:
Trainable Classifiers (ML):
→ Upload sample documents → ML learns category
→ "Financial Reports": 50 examples → ML classifies similar ones
Pattern-based (Sensitive Info Types):
→ IBAN: automatically "Confidential"
→ Passport numbers: automatically "Confidential"
→ Internal project numbers (Regex): automatically "Internal"
Auto-Labeling Policy:
→ Documents without labels: a label is applied automatically
→ Simulation mode: first see what would be classified
→ Then activate: all matching files are labeled
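Pattern-based classification usually combines a regular expression with a validity check to keep false positives down. The following sketch illustrates the idea in plain Python; it does not reproduce Purview's built-in sensitive info types. The project number format `PRJ-` plus six digits and the function names are assumptions. The IBAN check is the standard mod-97 checksum.

```python
import re
from typing import Optional

# IBAN: country code, two check digits, then 11 to 30 alphanumerics,
# optionally written in space-separated groups of four.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b")
PROJECT_RE = re.compile(r"\bPRJ-\d{6}\b")  # hypothetical internal project number format


def iban_checksum_ok(iban: str) -> bool:
    """Mod-97 check: move the first four characters to the end, map A=10..Z=35."""
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def suggest_label(text: str) -> Optional[str]:
    """Return the label a pattern match would trigger, or None if nothing matches."""
    for match in IBAN_RE.finditer(text):
        if iban_checksum_ok(match.group().replace(" ", "")):
            return "Confidential"
    if PROJECT_RE.search(text):
        return "Internal"
    return None


print(suggest_label("Please pay to DE89 3704 0044 0532 0130 00 by Friday"))  # prints Confidential
print(suggest_label("Status update for PRJ-004217"))                          # prints Internal
print(suggest_label("Lunch menu"))                                            # prints None
```

Running such rules in a report-only mode first, as the simulation mode above suggests, shows how many files would be relabeled before anything changes.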
Classification in Practice
Challenges and Solution Approaches:
Challenge 1: User Resistance
→ "Too time-consuming"
→ Solution: Default label "Internal" (no extra click required)
→ Only upgrading the classification requires a conscious action
Challenge 2: Inconsistency
→ Same document, different labels depending on the creator
→ Solution: Clear policy + decision tree
"Does it contain customer data? → Yes → Confidential"
Challenge 3: Legacy documents
→ 100,000 old documents without labels
→ Solution: Bulk auto-labeling (Purview) + manual spot checks
→ Don't label everything at once: proceed in order of priority
Challenge 4: Cloud vs. On-Premises
→ Labels in M365 are fine, but what about on-prem file servers?
→ Solution: Azure Information Protection scanner, now the Microsoft Purview Information Protection scanner (covers on-prem files)
→ Or: SharePoint/OneDrive as the target platform
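The decision tree suggested under Challenge 2, combined with the "Internal" default from Challenge 1, could be sketched as an ordered list of questions in which the first "yes" wins. Only the customer data question comes from the text; the other two questions, the `Document` fields, and the function name are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Document:
    contains_customer_data: bool = False
    is_mna_related: bool = False
    approved_for_publication: bool = False


# Ordered from most to least sensitive: the first question answered "yes" decides.
DECISION_TREE = [
    ("Is it related to M&A?", lambda d: d.is_mna_related, "Strictly Confidential"),
    ("Does it contain customer data?", lambda d: d.contains_customer_data, "Confidential"),
    ("Is it approved for publication?", lambda d: d.approved_for_publication, "Public"),
]


def classify(doc: Document) -> str:
    """Walk the questions in order; fall back to the default label 'Internal'."""
    for _question, applies, label in DECISION_TREE:
        if applies(doc):
            return label
    return "Internal"


print(classify(Document(contains_customer_data=True)))  # prints Confidential
print(classify(Document()))                             # prints Internal
```

Because the rules are data rather than nested `if` statements, the same list can be printed as the written decision tree that employees receive, which helps keep policy and tooling consistent.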
Audit evidence (ISO 27001 A.8.2):
→ Classification policy: documented in writing
→ Asset inventory: all data assets with classification
→ Evidence: Purview Labels report (which documents, which label)
→ Training: Employees are familiar with classification levels
(Evidence: Training log)
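For the asset inventory evidence, a simple coverage figure often helps in audits: how many assets carry a label, and which ones do not. A minimal sketch, assuming the inventory has been exported as a list of records; the sample entries and the function name `classification_coverage` are made up for illustration and do not reflect a Purview report format.

```python
from collections import Counter

# Hypothetical inventory export: one record per data asset.
inventory = [
    {"asset": "CRM export", "label": "Confidential"},
    {"asset": "Press kit", "label": "Public"},
    {"asset": "Legacy file share", "label": None},
    {"asset": "HR records", "label": "Confidential"},
]


def classification_coverage(assets):
    """Return label counts, names of unlabelled assets, and the labelled share."""
    labelled = [a for a in assets if a["label"]]
    counts = Counter(a["label"] for a in labelled)
    unlabelled = [a["asset"] for a in assets if not a["label"]]
    coverage = len(labelled) / len(assets) if assets else 0.0
    return counts, unlabelled, coverage


counts, unlabelled, coverage = classification_coverage(inventory)
print(dict(counts))  # prints {'Confidential': 2, 'Public': 1}
print(unlabelled)    # prints ['Legacy file share']
print(coverage)      # prints 0.75
```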