In this blog post, Flag Protected Text with Azure AI Content Safety, we will walk through how you can automatically detect and flag sensitive or protected text using Microsoft Azure's content safety services.

Flag Protected Text with Azure AI Content Safety is all about helping your teams move fast without accidentally exposing harmful, confidential, or non-compliant content. Instead of relying on manual review or brittle regex rules, you can plug in Azure AI and get consistent, explainable moderation decisions across your apps and workflows.

What is Azure AI Content Safety

Azure AI Content Safety is a managed service that uses large language models and classification models to analyse text and images for safety and compliance risks. Think of it as a specialised AI filter that looks at the meaning of a message, not just keywords.

For text, it can:

  • Detect harmful content (hate, violence, self-harm, sexual content)
  • Score severity levels that you can map to actions such as allow, flag, or block
  • Identify personally identifiable information (PII) and protected attributes
  • Return machine-readable decisions you can plug into your own policies

Behind the scenes, Azure uses transformer-based language models, similar to those used in modern generative AI, combined with specialised safety classifiers. The service exposes this via simple REST and SDK APIs: you send content, and it returns a safety analysis with category scores. Your software then decides what to do: allow, flag, redact, or block.
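As a rough illustration of that request/response shape, a raw REST call from Python might look like the sketch below. The endpoint and key environment variable names are placeholders, and the api-version shown is the one current at the time of writing.

```python
# Minimal sketch of the raw REST call to the text:analyze endpoint.
# CONTENT_SAFETY_ENDPOINT / CONTENT_SAFETY_KEY are placeholder variable names.
import os
import requests

endpoint = os.environ["CONTENT_SAFETY_ENDPOINT"]  # e.g. https://<resource>.cognitiveservices.azure.com
key = os.environ["CONTENT_SAFETY_KEY"]

response = requests.post(
    f"{endpoint}/contentsafety/text:analyze",
    params={"api-version": "2023-10-01"},
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json={"text": "Example message to check"},
)
response.raise_for_status()

# The response contains one severity score per category.
for item in response.json().get("categoriesAnalysis", []):
    print(item["category"], item["severity"])
```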

Why flagging protected text matters

Most organisations are now shipping features that involve user-generated or AI-generated text: chat, support bots, document editors, AI copilots, and more. That text often flows across boundaries — between systems, teams, and even customers.

Flagging protected text is essential for:

  • Compliance – avoiding accidental exposure of PII, health data, or other regulated information.
  • Brand safety – keeping hate speech, harassment, or explicit content out of your platforms.
  • Security – spotting leaks of secrets or internal-only information.
  • Human safety – detecting content about self-harm or violence so you can intervene appropriately.

The challenge is to do this without over-blocking legitimate content or adding massive review overhead. Azure AI Content Safety helps you strike that balance by giving you precise category labels and risk scores.

How Azure AI detects and flags protected text

Under the hood, Azure AI Content Safety uses a pipeline of machine learning models:

1. Text understanding with large language models

First, the input text is tokenised and passed into a transformer-based language model. This model builds a high-dimensional representation that captures grammar, context, and semantics. This is what enables it to understand that:

  • “I’m going to kill this presentation” is not the same as “I’m going to kill him”.
  • “Cancer research dataset” is not the same as “diagnosis: lung cancer for John Smith”.

2. Safety classifiers on top

Specialised safety classifiers sit on top of that representation. These classifiers have been trained on curated datasets containing examples of hate, self-harm, explicit content, criminal activity, and more. They output a probability distribution across categories and severity levels.

For example, the model might score the text as:

  • Hate – severity 0–7 (from safe to most severe)
  • Violence – severity 0–7
  • Self-harm – severity 0–7
  • Sexual content – severity 0–7

Your application can then map those levels to policy actions such as log-only, flag, or block.

3. PII and protected attributes

On top of general safety, Azure provides detectors for sensitive information through the Azure AI Language service, including PII such as:

  • Names, emails, phone numbers
  • Government IDs, financial data
  • Health-related information

When you enable PII detection, the service can not only flag that PII is present, but also identify where in the text it occurs. That makes it easy to redact or mask sensitive parts before storing or sending the content on.
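Here is a minimal sketch using the azure-ai-textanalytics SDK (the Azure AI Language client library). The endpoint and key environment variable names and the sample text are placeholders.

```python
# Sketch: detect PII and find exactly where it occurs in the text.
import os
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint=os.environ["LANGUAGE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["LANGUAGE_KEY"]),
)

documents = ["Diagnosis: lung cancer. Patient John Smith, phone 0400 000 000."]
result = client.recognize_pii_entities(documents)[0]

if not result.is_error:
    print("Redacted:", result.redacted_text)
    for entity in result.entities:
        # offset and length tell you exactly which span to mask or redact
        print(entity.category, entity.text, entity.offset, entity.length)
```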

Designing a “flag protected text” workflow

Before writing any code, it helps to sketch the workflow. A typical pattern used by CloudProinc.com.au customers looks like this:

  1. User or system submits text (e.g. chat message, document, AI-generated answer).
  2. Your service calls Azure AI Content Safety with that text.
  3. Azure returns category labels, severity, and PII detections.
  4. Your policy engine decides how to handle it:
    • If low risk & no PII – allow.
    • If medium risk – allow but flag for audit or show a warning.
    • If high risk – block, redact, or route to human review.
  5. Log decisions for monitoring and tuning.

This separation of concerns is important. Azure AI Content Safety gives you an assessment, but you keep control over the final decision and user experience.

Setting up Azure AI Content Safety

To get started, you need an Azure subscription and a Content Safety resource.

Step 1 – Create the Content Safety resource

  1. In the Azure portal, search for Content Safety.
  2. Click Create.
  3. Select your subscription, resource group, region, and pricing tier.
  4. Once deployed, note the endpoint URL and key.

Step 2 – Choose SDK or REST

You can call Content Safety via REST or use an SDK (e.g. .NET, Python, JavaScript). Below is a Python example, which is concise and easy to experiment with.

Example: Detect and Flag Protected Text in Python

The following snippet shows how to send a piece of text to Azure AI Content Safety, inspect the response, and implement a simple policy.

This example does three things:

  • Calls the Content Safety API with the text and categories you care about.
  • Loops over category analyses and checks severities against your threshold.
  • Optionally inspects PII entities to understand exactly what was detected.
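Here is a minimal sketch using the azure-ai-contentsafety Python SDK. The endpoint and key environment variable names and the flagging threshold are assumptions to adapt to your own policy; the optional PII step can reuse the Language PII call sketched earlier.

```python
# Sketch: send text to Azure AI Content Safety and flag it against a severity threshold.
# pip install azure-ai-contentsafety
import os
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
)

FLAG_THRESHOLD = 2  # illustrative: severities at or above this level get flagged

def analyse_text(text: str) -> dict:
    """Call Content Safety and return a simple allow/flag decision."""
    request = AnalyzeTextOptions(
        text=text,
        categories=[
            TextCategory.HATE,
            TextCategory.VIOLENCE,
            TextCategory.SELF_HARM,
            TextCategory.SEXUAL,
        ],
    )
    response = client.analyze_text(request)

    flagged = []
    for analysis in response.categories_analysis:
        if analysis.severity is not None and analysis.severity >= FLAG_THRESHOLD:
            flagged.append({"category": analysis.category, "severity": analysis.severity})

    return {"flagged": bool(flagged), "details": flagged}

if __name__ == "__main__":
    print(analyse_text("I'm going to kill this presentation tomorrow."))
```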

Implementing practical policies

Flagging protected text is only useful if it feeds into practical, predictable policies. Here are some patterns that work well in production.

1. Multi-level decision model

Define three or four decision levels mapped from severity scores:

  • Allow – no categories triggered or only very low severity.
  • Allow + log – low to medium severity; keep an audit trail.
  • Flag for review – medium to high severity or PII present.
  • Block – high severity (e.g. explicit threats, extreme hate).

Implement this logic in a separate policy layer, so you can adjust thresholds without changing all your integrations.
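As an illustrative sketch of such a policy layer, assuming the 0–7 severity scale (the thresholds are assumptions, not recommendations):

```python
# Sketch of a standalone policy layer mapping severities and PII findings to decisions.
# Threshold values are illustrative; tune them for your own risk profile.
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    ALLOW_AND_LOG = "allow_and_log"
    FLAG_FOR_REVIEW = "flag_for_review"
    BLOCK = "block"

def decide(category_severities: dict, has_pii: bool) -> Decision:
    """category_severities example: {"Hate": 0, "Violence": 2, "SelfHarm": 0, "Sexual": 0}"""
    worst = max(category_severities.values(), default=0)
    if worst >= 6:
        return Decision.BLOCK
    if worst >= 3 or has_pii:
        return Decision.FLAG_FOR_REVIEW
    if worst >= 1:
        return Decision.ALLOW_AND_LOG
    return Decision.ALLOW

# Medium-severity content with no PII is allowed but kept in the audit trail
print(decide({"Hate": 0, "Violence": 2, "SelfHarm": 0, "Sexual": 0}, has_pii=False))
```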

2. Redacting instead of blocking

For PII or partial issues, full blocking is often overkill. A better approach is:

  • Redact only the sensitive spans (names, IDs, numbers).
  • Replace with placeholders such as [REDACTED_EMAIL].
  • Allow the rest of the content to flow through.

This keeps communication flowing while maintaining compliance.
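A small sketch of that substitution, assuming entities with offset, length, and category attributes as returned by the PII detection call shown earlier:

```python
# Sketch: replace detected PII spans with typed placeholders such as [REDACTED_EMAIL].
def redact(text: str, entities) -> str:
    # Work from the end of the string so earlier offsets stay valid after replacement
    for entity in sorted(entities, key=lambda e: e.offset, reverse=True):
        placeholder = f"[REDACTED_{entity.category.upper()}]"
        text = text[: entity.offset] + placeholder + text[entity.offset + entity.length :]
    return text
```

For example, "Contact jane@example.com" would come back as "Contact [REDACTED_EMAIL]", while the rest of the message flows through unchanged.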

3. Human-in-the-loop review

For higher-risk scenarios, use Azure AI to prioritise and route content to reviewers rather than auto-blocking everything. For example:

  • Queue only messages with severity >= 3 in any category.
  • Surface the model’s category and severity to help reviewers triage faster.
  • Feed reviewer decisions back into your policy tuning and analytics.
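A minimal triage sketch along these lines (the threshold and queue structure are assumptions):

```python
# Sketch: queue only high-severity items and surface category/severity for reviewers.
REVIEW_THRESHOLD = 3  # illustrative; matches the "severity >= 3" rule above

def triage(message_id: str, category_severities: dict, review_queue: list) -> None:
    worst_category, worst_severity = max(category_severities.items(), key=lambda kv: kv[1])
    if worst_severity >= REVIEW_THRESHOLD:
        review_queue.append({
            "message_id": message_id,
            "category": worst_category,   # shown to reviewers to speed up triage
            "severity": worst_severity,
        })

queue: list = []
triage("msg-42", {"Hate": 0, "Violence": 4, "SelfHarm": 0, "Sexual": 0}, queue)
print(queue)  # reviewers work this queue, highest severity first
```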

Common pitfalls and how to avoid them

Over-reliance on keywords

If you combine Azure AI with older keyword filters, you may end up double-blocking or triggering false positives. Prefer relying on the contextual classifier where possible, and use keyword rules only for very specific cases (such as internal codes).

Ignoring regional and domain context

Some language is benign in one industry or region but sensitive in another. Use different policies for different workloads. For example, a healthcare support chatbot may allow more medical detail than a public community forum.

Not monitoring outcomes

Always log decisions and review a sample regularly. Look for patterns:

  • Too many blocks for legitimate content? Loosen thresholds.
  • Incidents slipping through? Tighten or add review steps.

Where CloudProinc.com.au fits in

Integrating Azure AI Content Safety often touches multiple parts of your stack: front-end, APIs, data stores, and governance. CloudProinc.com.au typically helps customers by:

  • Designing end-to-end safety workflows tailored to their risk profile.
  • Implementing Azure AI Content Safety across chat, documents, and AI assistants.
  • Building policy engines that map model outputs to business rules.
  • Setting up monitoring and governance so compliance teams get the visibility they need.

The goal is to make safety and compliance a default feature of your digital products, not an afterthought that slows releases.

Getting started today

If you want to experiment with flagging protected text via Azure AI Content Safety, you can:

  • Spin up a Content Safety resource in Azure (often just minutes).
  • Use a simple script like the Python example above to score your existing content.
  • Prototype a basic allow/flag/block policy in a test environment.

From there, you can iterate quickly: refine thresholds, add PII redaction, and integrate into production workflows.

By combining Azure AI’s understanding of text with clear, well-tested policies, you can confidently flag protected content, reduce risk, and keep your platforms safe and productive for users.

