Customize Feedback Behavior

This is in Public Beta for Enterprise plans.

Protege Policy Feedback helps streamline policy compliance by identifying potentially risky content and allowing you to take action. Here’s how to manage feedback cards and train your AI model for greater precision:

AI-Powered Feedback Card Actions

When a policy flags text, you have eight primary actions for handling the feedback, whether the flag is correct, out of scope for a specific context, or misclassified:

Resolve (Dismiss feedback)

Removes this specific feedback instance and prevents the exact same text from being flagged again on this page or document. Use this when the flagged content is acceptable for a specific context or time period.

If "free AirPods with every purchase" appears in ad copy during a legitimate limited-time promotion, resolving it acknowledges the content is appropriate for now.

👍 Thumbs Up

Retrains the model to continue flagging this type of content more consistently. Use when the AI is under-flagging similar items and you want it to learn to catch them reliably.

If the model inconsistently identifies free item language, giving a thumbs up to correctly flagged examples helps improve the model's detection accuracy.

🔘 Acknowledge

Assigns you as the reviewing user on this feedback card while keeping the item visible in the Content Scanner until you or another user clicks Resolve. Use this when additional review or confirmation is needed before taking action.

If a reviewer needs to verify details like the expiration date of a promotion stating "free AirPods with every purchase until 7/31," you can acknowledge it and return later with complete information.

🚫 Ignore Phrasing for Everything

Retrains the model to ignore this specific phrasing for this policy across the entire environment through fine-tuning. Use when the flagged content is clearly not a violation of the intended policy.

If "low cost platform" was flagged under a free items policy, this action teaches the model that such phrasing doesn't constitute free item language.

🚫 Ignore Phrasing for Brand

Ignores all instances of this exact phrasing for this policy across any of the brand's marketing content using text-search filtering. Use for permanent brand-specific language that should never be flagged.

If a financial services brand offers "free wire transfers" as a standard service feature, this action prevents future flags of that exact phrase.

🚫 Ignore Policy for Brand

Suppresses all future flags from this policy for the specified brand only. Use when an entire policy doesn't apply to a particular brand's business model.

If your Protege environment manages multiple brands and one brand's entire product offering is free, this action disables the policy for that brand specifically. If you have a single brand in Protege, you would disable the Policy in the Policy Engine instead.

🚫 Ignore Policy for Page

Suppresses all future flags from this policy for this specific page or document only.

Use when a policy should never run on certain page types again. This is commonly applied to Terms of Service and Privacy Policy pages where policy flags are typically not relevant to the content's purpose.

🚫 Wrong Policy

Indicates that the flagged text actually falls under a different policy category. Retrains the model to learn proper policy-to-text mappings and improve classification accuracy.

If the free items policy flagged "highest payout sports betting platform in the USA," this action would reclassify it under unqualified claims policy instead, since it's not about free offerings but rather unsubstantiated comparative claims.
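Taken together, each action above combines a scope (what it affects) with a mechanism (how it takes effect). The following is an illustrative sketch of that mapping; the action names come from this page, but the `FeedbackAction` structure and its field values are hypothetical, not Protege's actual API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeedbackAction:
    """Illustrative model of a feedback card action (not the real Protege API)."""
    name: str
    scope: str       # what the action affects
    mechanism: str   # how it takes effect

ACTIONS = [
    FeedbackAction("Resolve", "this exact text on this page", "dismissal"),
    FeedbackAction("Thumbs Up", "similar content everywhere", "fine-tuning (reinforce flagging)"),
    FeedbackAction("Acknowledge", "this feedback card", "assignment (visible until resolved)"),
    FeedbackAction("Ignore Phrasing for Everything", "this phrasing, entire environment", "fine-tuning"),
    FeedbackAction("Ignore Phrasing for Brand", "this exact phrasing, one brand", "text-search filter"),
    FeedbackAction("Ignore Policy for Brand", "all flags from this policy, one brand", "suppression"),
    FeedbackAction("Ignore Policy for Page", "all flags from this policy, one page", "suppression"),
    FeedbackAction("Wrong Policy", "policy-to-text mapping", "fine-tuning (reclassify)"),
]

for a in ACTIONS:
    print(f"{a.name}: affects {a.scope} via {a.mechanism}")
```

One practical takeaway from this grouping: fine-tuning actions teach the model and generalize to similar content, while the text-search filter and suppression actions only match exactly, so prefer the narrowest action that solves your problem.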

Below is a visual example showing where each action button appears on the Feedback Card. The labels correspond to the actions described above:

Add New (Manually add missing feedback)

Allows users to explicitly enter missing feedback (non-flagged examples) as Teach your AI submissions, improving the model's detection accuracy.

Teach your AI Admin Approvals

By default, every Teach your AI submission enters a pending state and must be manually approved through the Policy Engine before being used for model training. This creates a human-in-the-loop validation step that maintains training data quality.

Approval Workflow

  1. Submission: When users submit a Teach your AI example, it is automatically queued for review

  2. Review: Designated approvers can access pending submissions in the Policy Engine

  3. Decision: Approvers can take one of three actions:

    • Approve: Accept the submission for model training

    • Edit: Modify the submission before approval

    • Delete: Remove the submission entirely

  4. Training: Only approved submissions are included in the model finetuning process
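The four steps above amount to a small state machine over submissions: pending until a decision, with only approved items reaching training. Here is a minimal sketch of that lifecycle; the `TeachSubmission` class and function names are hypothetical illustrations, not Protege's implementation:

```python
from dataclasses import dataclass

@dataclass
class TeachSubmission:
    """Illustrative Teach your AI submission (field names are hypothetical)."""
    instruction: str
    expected_output: str
    status: str = "pending"   # pending -> approved | deleted

def approve(sub: TeachSubmission) -> None:
    # Accept the submission for the next fine-tuning run.
    sub.status = "approved"

def edit(sub: TeachSubmission, instruction=None, expected_output=None) -> None:
    # Modify the submission; it keeps its place in the approval queue.
    if instruction is not None:
        sub.instruction = instruction
    if expected_output is not None:
        sub.expected_output = expected_output

def delete(sub: TeachSubmission) -> None:
    # Permanently exclude the submission from training.
    sub.status = "deleted"

def training_set(queue: list) -> list:
    # Only approved submissions are included in fine-tuning.
    return [s for s in queue if s.status == "approved"]

queue = [TeachSubmission("Flag free-item language", "VIOLATION"),
         TeachSubmission("Duplicate example", "VIOLATION")]
approve(queue[0])
delete(queue[1])
print([s.instruction for s in training_set(queue)])
# -> ['Flag free-item language']
```

Note that editing does not change the status: an edited submission still waits for an explicit Approve, preserving the human-in-the-loop step.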

Managing Submissions

Editing Submissions

  • Click on any pending submission to open the editor

  • Modify the instruction text, expected output, or formatting

  • Save changes to update the submission while maintaining its place in the approval queue

Deleting Submissions

  • Select submissions that are inappropriate, duplicate, or no longer needed

  • Deleted submissions are permanently removed and will not be used for training

Auto-Approval Feature

3-Day Auto-Approval

For teams that need faster iteration cycles, you can enable automatic approval after a 3-day waiting period:

  1. Navigate to Policy Engine settings

  2. Enable "Auto-approve Teach your AI after 3 days"

  3. Configure notification preferences for pending approvals

How Auto-Approval Works:

  • Submissions remain in pending status for 3 days

  • If no manual action is taken within this timeframe, they are automatically approved

  • Manual approval or rejection can still occur at any point during the 3-day window

  • Auto-approved submissions are logged for audit purposes
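The auto-approval rule above can be sketched as a simple time check in which any manual decision made during the window takes precedence. This is a hypothetical sketch with made-up field names, not the product's actual logic:

```python
from datetime import datetime, timedelta

AUTO_APPROVE_AFTER = timedelta(days=3)

def effective_status(status: str, submitted_at: datetime,
                     now: datetime, auto_approve_enabled: bool) -> str:
    """Return a submission's status, applying 3-day auto-approval.

    A manual approval or rejection during the window wins: once the
    submission is no longer 'pending', the timer is irrelevant.
    """
    if status != "pending":
        return status
    if auto_approve_enabled and now - submitted_at >= AUTO_APPROVE_AFTER:
        return "approved"  # auto-approvals would also be logged for audit
    return "pending"

now = datetime(2024, 7, 10)
print(effective_status("pending", datetime(2024, 7, 6), now, True))   # approved (4 days old)
print(effective_status("pending", datetime(2024, 7, 9), now, True))   # pending (1 day old)
print(effective_status("deleted", datetime(2024, 7, 1), now, True))   # deleted (manual action wins)
```

With the setting disabled, submissions stay pending indefinitely until an approver acts, which is the default behavior described earlier.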

Key Notes

  • Fine-tuning Beta: The “Thumbs Up” (👍), Wrong Policy, and Ignore Phrasing for Everything workflows rely on large language model fine-tuning.

  • Audit Trail: Every action (Resolve, Acknowledge, Wrong Policy, Thumbs Up, Ignore Phrasing/Policy) is recorded in the review history for compliance auditing.
