Firewall
The ai+me Firewall is a real-time security evaluation system that analyzes user requests to your AI applications and provides verdicts on their safety and appropriateness.
What is the ai+me Firewall?
The ai+me Firewall is an intelligent evaluation system that assesses user requests against your AI's defined scope and security policies. It acts as a security advisor that provides you with detailed verdicts, allowing you to make informed decisions about how to handle each request.
Key Concepts
Real-time Evaluation
- Instant Analysis: Evaluates requests in milliseconds
- LLM-as-Judge: Uses the same evaluation technology as experiments
- Continuous Monitoring: 24/7 evaluation for your AI applications
- Zero Downtime: Seamless integration without service interruption
Developer Control
- Verdict Provision: Provides detailed assessments, not automatic blocking
- Decision Support: Gives you the information needed to make informed choices
- Flexible Response: Allows you to implement your own response strategies
- Context Awareness: Considers conversation context for better accuracy
How the Firewall Works
The ai+me Firewall operates as an evaluation system that analyzes requests and provides detailed verdicts:
Firewall Architecture
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  User Request   │    │    Firewall     │    │      Your       │
│                 │───▶│   Evaluation    │───▶│   Application   │
│ • User Prompt   │    │                 │    │                 │
│ • Context       │    │ • LLM-as-Judge  │    │ • Process       │
│ • Metadata      │    │ • Safety Check  │    │   Verdict       │
│                 │    │ • Scope Analysis│    │ • Send Response │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │     Verdict     │
                       │    Response     │
                       │                 │
                       │ • Status        │
                       │ • Fail Category │
                       │ • Explanation   │
                       │ • Confidence    │
                       └─────────────────┘
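The verdict fields in the diagram can be modeled in application code as a small typed structure. The sketch below is only an illustration, not the SDK's own type: `status` and `fail_category` appear in the integration examples on this page, while the `explanation` and `confidence` key names, the numeric type of `confidence`, and the `Verdict` and `parse_verdict` names are assumptions based on the diagram labels.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Verdict:
    """Hypothetical container for a firewall verdict."""

    status: bool                  # True when the request passed evaluation
    fail_category: Optional[str]  # e.g. "off_topic", "violation", "restriction"
    explanation: Optional[str]    # assumed key name
    confidence: Optional[float]   # assumed key name and numeric type


def parse_verdict(payload: Mapping[str, Any]) -> Verdict:
    """Build a Verdict from a decoded JSON payload, tolerating missing optional keys."""
    if "status" not in payload:
        raise ValueError("verdict payload has no 'status' field")
    confidence = payload.get("confidence")
    return Verdict(
        status=bool(payload["status"]),
        fail_category=payload.get("fail_category"),
        explanation=payload.get("explanation"),
        confidence=float(confidence) if confidence is not None else None,
    )


verdict = parse_verdict({"status": False, "fail_category": "off_topic"})
print(verdict.fail_category)
```

Keeping the optional fields nullable means the parser keeps working if the real payload omits or renames them.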
Setting Up Your Firewall
Step 1: Get Integration Details
Click the settings (gear) icon in the upper right corner of the Firewall page to access:
- API endpoint for firewall integration
- Authentication credentials (API keys)
- Configuration settings
Step 2: Configure Integration
API Endpoint Setup
The firewall provides a dedicated endpoint for your application:
https://api.aiandme.io/firewall/{project-id}
Authentication
Use your project's API key for authentication:
Authorization: Bearer your-project-api-key
Step 3: Integrate with Your Application
Python Integration
# Import necessary libraries
from aiandme import (
    Firewall,
    AIANDME_Firewall_CannotDecide,
    AIANDME_Firewall_NotAuthorised,
)
from aiandme.schemas import Integration as IntegrationSchema

# Initialize the firewall with project credentials
fw = Firewall(
    IntegrationSchema(
        endpoint="https://api.aiandme.io/firewall/your-project-id",
        api_key="your-project-api-key",
    )
)


def handle_user_prompt(user_prompt, agent_prompt):
    """Evaluate a user prompt and return the reply to send back to the user."""
    try:
        response = fw.eval(user_prompt, agent_prompt)

        # You decide how to handle the verdict
        if response.status:
            # ✅ Safe prompt: process with your AI
            # (your_ai_system is a placeholder for your own AI client)
            return your_ai_system.process(user_prompt)

        # 🚫 Potentially unsafe prompt: you decide the response
        if response.fail_category == "off_topic":
            return "I'm sorry, but that's outside my area of expertise."
        elif response.fail_category == "violation":
            return "I cannot help with that request as it violates my guidelines."
        elif response.fail_category == "restriction":
            return "I'm not authorized to perform that action."
        else:
            return "I cannot process that request at this time."
    except AIANDME_Firewall_CannotDecide:
        # 🤔 Firewall uncertain: you decide how to handle
        return "I need to review your request. Please try again later."
    except AIANDME_Firewall_NotAuthorised:
        # ⚠️ Authentication failed: still return a reply so the caller never gets None
        print("Check your API credentials.")
        return "I'm experiencing technical difficulties."
    except Exception as e:
        # ❌ Unexpected error
        print(f"Firewall error: {e}")
        return "I'm experiencing technical difficulties."


# Evaluate a user prompt
reply = handle_user_prompt(
    # The user prompt that is being assessed
    "Request content for AI analysis...",
    # Optional: previous agent prompt for context
    "The previous prompt from the agent that the user is responding to.",
)
JavaScript/Node.js Integration
const axios = require("axios");

class AIandMeFirewall {
  constructor(projectId, apiKey) {
    this.endpoint = `https://api.aiandme.io/firewall/${projectId}`;
    this.apiKey = apiKey;
  }

  async evaluateRequest(userPrompt, agentPrompt = null) {
    try {
      const response = await axios.post(
        this.endpoint,
        {
          prompt: userPrompt,
          agent_prompt: agentPrompt,
        },
        {
          headers: {
            Authorization: `Bearer ${this.apiKey}`,
            "Content-Type": "application/json",
          },
        },
      );
      return response.data;
    } catch (error) {
      throw new Error(`Firewall evaluation failed: ${error.message}`);
    }
  }
}

// Usage
const firewall = new AIandMeFirewall("your-project-id", "your-api-key");

async function processUserRequest(userPrompt) {
  try {
    const verdict = await firewall.evaluateRequest(userPrompt);

    // You decide how to handle the verdict
    if (verdict.status) {
      // ✅ Safe request - process with your AI
      return await yourAISystem.process(userPrompt);
    } else {
      // 🚫 Potentially unsafe request - you choose the response
      switch (verdict.fail_category) {
        case "off_topic":
          return "That's outside my area of expertise.";
        case "violation":
          return "I cannot help with that request.";
        case "restriction":
          return "I'm not authorized for that action.";
        default:
          return "I cannot process that request.";
      }
    }
  } catch (error) {
    console.error("Firewall error:", error);
    return "I'm experiencing technical difficulties.";
  }
}
cURL Example
curl -X POST "https://api.aiandme.io/firewall/your-project-id" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "User request here",
    "agent_prompt": "Previous agent response (optional)"
  }'
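For environments where neither the SDK nor axios is available, the same call can be assembled with the Python standard library. This is a minimal sketch that mirrors the cURL example: the helper name `build_firewall_request` is hypothetical, and leaving out `agent_prompt` when no context is available is an assumption based on the field being described as optional. The request is built but not sent, so the headers and body can be inspected first.

```python
import json
import urllib.request
from typing import Optional


def build_firewall_request(
    project_id: str,
    api_key: str,
    prompt: str,
    agent_prompt: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = {"prompt": prompt}
    if agent_prompt is not None:
        # Assumption: the optional field can simply be left out of the body.
        body["agent_prompt"] = agent_prompt
    return urllib.request.Request(
        url=f"https://api.aiandme.io/firewall/{project_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_firewall_request("your-project-id", "your-api-key", "User request here")
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request, timeout=10)` and decode the response body with `json.load`.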
Understanding Firewall Verdicts
Verdict Categories
✅ Safe Requests (status: true)
- Definition: Requests that pass all security evaluations
- Action: You can safely process with your AI system
- Criteria: Aligns with business scope and allowed intents
🚫 Potentially Unsafe Requests (status: false)
Off-Topic Requests (fail_category: "off_topic")
- Definition: Requests outside your AI's defined scope
- Example: Asking a customer support bot for medical advice
- Your Decision: How to respond to out-of-scope requests
- Risk Level: Low - typically user confusion
Intent Violations (fail_category: "violation")
- Definition: Requests that violate your AI's allowed intents
- Example: Asking for unauthorized access to user data
- Your Decision: How to handle policy violations
- Risk Level: Medium - potential security concern
Restricted Actions (fail_category: "restriction")
- Definition: Requests for actions your AI is not authorized to perform
- Example: Attempting to modify system settings or access restricted data
- Your Decision: How to respond to unauthorized requests
- Risk Level: High - security violation
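The risk levels listed for the three fail categories can drive logging or alerting. The sketch below is one possible arrangement, not part of the ai+me SDK: the category-to-risk table restates the list above, while the mapping from risk to logging severity and the helper names `risk_for` and `log_failed_verdict` are assumptions.

```python
import logging

# Risk levels as described for each fail_category.
RISK_BY_CATEGORY = {
    "off_topic": "low",
    "violation": "medium",
    "restriction": "high",
}

# Hypothetical mapping from risk level to logging severity.
LOG_LEVEL_BY_RISK = {
    "low": logging.INFO,
    "medium": logging.WARNING,
    "high": logging.ERROR,
}


def risk_for(fail_category):
    """Return the documented risk level, or "unknown" for an unrecognized category."""
    return RISK_BY_CATEGORY.get(fail_category, "unknown")


def log_failed_verdict(logger, fail_category):
    """Log a failed verdict at a severity matching its risk level and return the risk."""
    risk = risk_for(fail_category)
    level = LOG_LEVEL_BY_RISK.get(risk, logging.WARNING)
    logger.log(level, "firewall verdict failed: category=%s risk=%s", fail_category, risk)
    return risk


logging.basicConfig(level=logging.INFO)
log_failed_verdict(logging.getLogger("firewall"), "restriction")
```

Unrecognized categories fall back to a warning, so a category added later is still logged and not silently dropped.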
🤔 Uncertain Requests (AIANDME_Firewall_CannotDecide)
- Definition: Requests where the firewall cannot make a clear determination
- Your Decision: Whether to process, block, or flag for manual review
- Action: Implement your own uncertainty handling strategy
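The three options named above (process, block, or flag for manual review) can be expressed as an explicit policy. The following sketch is illustrative only: `CannotDecide` is a local stand-in for `AIANDME_Firewall_CannotDecide` so that the example runs without the SDK, and `UncertainPolicy`, `review_queue`, and `is_allowed` are hypothetical names.

```python
from enum import Enum
from typing import Callable, List


class CannotDecide(Exception):
    """Stand-in for AIANDME_Firewall_CannotDecide so the sketch runs without the SDK."""


class UncertainPolicy(Enum):
    PROCESS = "process"  # fail open: forward the prompt anyway
    BLOCK = "block"      # fail closed: refuse the prompt
    REVIEW = "review"    # refuse now and queue the prompt for manual review


# Hypothetical in-memory queue; a real system would use durable storage.
review_queue: List[str] = []


def is_allowed(prompt: str, evaluate: Callable[[str], bool], policy: UncertainPolicy) -> bool:
    """Return True if the prompt should be forwarded to the AI system."""
    try:
        return evaluate(prompt)
    except CannotDecide:
        if policy is UncertainPolicy.PROCESS:
            return True
        if policy is UncertainPolicy.REVIEW:
            review_queue.append(prompt)
        return False


def always_uncertain(prompt: str) -> bool:
    """Test double that simulates a firewall unable to reach a verdict."""
    raise CannotDecide()


print(is_allowed("ambiguous request", always_uncertain, UncertainPolicy.REVIEW))
print(review_queue)
```

Failing closed (`BLOCK` or `REVIEW`) is the more conservative default; `PROCESS` trades safety for availability and is probably only appropriate for low-risk applications.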