Moderation that is fast, multilingual, and explains itself.
95% of traffic resolved in under 10ms by a cheap classifier. Ambiguous cases get escalated to a local LLM for nuanced judgment. Scores, reasoning, and model version on every call.
REQS/S
847
P95
8ms
BLOCKED
2.4%
<10ms
P95 Latency
Tier 1
7
Categories
Scored
4
Inference Tiers
Cascading
99.9%
Uptime SLA
Monthly
Four Tiers. One API Call.
95% of traffic never touches the expensive model. Only genuinely ambiguous cases get escalated — giving you Hive-level accuracy at 1/10th the compute cost.
Deterministic Shortcut
<1ms · ~60% of traffic
Unicode normalization, leet-speak expansion, emoji mapping, hard blocklist, and response caching. Bypasses GPU entirely.
Fast Classifier
5–15ms · ~35% of traffic
multilingual-toxic-xlm-roberta, INT8 quantized on ONNX Runtime. 7 sigmoid heads — hate, harassment, sexual, self-harm, violence, spam, threat. 8 languages native.
Main Model
15–30ms · ~4% of traffic
Multilingual E5-small + multi-head classifier. Context-aware — handles "kill this level" vs "kill you".
LLM Judge
100–250ms · ~1% of traffic
Llama 3.1 8B INT4 via vLLM. Structured JSON output with grammar-constrained decoding. Handles conversation context for grooming detection.
Built Different. Priced Fair.
Six capabilities that separate HushSafe from every incumbent — with transparent pricing on each.
Conversation Context
Send last N messages for grooming escalation and evolving harassment detection. Hive can't do this.
Custom Vocabulary
Upload platform-specific phrases — gaming terms, clinical language — as embeddings injected into inference. 30-minute turnaround.
Per-Customer Fine-Tuning
Your feedback trains a LoRA adapter (~4MB) hot-swapped at inference. Your content, your calibration, same API.
Explanation-Grade Output
Every decision includes machine-readable reason + plain-English explanation. Critical for support, appeals, and compliance.
Data-Residency Toggle
Single API flag: "region": "eu" | "us" | "in". Routes to that region's model. Data never leaves that geography.
Open Eval Harness
Published accuracy benchmarks on public leaderboard. Monthly re-runs on golden set + HateCheck. Hive never publishes numbers.
{
"request_id": "req_01HX...",
"decision": "BLOCK",
"scores": {
"hate": 0.12,
"harassment": 0.78,
"sexual": 0.03,
"self_harm": 0.02,
"violence": 0.08,
"spam": 0.01,
"threat": 0.44
},
"primary_category": "harassment",
"confidence": 0.78,
"tier_reached": 2,
"model_version": "tier2-v4.2.1",
"explanation": "High confidence for harassment category. Pattern consistent with targeted personal attack.",
"latency_ms": 18,
"billable": true
}Every decision explains itself.
Explanation Field
Plain-English reasoning for every decision. Template-based at Tier 2, LLM-generated at Tier 3. Worth 2x the price for any team answering "why was my message blocked?" tickets.
Full Transparency
Scores for all 7 categories, the exact model version, tier reached, and latency — all in one response. Hive returns only numbers. You return numbers + reasoning.
Pay for what you use. Nothing more.
60% of traffic never touches GPU. That cost structure is your savings.
Be the first to moderate smarter.
Join 47 teams on the design partner waitlist. Early access, co-shaped pricing.
Gaming platform
12M messages/day, currently on Hive
Need: Cost reduction + subtle toxicity catch rate
Mental health app
Needs self-harm detection in 6 languages
Need: Multilingual + nuanced context understanding
B2B SaaS
EU data residency + compliance audit trail
Need: GDPR-compliant, per-request reasoning output
The architecture is the advantage.
10× cheaper than Hive
60% of traffic bypasses GPU entirely. 35% hits the cheapest model. Your cost floor is $0.00004 average per request.
Multilingual by default
Multilingual E5-small handles 8+ languages natively. No per-language model required — one pipeline for global content.
Explains every decision
Scores, reasoning, model version, and latency on every call. Your support team and compliance auditors will thank you.
Improves with your feedback
Customer feedback loop auto-trains LoRA adapters weekly. Your data, your calibration — deployed via canary rollout.
Start moderating in 5 minutes.
One API endpoint. Seven category scores. Plain-English explanations. Free to start.