Demajh, Inc.

LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators: what it means for business leaders

LionGuard 2 lets organisations moderate English, Mandarin, Malay, Tamil and colloquial slang in real time on commodity CPUs, cutting GPU spend and compliance risk while delivering millisecond-level safety for chatbots, forums and live streams.

1. What the method is

LionGuard 2 is a multilingual content-safety stack that embeds each message with a frozen text-embedding-3-large model and feeds the vector into a one-megabyte, fourteen-head ordinal classifier tuned for Singapore harm categories. The tiny head runs entirely on CPUs, retrains overnight from new policy examples and delivers end-to-end moderation in about sixty milliseconds per request.
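To make the two-stage design concrete, here is a minimal Python sketch: the embedding call uses OpenAI's text-embedding-3-large, which the text names, while the classifier head, its categories and its threshold are illustrative placeholders rather than the paper's trained one-megabyte head.

```python
# Sketch of the embed-then-classify flow; the head below is a random-weight
# placeholder standing in for the trained one-megabyte classifier.
import numpy as np
from openai import OpenAI

CATEGORIES = ["hateful", "harassment", "sexual", "violent"]  # illustrative subset

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    """Frozen embedding step: one API call returning a 3,072-dim vector."""
    resp = client.embeddings.create(model="text-embedding-3-large", input=[text])
    return np.asarray(resp.data[0].embedding, dtype=np.float32)

# Placeholder head: one linear layer plus a sigmoid per category.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(len(CATEGORIES), 3072))
b = np.zeros(len(CATEGORIES))

def moderate(text: str, threshold: float = 0.5) -> dict:
    vec = embed(text)
    scores = 1.0 / (1.0 + np.exp(-(W @ vec + b)))  # per-category probabilities
    flagged = {c: float(s) for c, s in zip(CATEGORIES, scores) if s >= threshold}
    return {"action": "block" if flagged else "allow", "scores": flagged}

print(moderate("this is a test message"))
```

Because the embedding model stays frozen, only the tiny head needs retraining when policy examples change, which is what keeps the overnight refresh cycle and CPU-only serving practical.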

2. Why the method was developed

Off-the-shelf moderation APIs miss Singlish, code-mixed profanity and regional slurs, while full LLM guardrails demand GPUs and sensitive fine-tuning data. The authors built LionGuard 2 to prove that strong multilingual embeddings, localised taxonomies and lightweight heads can match heavyweight models, giving public agencies and startups an affordable, privacy-preserving alternative.

3. Who should care

Trust-and-safety leaders expanding into Southeast Asia, product managers adding multilingual generative AI, and CIOs of public institutions who must enforce uniform safeguards across hundreds of chat interfaces within fixed CPU budgets will all find LionGuard 2 immediately actionable.

4. How the method works

Text is cleaned, truncated and embedded once; the 3,072-dimensional vector passes through two shared dense layers before fourteen category-specific heads output ordinal severity scores. Thresholding maps the scores to allow, transform or block actions that are returned to the caller. Because the embedding is reused across heads, hybrid phrases like “walao eh you #%$*&” are flagged correctly without language detection or translation.
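The sketch below shows that shared-trunk, multi-head layout in PyTorch. The hidden width, the number of ordinal severity levels and the worst-category action rule are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative shared-trunk, multi-head classifier; hidden sizes, severity
# levels and action thresholds are assumed, not taken from the paper.
import torch
import torch.nn as nn

NUM_HEADS, NUM_LEVELS = 14, 4  # 14 harm categories, ordinal severity 0..3 (assumed)

class GuardHead(nn.Module):
    def __init__(self, dim_in: int = 3072, hidden: int = 256):
        super().__init__()
        # Two shared dense layers over the frozen 3,072-d embedding.
        self.trunk = nn.Sequential(
            nn.Linear(dim_in, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One small ordinal head per harm category.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, NUM_LEVELS) for _ in range(NUM_HEADS)]
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        h = self.trunk(emb)
        # Shape: (batch, NUM_HEADS, NUM_LEVELS) severity logits.
        return torch.stack([head(h) for head in self.heads], dim=1)

def decide(logits: torch.Tensor) -> str:
    """Map per-category severities to an action (thresholds are illustrative)."""
    severity = logits.softmax(-1).argmax(-1)  # predicted level per category
    worst = int(severity.max())
    return ["allow", "allow", "transform", "block"][worst]

model = GuardHead()
emb = torch.randn(1, 3072)  # stand-in for a text-embedding-3-large vector
print(decide(model(emb)[0]))
```

A head this small fits comfortably in a megabyte and evaluates in microseconds, so the roughly sixty-millisecond end-to-end latency is dominated by the embedding step rather than classification.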

5. How it was evaluated

The team tested on seventeen datasets, including the RabakBench code-mix benchmark, MLCommons safety sets and a 6,000-item red-team corpus written by native speakers. Baselines were commercial APIs, LlamaGuard-3-8B, ShieldGemma-2B and LionGuard 1. Metrics covered macro-F1, per-category F1, adversarial robustness and single-core CPU throughput.
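For readers unfamiliar with the headline metric, the toy example below shows how macro-F1 weights every category equally; the labels are invented and unrelated to the benchmark data.

```python
# Toy illustration of macro-F1; the labels below are made up.
from sklearn.metrics import f1_score

y_true = ["safe", "hateful", "safe", "sexual", "hateful", "safe"]
y_pred = ["safe", "hateful", "hateful", "sexual", "safe", "safe"]

# Macro-F1 averages per-category F1 scores equally, so rare harm categories
# count as much as the dominant "safe" class.
print(f1_score(y_true, y_pred, average="macro"))
print(f1_score(y_true, y_pred, average=None, labels=["safe", "hateful", "sexual"]))
```

Equal weighting matters here because harmful messages are rare relative to benign traffic, and a moderator judged only on overall accuracy could ignore them entirely.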

6. How it performed

LionGuard 2 outperformed the best open-source guardrail by nine macro-F1 points and reduced Singapore hate-speech false negatives by 31 %. Running on one Xeon core, it moderated about 300 tokens per second and now powers GovTech’s real-time AI Guardian service. (Source: arXiv 2507.15339, 2025)
