Taxonomy authoring
Your harms taxonomy tells Backstop what to alert on. It is a small YAML document that Backstop turns into a system prompt for the vision-capable LLM.
Start with a preset
The parent web app ships with three presets:
- Standard 6-8 — conservative rules for younger children.
- Standard 8-12 — the default.
- Standard 13-17 — more nuanced, treats a lot of content as “context, not alert.”
Pick one, then edit.
Structure
categories:
  - id: bullying
    label: Bullying or harassment
    severity_bucket: 1
    description: |
      Messages that appear to threaten, demean, or exclude a specific person
      by name or handle, sent to or received by my child.
    examples_positive:
      - 'message calling someone a slur'
      - 'coordinating exclusion in a group chat'
    examples_negative:
      - 'sarcastic banter between mutual friends'
  - id: self_harm
    label: Self-harm ideation
    severity_bucket: 1
    description: |
      Content that suggests my child or someone they're talking to is thinking
      about hurting themselves.
  - id: adult_content
    label: Sexual content
    severity_bucket: 1
    description: |
      Explicit sexual imagery or text on screen.
Fields
- id: slug used in the alert record. Stable; don’t rename after alerts have fired.
- label: the human-readable name that appears in your alerts.
- severity_bucket: 0 for “notify eventually” or 1 for “notify now.” See below.
- description: the rule. Write it as a plain-English sentence. This is what the LLM matches against.
- examples_positive / examples_negative: optional. Anchor the LLM’s judgment on your context.
Severity buckets
Backstop deliberately supports only two buckets so the control plane can route alerts without seeing content:
- 0: delivered to your default channel (usually the parent web app), included in daily digests.
- 1: delivered immediately to your urgent channel (push, SMS).
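The sample taxonomy under Structure uses bucket 1 for every category, so here is a minimal sketch of a bucket 0 entry. The category itself (strong_language) is hypothetical and not part of any preset; only the field names come from this page.

```yaml
categories:
  # Hypothetical category, shown only to illustrate bucket 0.
  - id: strong_language
    label: Strong language
    severity_bucket: 0   # notify eventually: default channel and daily digest
    description: |
      Repeated profanity in messages my child sends or receives.
```

Bucket 0 fits categories you want to know about but would not want to be interrupted for.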
Writing good descriptions
- Be concrete about your family. “My child is 11 and plays Minecraft” is real context; the LLM will use it.
- Use examples if the category is fuzzy. Two examples_positive and two examples_negative beat a paragraph of hedging.
- Prefer inclusion over exclusion. “Flag anything that looks like X” works better than “flag everything except Y.”
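As an illustration of the three tips together, a category might look like the sketch below. The category, its wording, and its examples are hypothetical; treat it as a starting shape rather than a recommended rule.

```yaml
categories:
  # Hypothetical category; adjust every detail to your own family.
  - id: stranger_contact
    label: Contact from strangers
    severity_bucket: 1
    description: |
      My child is 11 and plays Minecraft. Flag any chat message in which
      someone asks for their real name, address, school, or photos, or asks
      them to move the conversation to another app.
    examples_positive:
      - 'player asking which school my child goes to'
      - 'player asking to continue chatting on a different app'
    examples_negative:
      - 'classmate my child already knows sharing a server address'
      - 'in-game trade offers with no personal questions'
```

The description states the family context first, is phrased as “flag anything that looks like X,” and leaves the fuzzy boundary to two positive and two negative examples.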
Testing
The parent app has a Taxonomy tester. Paste text or upload a sample screenshot, and it runs your current taxonomy locally through your bring-your-own-key (BYOK) LLM. Iterate here before pushing to endpoints.
Publishing
Save changes in the parent app. The app encrypts the taxonomy under your family key and sends the ciphertext to the control plane, which relays it to your enrolled endpoints on their next config sync (within a few minutes).