Google Safety/Harm Settings

Google has built-in response safety features. These can be adjusted by adding a specific section to the LLM Service body definition for Google vendor services defined in LLMAsAService.io.

How to configure in a service

In the request body template, add the following section, setting the levels using the values shown below. In this example, the policy is relaxed so that blocking triggers only on high-confidence detections -

{
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_ONLY_HIGH"
    },
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "threshold": "BLOCK_ONLY_HIGH"
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "threshold": "BLOCK_ONLY_HIGH"
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_ONLY_HIGH"
    }
  ]
}
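
For reference, here is a minimal sketch of how that section could sit alongside the rest of a Gemini generateContent request body. The contents and generationConfig values and the {{prompt}} placeholder are illustrative assumptions, not the exact LLMAsAService.io template syntax -

{
  "contents": [
    { "role": "user", "parts": [{ "text": "{{prompt}}" }] }
  ],
  "generationConfig": { "temperature": 0.7 },
  "safetySettings": [
    { "category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH" },
    { "category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_ONLY_HIGH" },
    { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_ONLY_HIGH" },
    { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH" }
  ]
}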

Categories

The categories used for response analysis have the following keys (an example of how they appear in a blocked response follows the list) -

  • HARM_CATEGORY_HARASSMENT - Negative or harmful comments targeting identity and/or protected attributes.

  • HARM_CATEGORY_HATE_SPEECH - Content that is rude, disrespectful, or profane.

  • HARM_CATEGORY_SEXUALLY_EXPLICIT - Contains references to sexual acts or other lewd content.

  • HARM_CATEGORY_DANGEROUS_CONTENT - Promotes, facilitates, or encourages harmful acts.
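
When a response is blocked, Google's API reports which of these categories triggered the block. A rough sketch of a blocked response, using the field names from the public Gemini API (the exact payload surfaced through LLMAsAService.io may differ) -

{
  "candidates": [
    {
      "finishReason": "SAFETY",
      "safetyRatings": [
        { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "MEDIUM" },
        { "category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE" },
        { "category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE" },
        { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE" }
      ]
    }
  ]
}

The probability values (negligible, low, medium, high) correspond to the detection levels described in the next section.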

Levels

The detection levels are negligible, low, medium, and high. In our testing, the default when a threshold is not specified is medium. We have had business cases where that was too limiting for our context: we weren't taking user prompts, we were asking for silly business ideas, and that sometimes triggered the harassment or dangerous-content categories at medium.

The supported threshold values are specified per category as one of the following (see the example after the list) -

  • BLOCK_LOW_AND_ABOVE - blocks content when the detected level is low, medium, or high.

  • BLOCK_MEDIUM_AND_ABOVE - blocks at medium or high (the default behavior we observed).

  • BLOCK_ONLY_HIGH - blocks only at high.

  • BLOCK_NONE - never blocks, regardless of the detected level.
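
Thresholds do not have to be the same for every category. As a hypothetical mix for the scenario above, you could relax only the harassment and dangerous-content categories and leave the others at the medium level -

{
  "safetySettings": [
    { "category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH" },
    { "category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH" },
    { "category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE" },
    { "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE" }
  ]
}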

