6.7. Using CRATE’s anonymisation API web server

The CRATE anonymisation API uses the CRATE web interface. See Configuring the CRATE web interface and Using the CRATE web interface.

You can access the anonymisation API menu at /anon_api/. You do not need to be logged in.

The API endpoint is at /anon_api/scrub/.
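
As a minimal sketch of calling the endpoint, the following uses only the Python standard library. The base URL (http://localhost:8000) is a placeholder and the helper names build_scrub_request and scrub are illustrative; neither comes from CRATE itself. The payload uses only fields described in the API documentation.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the address of your own CRATE web server.
BASE_URL = "http://localhost:8000"


def build_scrub_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the scrub endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/anon_api/scrub/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def scrub(payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_scrub_request(payload, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Example payload: one line of text, keyed on a caller-chosen ID.
example_payload = {
    "text": {"note1": "Contact me at someone@example.com"},
    "scrub_all_email_addresses": True,
}
request = build_scrub_request(example_payload)
```

Calling scrub(example_payload) against a running server would then return the decoded response.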

6.8. Anonymisation API documentation

CRATE API (0.0.1)

Clinical Records Anonymisation and Text Extraction (CRATE)

schema

schema_retrieve

OpenAPI 3 schema for this API. The format can be selected via content negotiation.

  • YAML: application/vnd.oai.openapi
  • JSON: application/vnd.oai.openapi+json
query Parameters
format
string
Enum: "json" "yaml"
lang
string
Enum: "af" "ar" "ar-dz" "ast" "az" "be" "bg" "bn" "br" "bs" "ca" "cs" "cy" "da" "de" "dsb" "el" "en" "en-au" "en-gb" "eo" "es" "es-ar" "es-co" "es-mx" "es-ni" "es-ve" "et" "eu" "fa" "fi" "fr" "fy" "ga" "gd" "gl" "he" "hi" "hr" "hsb" "hu" "hy" "ia" "id" "ig" "io" "is" "it" "ja" "ka" "kab" "kk" "km" "kn" "ko" "ky" "lb" "lt" "lv" "mk" "ml" "mn" "mr" "my" "nb" "ne" "nl" "nn" "os" "pa" "pl" "pt" "pt-br" "ro" "ru" "sk" "sl" "sq" "sr" "sr-latn" "sv" "sw" "ta" "te" "tg" "th" "tk" "tr" "tt" "udm" "uk" "ur" "uz" "vi" "zh-hans" "zh-hant"
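
The two ways of choosing the schema format (the Accept header or the format query parameter) can be sketched as below. The schema URL is not stated in this documentation, so it is passed in by the caller; the value used in the example is a placeholder, and build_schema_request is an illustrative helper, not part of CRATE.

```python
import urllib.parse
import urllib.request


def build_schema_request(
    schema_url: str, as_json: bool = True, use_query: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the OpenAPI schema."""
    if use_query:
        # Select the format with the "format" query parameter.
        query = urllib.parse.urlencode({"format": "json" if as_json else "yaml"})
        return urllib.request.Request(f"{schema_url}?{query}", method="GET")
    # Select the format by content negotiation via the Accept header.
    media_type = (
        "application/vnd.oai.openapi+json" if as_json else "application/vnd.oai.openapi"
    )
    return urllib.request.Request(schema_url, headers={"Accept": media_type}, method="GET")


# Placeholder URL (an assumption): replace with the schema URL of your server.
SCHEMA_URL = "http://localhost:8000/anon_api/schema/"
yaml_request = build_schema_request(SCHEMA_URL, as_json=False)
json_request = build_schema_request(SCHEMA_URL, as_json=True, use_query=True)
```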

Responses

Response Schema:
property name*
any

scrub

scrub_create

Main CRATE anonymisation end-point.

Request Body schema: application/json
text
required
object

The lines of text to be scrubbed, each keyed on a unique ID supplied by the caller.

patient
object

Specific patient data to be scrubbed.

third_party
object

Third party (e.g. family members') data to be scrubbed.

anonymise_codes_at_word_boundaries_only
boolean
Default: true

Ensure the codes to be scrubbed begin and end with a word boundary.

anonymise_dates_at_word_boundaries_only
boolean
Default: true

Ensure the dates to be scrubbed begin and end with a word boundary.

anonymise_numbers_at_word_boundaries_only
boolean
Default: false

Ensure the numbers to be scrubbed begin and end with a word boundary.

anonymise_numbers_at_numeric_boundaries_only
boolean
Default: true

Ensure the numbers to be scrubbed begin and end with a numeric boundary.

anonymise_strings_at_word_boundaries_only
boolean
Default: true

Ensure the strings to be scrubbed begin and end with a word boundary.
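
The difference between a word boundary and a numeric boundary can be illustrated with Python's re module. This is a sketch of the two concepts only, under the assumption that a numeric boundary means "not adjacent to another digit"; it does not reproduce the regular expressions CRATE builds internally, and number_pattern is a hypothetical helper.

```python
import re


def number_pattern(number: str, word_boundaries: bool, numeric_boundaries: bool) -> re.Pattern:
    """Build an illustrative regex for one number under the two boundary options."""
    core = re.escape(number)
    if word_boundaries:
        # \b: the match must not touch an adjacent letter, digit or underscore.
        return re.compile(rf"\b{core}\b")
    if numeric_boundaries:
        # The match must not touch an adjacent digit, but may touch letters.
        return re.compile(rf"(?<!\d){core}(?!\d)")
    # No boundary requirement: match the digits anywhere.
    return re.compile(core)


text = "Ref AB123456 and 91234567"
numeric_only = number_pattern("123456", word_boundaries=False, numeric_boundaries=True)
word_only = number_pattern("123456", word_boundaries=True, numeric_boundaries=False)

numeric_matches = numeric_only.findall(text)  # matches inside "AB123456" only
word_matches = word_only.findall(text)        # no match: the number touches letters or digits
```

With the documented defaults for numbers (word boundaries off, numeric boundaries on), a number embedded in a code such as "AB123456" is still found, while the same digits inside a longer run of digits are left alone.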

string_max_regex_errors
integer
Default: 0

The maximum number of typographical insertion/deletion/substitution errors to permit.

min_string_length_for_errors
integer
Default: 3

The minimum string length at which typographical errors will be permitted.
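
To make "insertion/deletion/substitution errors" concrete, the sketch below counts them as a Levenshtein edit distance and applies the two options above. It illustrates the idea only; how CRATE actually performs approximate matching is not described here, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution (or exact match)
            ))
        previous = current
    return previous[-1]


def tolerated(
    target: str, candidate: str, max_errors: int = 0, min_length_for_errors: int = 3
) -> bool:
    """Would `candidate` count as `target` under the two options? (Illustrative only.)"""
    # Strings shorter than the minimum length must match exactly.
    allowed = max_errors if len(target) >= min_length_for_errors else 0
    return edit_distance(target, candidate) <= allowed
```

For example, tolerated("Smith", "Smyth", max_errors=1) is true because one substitution separates the two strings, whereas with the default of 0 errors only an exact match is accepted.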

min_string_length_to_scrub_with
integer
Default: 2

Do not scrub strings shorter than this length.

scrub_string_suffixes
Array of strings
Default: []

A list of suffixes to permit on strings. e.g. ["s"] for plural forms.
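
The effect of permitted suffixes can be sketched with a regular expression that makes the suffix optional. This is an illustration under the assumption that word boundaries are also enforced; string_pattern is a hypothetical helper, not CRATE's own code.

```python
import re


def string_pattern(word: str, suffixes: list[str]) -> re.Pattern:
    """Regex for `word`, optionally followed by one of the permitted suffixes."""
    suffix_part = ""
    if suffixes:
        alternation = "|".join(re.escape(s) for s in suffixes)
        suffix_part = f"(?:{alternation})?"
    return re.compile(rf"\b{re.escape(word)}{suffix_part}\b")


sentence = "The Smiths visited; Mr Smith stayed."
with_suffix = string_pattern("Smith", ["s"]).sub("[__PPP__]", sentence)
without_suffix = string_pattern("Smith", []).sub("[__PPP__]", sentence)
```

With ["s"] both "Smiths" and "Smith" are replaced; with the default empty list, "Smiths" is left untouched because the word boundary falls after the "s".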

allowlist
object

Allowlist options.

denylist
object

Denylist options.

replace_patient_info_with
string
Default: "[__PPP__]"

Replace sensitive patient content with this.

replace_third_party_info_with
string
Default: "[__TTT__]"

Replace sensitive third party (e.g. family members') content with this.

replace_nonspecific_info_with
string
Default: "[~~~]"

Replace any other sensitive content with this.

replace_all_dates_with
string

When scrubbing all dates, replace with this text. If the replacement text includes supported datetime directives (%b, %B, %m, %Y, %y), the date is 'blurred' to include just those components.
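
The listed directives are the standard Python strftime codes, so the blurring can be pictured as formatting the detected date with the replacement text. The sketch below shows that idea with the standard library; blur_date is a hypothetical helper, and CRATE's own handling may differ in detail.

```python
import datetime


def blur_date(date: datetime.date, replacement: str) -> str:
    """Format the replacement text, filling in any strftime directives it contains."""
    return date.strftime(replacement)


original = datetime.date(2019, 3, 7)
blurred = blur_date(original, "[%m/%Y]")  # keeps month and year, drops the day
fixed = blur_date(original, "[~~~]")      # no directives: constant replacement text
```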

scrub_all_numbers_of_n_digits
Array of integers
Default: []

Scrub all numbers with these lengths (e.g. [10] for all UK NHS numbers).
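
A sketch of what scrubbing "all numbers of n digits" could look like as a regular expression, assuming the match must not sit inside a longer run of digits. The helper n_digit_pattern and the choice of replacement text are illustrative.

```python
import re


def n_digit_pattern(lengths: list[int]) -> re.Pattern:
    """Regex matching any standalone run of digits with one of the given lengths."""
    if not lengths:
        raise ValueError("At least one length is required")
    alternatives = "|".join(rf"\d{{{n}}}" for n in sorted(lengths, reverse=True))
    # Lookarounds stop a match from forming part of a longer number.
    return re.compile(rf"(?<!\d)(?:{alternatives})(?!\d)")


pattern = n_digit_pattern([10])
scrubbed = pattern.sub("[~~~]", "NHS 1234567890, phone 12345678901")
```

Here the 10-digit number is replaced, while the 11-digit number is left alone because it does not have exactly the requested length.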

scrub_all_uk_postcodes
boolean
Default: false

Scrub all UK postcodes.

scrub_all_dates
boolean
Default: false

Scrub all dates. Currently assumes the default locale for month names and ordinal suffixes.

scrub_all_email_addresses
boolean
Default: false

Scrub all e-mail addresses.

alternatives
Array of arrays of strings
Default: [[]]

List of alternative words to scrub. e.g.: [["Street", "St"], ["Road", "Rd"], ["Avenue", "Ave"]]
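
One plausible reading of this option is that each inner list names interchangeable words, so a phrase containing one of them should also be scrubbed in its other forms. The sketch below expands a phrase on that assumption; expand_alternatives is a hypothetical helper and the interpretation is not confirmed by this documentation.

```python
import itertools


def expand_alternatives(phrase: str, alternatives: list[list[str]]) -> list[str]:
    """Return every variant of `phrase` obtained by swapping equivalent words."""
    options_per_word = []
    for word in phrase.split():
        # Find the group of alternatives containing this word, if any.
        group = next(
            (g for g in alternatives if word.lower() in (a.lower() for a in g)),
            None,
        )
        options_per_word.append(group if group else [word])
    return [" ".join(combo) for combo in itertools.product(*options_per_word)]


variants = expand_alternatives(
    "5 Tree Road",
    [["Street", "St"], ["Road", "Rd"], ["Avenue", "Ave"]],
)
```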

Responses

Response Schema: application/json
text
required
object

The lines of text to be scrubbed, each keyed on a unique ID supplied by the caller.

patient
object

Specific patient data to be scrubbed.

third_party
object

Third party (e.g. family members') data to be scrubbed.

anonymise_codes_at_word_boundaries_only
boolean
Default: true

Ensure the codes to be scrubbed begin and end with a word boundary.

anonymise_dates_at_word_boundaries_only
boolean
Default: true

Ensure the dates to be scrubbed begin and end with a word boundary.

anonymise_numbers_at_word_boundaries_only
boolean
Default: false

Ensure the numbers to be scrubbed begin and end with a word boundary.

anonymise_numbers_at_numeric_boundaries_only
boolean
Default: true

Ensure the numbers to be scrubbed begin and end with a numeric boundary.

anonymise_strings_at_word_boundaries_only
boolean
Default: true

Ensure the strings to be scrubbed begin and end with a word boundary.

string_max_regex_errors
integer
Default: 0

The maximum number of typographical insertion/deletion/substitution errors to permit.

min_string_length_for_errors
integer
Default: 3

The minimum string length at which typographical errors will be permitted.

min_string_length_to_scrub_with
integer
Default: 2

Do not scrub strings shorter than this length.

scrub_string_suffixes
Array of strings
Default: []

A list of suffixes to permit on strings. e.g. ["s"] for plural forms.

allowlist
object

Allowlist options.

denylist
object

Denylist options.

replace_patient_info_with
string
Default: "[__PPP__]"

Replace sensitive patient content with this.

replace_third_party_info_with
string
Default: "[__TTT__]"

Replace sensitive third party (e.g. family members') content with this.

replace_nonspecific_info_with
string
Default: "[~~~]"

Replace any other sensitive content with this.

replace_all_dates_with
string

When scrubbing all dates, replace with this text. If the replacement text includes supported datetime directives (%b, %B, %m, %Y, %y), the date is 'blurred' to include just those components.

scrub_all_numbers_of_n_digits
Array of integers
Default: []

Scrub all numbers with these lengths (e.g. [10] for all UK NHS numbers).

scrub_all_uk_postcodes
boolean
Default: false

Scrub all UK postcodes.

scrub_all_dates
boolean
Default: false

Scrub all dates. Currently assumes the default locale for month names and ordinal suffixes.

scrub_all_email_addresses
boolean
Default: false

Scrub all e-mail addresses.

alternatives
Array of arrays of strings
Default: [[]]

List of alternative words to scrub. e.g.: [["Street", "St"], ["Road", "Rd"], ["Avenue", "Ave"]]

anonymised
required
object

The anonymised text, keyed on the unique IDs supplied by the caller in the 'text' parameter of the request.
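
Because the anonymised text is keyed on the caller's own IDs, the response can be matched back to the original lines directly. A minimal sketch, assuming the response has already been decoded from JSON into a dict; pair_with_originals is a hypothetical helper and the example response is made up for illustration, not real server output.

```python
def pair_with_originals(response_body: dict, original_text: dict) -> dict:
    """Pair each original line with its anonymised version, checking the IDs match."""
    anonymised = response_body["anonymised"]
    missing = set(original_text) - set(anonymised)
    if missing:
        raise KeyError(f"No anonymised text returned for IDs: {sorted(missing)}")
    return {key: (original_text[key], anonymised[key]) for key in original_text}


# Illustrative data only: the shape follows the response schema above.
sent = {"note1": "Seen by Dr Smith", "note2": "No issues"}
illustrative_response = {
    "text": sent,
    "anonymised": {"note1": "Seen by Dr [~~~]", "note2": "No issues"},
}
pairs = pair_with_originals(illustrative_response, sent)
```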

Request samples

Content type
application/json
{
  "text": { },
  "patient": { },
  "third_party": { },
  "anonymise_codes_at_word_boundaries_only": true,
  "anonymise_dates_at_word_boundaries_only": true,
  "anonymise_numbers_at_word_boundaries_only": false,
  "anonymise_numbers_at_numeric_boundaries_only": true,
  "anonymise_strings_at_word_boundaries_only": true,
  "string_max_regex_errors": 0,
  "min_string_length_for_errors": 3,
  "min_string_length_to_scrub_with": 2,
  "scrub_string_suffixes": [ ],
  "allowlist": { },
  "denylist": { },
  "replace_patient_info_with": "[__PPP__]",
  "replace_third_party_info_with": "[__TTT__]",
  "replace_nonspecific_info_with": "[~~~]",
  "replace_all_dates_with": "string",
  "scrub_all_numbers_of_n_digits": [ ],
  "scrub_all_uk_postcodes": false,
  "scrub_all_dates": false,
  "scrub_all_email_addresses": false,
  "alternatives": [ ]
}

Response samples

Content type
application/json
{
  "text": { },
  "patient": { },
  "third_party": { },
  "anonymise_codes_at_word_boundaries_only": true,
  "anonymise_dates_at_word_boundaries_only": true,
  "anonymise_numbers_at_word_boundaries_only": false,
  "anonymise_numbers_at_numeric_boundaries_only": true,
  "anonymise_strings_at_word_boundaries_only": true,
  "string_max_regex_errors": 0,
  "min_string_length_for_errors": 3,
  "min_string_length_to_scrub_with": 2,
  "scrub_string_suffixes": [ ],
  "allowlist": { },
  "denylist": { },
  "replace_patient_info_with": "[__PPP__]",
  "replace_third_party_info_with": "[__TTT__]",
  "replace_nonspecific_info_with": "[~~~]",
  "replace_all_dates_with": "string",
  "scrub_all_numbers_of_n_digits": [ ],
  "scrub_all_uk_postcodes": false,
  "scrub_all_dates": false,
  "scrub_all_email_addresses": false,
  "alternatives": [ ],
  "anonymised": { }
}