6.6. Specific databases
Servelec RiO data exports come in several formats, including:
“RCEP”: preprocessed by Servelec’s RiO CRIS Extraction Program.
Individual organizations may process these too. CRATE provides a preprocessor (crate_preprocess_rio) to convert a RiO database to a format suitable for anonymisation via CRATE.
TPP provide a “strategic reporting extract” (SRE) containing SystmOne data. This contains structured data, but can contain free text too.
The structure of the SRE is good from CRATE’s perspective; it does not require reshaping for anonymisation.
The crate_preprocess_systmone tool indexes a SystmOne source database; without these indexes, anonymisation is very slow. It can also, optionally, create a view that adds blurred geographical information, if you have used the crate_postcodes tool to import UK Office for National Statistics geography data into a database.
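To illustrate why indexing matters here, the following minimal sketch uses Python's built-in sqlite3 module to compare the query plan for a per-patient lookup before and after an index is added. It is not CRATE's implementation: the table name SRPatientDemo, the index name idx_demo_idpatient, and the Note column are hypothetical, and a real SystmOne source database will usually live on a different database engine.

```python
import sqlite3

# Hypothetical demo table; real SRE tables and columns will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SRPatientDemo ("
    "RowIdentifier INTEGER PRIMARY KEY, IDPatient INTEGER, Note TEXT)"
)
conn.executemany(
    "INSERT INTO SRPatientDemo (IDPatient, Note) VALUES (?, ?)",
    [(i % 100, f"note {i}") for i in range(1000)],
)


def query_plan(connection, sql, params=()):
    """Return SQLite's query plan for a statement as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


# An anonymiser typically fetches rows one patient at a time.
lookup = "SELECT Note FROM SRPatientDemo WHERE IDPatient = ?"

plan_before = query_plan(conn, lookup, (42,))  # full table scan
conn.execute("CREATE INDEX idx_demo_idpatient ON SRPatientDemo (IDPatient)")
plan_after = query_plan(conn, lookup, (42,))   # index search

print("Before:", plan_before)
print("After: ", plan_after)
```

Without the index, every per-patient lookup scans the whole table, and that cost is repeated for each patient and each table.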
NHS numbers, which are 10-digit integers incorporating a checksum, are represented in our database copy of the SRE by the VARCHAR(10) data type (clearly a little suboptimal). It remains OK to use an integer type for these in your anonymiser config file:
    sqlatype_mpid = BigInteger
    #
    # Within CPFT, we have some locally created columns with string versions of
    # the primary SystmOne ID, and so forth, so we use:
    #
    # sqlatype_pid = String(100)
    # sqlatype_mpid = String(100)
However, you will see some warnings during config checking. See sqlatype_mpid.
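As background to the integer representation, the sketch below validates a 10-character string against the standard NHS number modulus-11 checksum and converts it to an integer. The function names are hypothetical and are not part of CRATE; the example number is used only because it passes the checksum.

```python
from typing import Optional


def nhs_check_digit(first_nine: str) -> Optional[int]:
    """Return the modulus-11 check digit for nine digits.

    Returns None if the input is malformed or no valid check digit exists.
    """
    if len(first_nine) != 9 or not (first_nine.isascii() and first_nine.isdigit()):
        return None
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return None  # such a number is never issued
    return check


def is_valid_nhs_number(value: str) -> bool:
    """True if value is exactly 10 digits and its last digit is the checksum."""
    if len(value) != 10 or not (value.isascii() and value.isdigit()):
        return False
    return nhs_check_digit(value[:9]) == int(value[9])


def nhs_varchar_to_int(value: str) -> Optional[int]:
    """Convert a VARCHAR-style NHS number to an int, or None if invalid."""
    cleaned = value.strip()
    return int(cleaned) if is_valid_nhs_number(cleaned) else None


print(nhs_varchar_to_int("9434765919"))
print(nhs_varchar_to_int("9434765918"))
```

A valid NHS number fits comfortably in a 64-bit integer, which is why an integer type can still be used for the master patient ID even though the source column is a string.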
When generating a data dictionary, use these settings for your source database:
    ddgen_omit_by_default = False
    # ... or use "--systemone_include_generic" with crate_anon_draft_dd
    # ... or use True if you want to hand-review everything
    ddgen_per_table_pid_field = IDPatient
    # ... largely cosmetic; improves the warnings if your local database
    # modifications have an odd structure.