14.1.19. crate_anon.anonymise.models

crate_anon/anonymise/models.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


SQLAlchemy ORM models for the CRATE anonymiser, representing information it stores in its admin database.

To create a SQLAlchemy Table programmatically:
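A minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. It builds a table at run time, with the PID column type chosen by the caller. The names make_table and demo_patient are hypothetical and are not part of CRATE.

```python
from sqlalchemy import (
    BigInteger,
    Column,
    MetaData,
    String,
    Table,
    create_engine,
    inspect,
)


def make_table(name, metadata, pid_type):
    """Build a Table whose PID column type is decided at run time.

    Hypothetical helper for illustration; not part of CRATE.
    """
    return Table(
        name,
        metadata,
        Column("pid", pid_type, primary_key=True),
        Column("rid", String(128)),  # wide enough for a SHA-512 hex digest
    )


metadata = MetaData()
demo_table = make_table("demo_patient", metadata, BigInteger)

# Create the table in a throwaway in-memory database and read its columns back.
engine = create_engine("sqlite://")
metadata.create_all(engine)
column_names = [c["name"] for c in inspect(engine).get_columns("demo_patient")]
```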

To create a SQLAlchemy ORM class programmatically:
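A minimal sketch, assuming SQLAlchemy 1.4 or later. It uses Python's built-in type() to create a declarative class whose column type is chosen at run time. The names make_model and DemoPatient are hypothetical and are not part of CRATE.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


def make_model(class_name, table_name, pid_type):
    """Create a mapped class dynamically.

    Hypothetical helper for illustration; not part of CRATE.
    """
    attrs = {
        "__tablename__": table_name,
        "pid": Column(pid_type, primary_key=True),
        "rid": Column(String(128)),
    }
    # type(name, bases, namespace) builds the class; the declarative base
    # maps it exactly as it would a class written out by hand.
    return type(class_name, (Base,), attrs)


DemoPatient = make_model("DemoPatient", "demo_patient", Integer)

# The default declarative constructor accepts mapped attributes as kwargs.
row = DemoPatient(pid=1, rid="abc")
```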

class crate_anon.anonymise.models.OptOutMpid(**kwargs)

Records the MPID values of patients opting out of the anonymised database.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

classmethod add(session: Session, mpid: int | str) → None

Add a record of a patient who wishes to opt out.

Parameters:
  • session – SQLAlchemy database session for the secret admin database

  • mpid – MPID of the patient who is opting out

classmethod opting_out(session: Session, mpid: int | str) → bool

Is this patient opting out?

Parameters:
  • session – SQLAlchemy database session for the secret admin database

  • mpid – MPID of the patient to test

Returns:

True if the patient is opting out, otherwise False
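The add/test pattern described above can be sketched as follows, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. DemoOptOut is a hypothetical stand-in written for illustration. It is not the CRATE class, and the real implementation may differ (for example, in how it handles duplicates or commits).

```python
from sqlalchemy import BigInteger, Column, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class DemoOptOut(Base):
    """Hypothetical opt-out table keyed on MPID (illustration only)."""

    __tablename__ = "demo_opt_out"
    mpid = Column(BigInteger, primary_key=True)

    @classmethod
    def add(cls, session, mpid):
        # merge() inserts the row if absent and is a no-op if already present,
        # so recording the same opt-out twice does not raise an error.
        session.merge(cls(mpid=mpid))

    @classmethod
    def opting_out(cls, session, mpid):
        # A primary-key lookup returns None when there is no matching row.
        return session.get(cls, mpid) is not None


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    DemoOptOut.add(session, 12345)
    session.commit()
    is_out = DemoOptOut.opting_out(session, 12345)
    is_in = DemoOptOut.opting_out(session, 99999)
```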

class crate_anon.anonymise.models.OptOutPid(**kwargs)

Records the PID values of patients opting out of the anonymised database.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

classmethod add(session: Session, pid: int | str) → None

Add a record of a patient who wishes to opt out.

Parameters:
  • session – SQLAlchemy database session for the secret admin database

  • pid – PID of the patient who is opting out

classmethod opting_out(session: Session, pid: int | str) → bool

Is this patient opting out?

Parameters:
  • session – SQLAlchemy database session for the secret admin database

  • pid – PID of the patient to test

Returns:

True if the patient is opting out, otherwise False

class crate_anon.anonymise.models.PatientInfo(**kwargs)

Represents patient information in the secret admin database.

Design decision in this class:

  • It gets too complicated if you try to make the fieldnames arbitrary and determined by the config.

  • So we always use ‘pid’, ‘rid’, etc.

    • Older config settings that this decision removes:

      mapping_patient_id_fieldname
      mapping_master_id_fieldname
      
    • Note that the following are still actively used, as they can be used to set the names in the OUTPUT database (not the mapping database):

      research_id_fieldname
      trid_fieldname
      master_research_id_fieldname
      source_hash_fieldname
      
  • The config is allowed to set three column types:

    • the source PID type (e.g. INT, BIGINT, VARCHAR)

    • the source MPID type (e.g. BIGINT)

    • the encrypted (RID, MRID) type, which is set by the encryption algorithm; e.g. VARCHAR(128) for SHA-512.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

ensure_rid() → None

Ensure that rid is a hashed version of pid.
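As a rough illustration of what "a hashed version of pid" can look like, the sketch below uses a keyed HMAC with SHA-512 from the Python standard library. This is an assumption: the actual algorithm and key handling are determined by the CRATE configuration, and demo_hash_pid and the key shown are hypothetical. It also shows why a SHA-512 based RID fits in a VARCHAR(128) column.

```python
import hashlib
import hmac


def demo_hash_pid(pid, secret_key: bytes) -> str:
    """Return a hex digest of the PID, keyed with a secret.

    Hypothetical helper for illustration; not the CRATE hasher.
    """
    # PIDs may be integers or strings, so normalise to text before hashing.
    message = str(pid).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha512).hexdigest()


rid = demo_hash_pid(12345, b"not-a-real-key")

# A SHA-512 digest is 64 bytes, i.e. 128 hexadecimal characters.
rid_length = len(rid)
```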

ensure_trid(session: Session) → None

Ensure that trid is a suitable transient research ID (TRID): the TRID we have already generated for this PID, or a fresh random integer that we’ll remember.

Parameters:

session – SQLAlchemy database session for the secret admin database

set_mpid(mpid: int | str) → None

Sets the MPID, and at the same time, the MRID (a hashed version of the MPID).

Parameters:

mpid – master patient ID (MPID) value

set_scrubber_info(scrubber: PersonalizedScrubber) → None

Sets our scrubber_hash to be the hash of the scrubber passed as a parameter.

If our crate_anon.anonymise.config.Config has its save_scrubbers flag set, then we also save the textual regex string for the patient scrubber and the third-party scrubber.

Parameters:

scrubber – a crate_anon.anonymise.scrub.PersonalizedScrubber

class crate_anon.anonymise.models.TridRecord(**kwargs)

Records the mapping from patient ID (PID) to integer transient research ID (TRID), and makes new TRIDs as required.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

classmethod get_trid(session: Session, pid: int | str) → int

Looks up the PID in the database and returns its corresponding TRID. If there isn’t one, makes a new one, stores the mapping, and returns the new TRID.

Parameters:
  • session – SQLAlchemy database session for the secret admin database

  • pid – patient ID (PID) value

Returns:

integer TRID

classmethod new_trid(session: Session, pid: int | str) → int

Creates a new TRID: a random integer that’s not yet been used as a TRID.

We check for existence by attempting the insert and seeing whether the database accepts it, not by first querying whether the value already exists (since other processes may be doing the same thing at the same time, and a query-then-insert sequence would be open to a race).
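The insert-first approach can be sketched with the standard-library sqlite3 module. This is an illustration under stated assumptions, not the CRATE implementation: the table demo_trid, the functions demo_new_trid and demo_get_trid, and the bound MAX_TRID are all hypothetical, and integer PIDs are assumed for simplicity.

```python
import random
import sqlite3

MAX_TRID = 2**31 - 1  # assumed upper bound for illustration


def demo_new_trid(conn, pid):
    """Insert a fresh random TRID for this PID, retrying on collision."""
    while True:
        candidate = random.randint(1, MAX_TRID)
        try:
            # "with conn" commits on success and rolls back on error.
            with conn:
                conn.execute(
                    "INSERT INTO demo_trid (pid, trid) VALUES (?, ?)",
                    (pid, candidate),
                )
            return candidate
        except sqlite3.IntegrityError:
            # Either the TRID is already taken, or another process has
            # stored a TRID for this PID in the meantime. In the latter
            # case, use the stored value; otherwise loop and try again.
            row = conn.execute(
                "SELECT trid FROM demo_trid WHERE pid = ?", (pid,)
            ).fetchone()
            if row is not None:
                return row[0]


def demo_get_trid(conn, pid):
    """Return the stored TRID for this PID, creating one if needed."""
    row = conn.execute(
        "SELECT trid FROM demo_trid WHERE pid = ?", (pid,)
    ).fetchone()
    if row is not None:
        return row[0]
    return demo_new_trid(conn, pid)


conn = sqlite3.connect(":memory:")
# The UNIQUE constraint on trid is what lets the database arbitrate.
conn.execute(
    "CREATE TABLE demo_trid ("
    "pid INTEGER PRIMARY KEY, trid INTEGER NOT NULL UNIQUE)"
)

first = demo_get_trid(conn, 42)
second = demo_get_trid(conn, 42)  # same PID: same TRID
other = demo_get_trid(conn, 43)   # different PID: different TRID
```

Letting the uniqueness constraints reject duplicates keeps the check and the write in one atomic step, which is the property the text above relies on.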