12.3.146. crate_anon.crateweb.research.models

crate_anon/crateweb/research/models.py

Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.

class crate_anon.crateweb.research.models.ArchiveAttachmentAudit(*args, **kwargs)[source]

Audit log for access to an archive attachment.

exception DoesNotExist

exception MultipleObjectsReturned

class crate_anon.crateweb.research.models.ArchiveTemplateAudit(*args, **kwargs)[source]

Audit log for access to an archive template.

exception DoesNotExist

exception MultipleObjectsReturned

class crate_anon.crateweb.research.models.Highlight(*args, **kwargs)[source]

Represents the highlighting of a query.

exception DoesNotExist

exception MultipleObjectsReturned

activate() → None[source]: Mark this highlight as active.

static as_ordered_dict(highlight_list: Iterable[Highlight]) → Dict[int, List[Highlight]][source]

Converts a iterable of Highlight objects into a dictionary that collects them by highlight number.

Parameters:: highlight_list – list of Highlight objects
Returns:: an OrderedDict whose keys are highlight colour numbers (in ascending order), and whose values are lists of all the Highlight objects using that highlight colour number

deactivate() → None[source]: Mark this highlight as inactive.

static get_active_highlights(request: HttpRequest) → QuerySet[source]

Return all active highlights for the current user.

Parameters:: request – the django.http.request.HttpRequest
Returns:: a django.db.models.QuerySet of the Highlight objects

get_safe_colour() → int[source]: Returns our colour attribute, coerced to the range [0, N_CSS_HIGHLIGHT_CLASSES - 1] (inclusive).

class crate_anon.crateweb.research.models.PatientExplorer(*args, **kwargs)[source]

Class to explore the research database on a per-patient basis.

exception DoesNotExist

exception MultipleObjectsReturned

__init__(*args, **kwargs) → None[source]

activate() → None[source]: Activates this PatientExplorer (and deactivates any others).

all_queries(mrids: List[Any] | None = None) → List[TableQueryArgs][source]

Returns all queries from our patient_multiquery. See PatientMultiQuery.all_queries()

Parameters:: mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_multiquery’s PatientMultiQuery.patient_id_query().
Returns:: a list of TableQueryArgs objects (q.v.)
Return type:: list

audit(count_only: bool = False, n_records: int = 0, failed: bool = False, fail_msg: str = '') → None[source]

Audit the execution of this query:

insert an audit entry referring to this query
mark the query as having been audited (so it’s not deleted)

Parameters:

count_only – did we know (in advance) that this was a COUNT()-only query?
n_records – how many records were returned?
failed – did the query fail?
fail_msg – if the query failed, the associated failure message

property data_finder_excel: bytes

Returns an XSLX (Excel) file containing summary (count) information for each table.

See PatientMultiQuery.gen_data_finder_queries().

delete_if_permitted() → None[source]

Delete the query.

If a query has been executed and therefore audited, it isn’t properly deleted; it’s just marked as deleted.
If a query has never been executed, we can delete it entirely.

static get_active_pe_id_or_none(request: HttpRequest) → int | None[source]

Parameters:: request – the django.http.request.HttpRequest
Returns:: The integer PK of the active PatientExplorer for the user, or None.

static get_active_pe_or_none(request: HttpRequest) → PatientExplorer | None[source]

Parameters:: request – the django.http.request.HttpRequest
Returns:: The active PatientExplorer for the user, or None.

static get_executed_cursor(sql: str, args: List[Any] | None = None) → CursorWrapper[source]

Executes a query (via the research database) and returns its cursor.

Parameters:

sql – SQL text
args – arguments to SQL query

Returns:

a django.db.backends.utils.CursorWrapper, which is a context manager that behaves as the executed cursor and also closes it on completion

get_patient_id_query(with_order_by: bool = True) → str[source]

Returns SQL from our patient_multiquery’s PatientMultiQuery.patient_id_query() (q.v.).

Parameters:: with_order_by – see PatientMultiQuery.patient_id_query()

get_patient_mrids() → List[int][source]: Returns all MRIDs from our patient_multiquery’s PatientMultiQuery.patient_id_query().

get_xlsx_binary() → bytes[source]

Returns an XLSX (Excel) file containing spreadsheets, one for each table in our patient_multiquery.

Other notes:

cell size: see https://stackoverflow.com/questions/13197574/python-openpyxl-column-width-size-adjust; and the “auto_size” / “bestFit” options don’t really do the job, according to the interweb.

get_zipped_tsv_binary() → bytes[source]: Returns a ZIP file containing TSVs, one for each table in our patient_multiquery.

property has_output_columns: bool

Does our our patient_multiquery have output columns?

See PatientMultiQuery.has_output_columns().

property has_patient_id_query: bool

Does our our patient_multiquery have a patient ID query?

See PatientMultiQuery.has_patient_id_query().

mark_audited() → None[source]: Mark the query as having been executed and audited. (This prevents it from being wholly deleted.)

mark_deleted() → None[source]

Mark the query as deleted.

This will stop it being shown. It will not delete it from the database.

We use this deletion method for queries that have been executed, so need an audit trail.

save(*args, update_fields=None, **kwargs) → None[source]: Custom save method. Ensures that only one PatientExplorer has active == True for a given user. Also sets the hash.

property summary_html: str: Return HTML summarizing this object.

class crate_anon.crateweb.research.models.PatientExplorerAudit(*args, **kwargs)[source]

Audit log for a PatientExplorer.

exception DoesNotExist

exception MultipleObjectsReturned

class crate_anon.crateweb.research.models.PatientMultiQuery(output_columns: List[ColumnId] | None = None, patient_conditions: List[WhereCondition] | None = None, manual_patient_id_query: str = '')[source]

Represents a set of queries across many tables relating to one or several patients (but the same patients across all the tables).

Used for the Patient Explorer.

Development notes:

Patient ID query

Single database is easy; we can use RID or TRID, and therefore TRID for performance.

Note that UNION gives only DISTINCT results by default (UNION ALL gives everything); see https://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all.

-- Clear, but extensibility of boolean logic less clear:
SELECT trid
    FROM diagnosis_table
    WHERE diagnosis LIKE 'F20%'
INTERSECT
SELECT trid
    FROM progress_note_table
    WHERE note LIKE '%schizophreni%' OR note LIKE '%depression%'
ORDER BY trid
-- ... logic across tables requires careful arrangement of UNION vs. INTERSECT
-- ... logic for multiple fields within one table can be done with AND/OR

-- Slower (?), but simpler to manipulate logic?
SELECT DISTINCT something.trid
FROM diagnosis_table INNER JOIN progress_note_table
ON diagnosis_table.trid = progress_note_table.trid
WHERE
    diagnosis_table.diagnosis LIKE 'F20%'
    AND (progress_note_table.note LIKE '%schizophreni%'
         OR progress_note_table.notenote LIKE '%depression%')
ORDER BY something.trid
-- ... boolean logic can all be encapsulated in a single WHERE clause
-- ... can also share existing join code
-- ... ?reasonable speed since the TRID fields will be indexed
-- ... preferable.

Which ID for the patient ID query?
- the TRID (for speed, inc. sorting) of the first database
- can use the TRID from the first “where clause” table (don’t have to join to a master patient table)
- join everything across databases as before
Results queries
```
-- Something like:

SELECT rid, date_of_note, note
FROM progress_note_table
WHERE trid IN ( ... patient_id_query ... )
ORDER BY trid

SELECT rid, date_of_diagnosis, diagnosis, diagnosis_description
FROM diagnosis_table
WHERE trid IN ( ... patient_id_query ... )
ORDER BY trid
```
This means we will repeat the patient_id_query, which may be inefficient. Options:
- store the TRIDs in Python, then pass them as arguments
  - at which point the SQL string/packet length becomes relevant;
  - https://stackoverflow.com/questions/1869753/maximum-size-for-a-sql-server-query-in-clause-is-there-a-better-approach
  - https://stackoverflow.com/questions/16335011/what-is-maximum-query-size-for-mysql
  - https://stackoverflow.com/questions/96553/practical-limit-to-length-of-sql-query-specifically-mysql
- let the database worry about it
  - probably best for now!
Display
- One patient per page, with multiple results tables.
Boolean logic on patient selection
- … within

__init__(output_columns: List[ColumnId] | None = None, patient_conditions: List[WhereCondition] | None = None, manual_patient_id_query: str = '')[source]

Parameters:

output_columns – database columns that will be in the output, as list of crate_anon.common.sql.ColumnId objects
patient_conditions – restrictions on the patient, as a list of crate_anon.common.sql.WhereCondition objects; they will be joined with AND
manual_patient_id_query – raw SQL; if specified, overrides patient_conditions and is used as the patient-finding part of the query; see set_override_query()

add_output_column(column_id: ColumnId) → None[source]: Adds a database column to the output.

add_patient_condition(where: WhereCondition) → None[source]

Adds a patient WHERE condition.

Parameters:: where – a crate_anon.common.sql.WhereCondition

property all_full_queries: List[TableQueryArgs]

Returns all final queries. This is a list of multiple SQL queries, each retrieving information from one table, and all retrieving information for the same patient(s).

The patients we use are defined by our patient_id_query().

Returns:: a list of TableQueryArgs objects (q.v.)
Return type:: list

all_queries(mrids: List[Any] | None = None) → List[TableQueryArgs][source]

Returns all final queries. This is a list of multiple SQL queries, each retrieving information from one table, and all retrieving information for the same patient(s).

The patients we use are defined either by the MRID list given, or if that is empty or blank, our patient_id_query().

Parameters:: mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_id_query().
Returns:: a list of TableQueryArgs objects (q.v.)
Return type:: list

all_queries_specific_patients(mrids: List[int]) → List[TableQueryArgs][source]

Returns all final queries. This is a list of multiple SQL queries, each retrieving information from one table, and all retrieving information for the same patient(s).

The patients we use are defined by the MRID list given.

Parameters:: mrids – list of MRIDs
Returns:: a list of TableQueryArgs objects (q.v.)
Return type:: list

clear_output_columns() → None[source]: Removes all output columns from the multiquery.

clear_patient_conditions() → None[source]: Removes all WHERE conditions on the patient.

gen_data_finder_queries(mrids: List[Any] | None = None) → Generator[TableQueryArgs, None, None][source]

Generates a set of queries that, when executed, return the following summary columns from each of our tables, filtered for patients by our where_patient_clause(), and grouped by master_research_id (MRID):

master_research_id,
table_name,
COUNT(*) AS n_records,
MIN(date_column) AS min_date,  -- NULL if no date column
MAX(date_column) AS max_date   -- NULL if no date column

These queries can be used to see quickly which tables have interesting information in.

Parameters:: mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_id_query().
Yields:: TableQueryArgs objects (q.v.)

gen_monster_queries(mrids: List[int] | None = None) → Generator[TableQueryArgs, None, None][source]

Generates a set of queries that, when executed, return SELECT * from each of our tables, filtered for patients by our where_patient_clause(). So it’s like the basic Patient Explorer but with all columns in the output.

These queries are used in the Patient Explorer “Monster Data” view.

Parameters:: mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_id_query().
Yields:: TableQueryArgs objects (q.v.)

property has_output_columns: bool: Does this multiquery have any output columns?

property has_patient_id_query: bool: Does this multiquery have a patient ID query? This can either be one that the user has specified manually, or one built from WHERE conditions that appears to refer to an MRID.

property hash64: int: Return an integer (non-cryptographic) hash of the query.

make_query(table_id: TableId, columns: List[ColumnId], mrids: List[Any] | None = None) → TableQueryArgs[source]

Returns an SQL query to retrieve information from a single table for certain patients. This query is similar to SELECT a, b, c FROM sometable WHERE sometable.mrid IN (1, 2, 3) or SELECT a, b, c FROM sometable WHERE sometable.mrid IN (SELECT mrid FROM masterpatienttable). This then forms one query from (potentially) many for our patient(s).

Parameters:

table_id – a crate_anon.common.sql.TableId for the SELECT FROM table
columns – columns, specified as a list of crate_anon.common.sql.ColumnId, to select from the table (in addition to which, we will always select the MRID column from that table)
mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_id_query().

Returns:

a TableQueryArgs object (q.v.)

property manual_patient_id_query: str: Returns the manual override SQL for the patient ID query.

property ok_to_run: bool: Is this OK to run, i.e. does it have a patient ID query and some output columns?

property output_cols_html: str: Returns all our output columns in HTML format.

property output_columns: List[ColumnId]: Returns the output columns, as a list of crate_anon.common.sql.ColumnId objects.

property patient_conditions: List[WhereCondition]: Returns all WHERE conditions restricting the patient, as a list of crate_anon.common.sql.WhereCondition objects.

patient_id_query(with_order_by: bool = True) → str[source]

Returns an SQL SELECT statement based on the list of WHERE conditions already stored, joined with AND by default. (If a manual patient ID query has been specified, return that instead.)

Parameters:: with_order_by – add an ORDER BY query on the MRID; such an ordering is important for consistency across runs (but is prohibited by SQL Server in subqueries – “The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, … unless TOP, OFFSET or FOR XML is specified.”)
Returns:: SQL
Return type:: str

property pt_conditions_html: str: Returns all our patient WHERE conditions in HTML format.

set_override_query(query: str) → None[source]

Sets the manual override SQL for the patient ID query.

Parameters:: query – raw SQL

This query should return a single column of MRID values that is fetched into Python and used to restrict other queries. Here’s a fictional example to fetch the MRIDs for all patients who have the word “neutrophils” in their notes:

SELECT DISTINCT anonymous_output.patient.nhshash AS _mrid
FROM anonymous_output.patient
INNER JOIN anonymous_output.note ON anonymous_output.note.trid = anonymous_output.patient.trid
WHERE MATCH (anonymous_output.note.note) AGAINST ('neutrophils')
    AND anonymous_output.patient.nhshash IS NOT NULL
ORDER BY _mrid

summary_html(element_counter: HtmlElementCounter) → str[source]

Returns an HTML representation of this multiquery.

Parameters:: element_counter – a crate_anon.crateweb.research.html_functions.HtmlElementCounter, which will be modified
Returns:: HTML
Return type:: str

where_patient_clause(table_id: TableId, mrids: List[Any] | None = None) → Tuple[str, List[Any]][source]

Returns an SQL WHERE clauses similar to sometable.mrid IN (1, 2, 3) or sometable.mridcol IN (SELECT mrid FROM masterpatienttable). The clause is used to restrict patients by MRID.

Parameters:

table_id – crate_anon.common.sql.TableId for the table whose MRID column we will apply the WHERE clause to
mrids – list of MRIDs; if this is None or empty, use the patients fetched (live) by our patient_id_query().

Returns:

sql, args

Return type:

tuple

class crate_anon.crateweb.research.models.PidLookup(*args, **kwargs)[source]

Lookup class for secret RID-to-PID conversion.

Used via one or other of the ‘secret’ database connections.
Intended for READ-ONLY access to that table.
Since we have fixed the tablenames for the anonymiser, we remove the settings.SECRET_MAP option. See crate_anon.anonymise.models.PatientInfo. Moreover, we fix the maximum length, regardless of the specifics of the config used.
Use as e.g. Lookup(pid=XXX).

exception DoesNotExist

exception MultipleObjectsReturned

save(*args, **kwargs) → None[source]

Save the current instance. Override this in a subclass if you want to control the saving process.

The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.

class crate_anon.crateweb.research.models.Query(*args, **kwargs)[source]

Class to query the research database.

exception DoesNotExist

exception MultipleObjectsReturned

__init__(*args, **kwargs) → None[source]: Initialize our cache.

activate() → None[source]: Activate this query (and deactivates any others).

audit(count_only: bool = False, n_records: int = 0, failed: bool = False, fail_msg: str = '') → None[source]

Audit the execution of this query:

insert an audit entry referring to this query
mark the query as having been audited (so it’s not deleted)

Parameters:

count_only – did we know (in advance) that this was a COUNT()-only query?
n_records – how many records were returned?
failed – did the query fail?
fail_msg – if the query failed, the associated failure message

delete_if_permitted() → None[source]

Delete the query.

If a query has been executed and therefore audited, it isn’t properly deleted; it’s just marked as deleted.
If a query has never been executed, we can delete it entirely.

dictfetchall() → List[Dict[str, Any]][source]

Executes the query and returns all results as a list of OrderedDicts (one for each row, mapping column names to values).

Raises:: DatabaseError –

gen_display_rows() → Generator[List[Any], None, None][source]: Generates all filtered rows.

gen_rows() → Generator[List[Any], None, None][source]

Generate rows from the query.

Raises:: DatabaseError –

static get_active_query_id_or_none(request: HttpRequest) → int | None[source]

Returns the active query’s integer ID for this user, or None.

Parameters:: request – the django.http.request.HttpRequest
Returns:: the active query’s integer PK, or None.

static get_active_query_or_none(request: HttpRequest) → Query | None[source]

Returns the active query for this user, or None.

Parameters:: request – the django.http.request.HttpRequest
Returns:: a Query, or None.

get_column_names() → List[str][source]

Returns column names from the query’s cursor.

Raises:: DatabaseError –

get_display_column_names() → List[str][source]: Returns the filtered column names.

get_display_indexes() → List[int] | None[source]

Returns the indexes of the result columns that we wish to display. Uses _get_display_indexes() and caches it.

Raises:: DatabaseError –

get_display_list() → List[str][source]: Returns a list of columns to display, from our internal JSON representation. Uses _get_display_list() and caches it.

get_display_rows() → List[List[Any]][source]: Returns a list of all filtered rows.

get_executed_cursor() → CursorWrapper[source]

Get cursor with a query executed (based on our attributes sql, args, raw, qmark).

Returns:: a django.db.backends.utils.CursorWrapper

Do NOT use this with with, as in:

with query.get_executed_cursor() as cursor:
    # do stuff

You could do that (and in general it’s what Django advises) but we are trying to be fancy here and use the cursor more efficiently.

get_rowcount() → int[source]

Returns the rowcount from the cursor.

Raises:: DatabaseError –

get_rows() → List[List[Any]][source]

Returns all rows from the query, as a list.

Raises:: DatabaseError –

get_sql_args_for_django() → Tuple[str, List[Any] | None][source]

Get sql/args in a format suitable for Django, with %s placeholders, or as escaped raw SQL.

Returns:: sql, args
Return type:: tuple

If raw is set, return our raw SQL with % escaped to %%;
otherwise, if qmark is set, return our raw SQL with ? argument placeholders translated to %s argument placeholders;
otherwise, return the raw SQL.

make_excel() → bytes[source]: Executes the query and returns an Excel workbook, in binary.

make_tsv() → str[source]: Executes the query and returns a TSV result (as a multiline string).

mark_audited() → None[source]: Mark the query as having been executed and audited. (This prevents it from being wholly deleted.)

mark_deleted() → None[source]

Mark the query as deleted.

This will stop it being shown. It will not delete it from the database.

We use this deletion method for queries that have been executed, so need an audit trail.

save(*args, update_fields=None, **kwargs) → None[source]: Custom save method. Ensures that only one Query has active == True for a given user. Also sets the hash.

set_display_list(display_list: List[str]) → None[source]

Sets the internal JSON field, stored in the database, from a list of column headings to display.

Parameters:: display_list – list of columns to display

class crate_anon.crateweb.research.models.QueryAudit(*args, **kwargs)[source]

Audit log for a query.

exception DoesNotExist

exception MultipleObjectsReturned

class crate_anon.crateweb.research.models.QueryBase(*args, **kwargs)[source]

Abstract base class for the two query classes.

get_formatted_sql() → str[source]: Getter for ‘formatted_sql’.

get_original_sql() → str[source]: Returns the stored raw SQL.

set_formatted_sql(reformat: bool = False) → None[source]: Sets ‘formatted_sql’ by highlighting the syntax and possibly reformatting.

class crate_anon.crateweb.research.models.SitewideQuery(*args, **kwargs)[source]

Class representing a site-wide query for research database.

Site-wide queries are not attached to any particular user.
They are templatized with placeholders.
Placeholders begin with [[ and end with ]].
The user is asked to fill in values for the placeholders.

exception DoesNotExist

exception MultipleObjectsReturned

property prettified_chunks: List[str]: Returns chunks (see sql_chunks) but with formatting.

save(*args, update_fields=None, **kwargs) → None[source]: Custom save method. Sets the hash.

property sql_chunks: List[str]

Returns a list of SQL chunks and placeholders made from the original SQL. Placeholders begin with [[ and end with ]].

For example, if the sql is

SELECT * FROM [[table]] WHERE brcid="[[brcid]]";

then sql_chunks will be

[
    'SELECT * FROM ',
    'table',
    ' WHERE brcid="',
    'brcid',
    '";'
]

Note that the first element (and all elements with even [zero-based] list indexes) are SQL, not placeholders. All elements with odd indexes are placeholders.

class crate_anon.crateweb.research.models.TableQueryArgs(table_id: TableId, sql: str, args: List[Any])[source]

Represents SQL for a specific table, with arguments for the SQL. Used by PatientMultiQuery.

__init__(table_id: TableId, sql: str, args: List[Any]) → None[source]

Parameters:

table_id – a crate_anon.common.sql.TableId that this query is selecting from
sql – SQL text
args – a list of arguments to the SQL

crate_anon.crateweb.research.models.debug_query() → None[source]: Executes a test query that just selects a constant, using the research database (i.e. connections['research']).

crate_anon.crateweb.research.models.gen_excel_row_elements(worksheet: Worksheet, row: Iterable) → Generator[Any, None, None][source]

Given an Excel worksheet row, generate individual cell contents, cell by cell.

Parameters:

worksheet – a openpyxl.worksheet.worksheet.Worksheet, which we need in order to find the worksheet’s encoding
row – the row to iterate through

Yields:

the contents of each cell

Reasons for this function:

We need a tuple/list/generator, as openpyxl checks its types manually.

We want to have a Worksheet object from openpyxl, and say something like
ws.append(row)
where “row” has come from a database query.
However, openpyxl doesn’t believe in duck-typing; see Worksheet.append() in openpyxl/worksheet/worksheet.py. So sometimes the plain append works (e.g. from MySQL results), but sometimes it fails, e.g. when the row is of type pyodbc.Row.

So we must coerce it to a tuple, list, or generator.

A generator will be the most efficient.

If a string fails certain checks, openpyxl will raise an IllegalCharacterError exception. We need to work around that. We’ll use the “forgiveness, not permission” maxim. Specifically, it dislikes strings matching its ILLEGAL_CHARACTERS_RE, which contains unprintable low characters matching this:
r'[\000-\010]|[\013-\014]|[\016-\037]'
Note the use of octal; \037 is decimal 31.

openpyxl gets to its Cell.check_string() function for these types:
STRING_TYPES = (basestring, unicode, bytes)
In Python 3, this means (str, str, bytes). So we should check str and bytes. (For bytes, we’ll follow its method of converting to str in the encoding of the worksheet’s choice.)

crate_anon.crateweb.research.models.get_executed_researchdb_cursor(sql: str, args: List[Any] | None = None) → CursorWrapper[source]

Executes a query on the research database. Returns a wrapped cursor that can be used as a context manager that will close the cursor on completion.

Parameters:

sql – SQL text
args – arguments to SQL query

Returns:

a django.db.backends.utils.CursorWrapper, which is a context manager that behaves as the executed cursor and also closes it on completion

Test code:

import os
import django
os.environ['DJANGO_SETTINGS_MODULE'] = 'crate_anon.crateweb.config.settings'
django.setup()
from crate_anon.crateweb.research.models import *
c = get_executed_researchdb_cursor("SELECT 1")

crate_anon.crateweb.research.models.get_executed_researchdb_cursor_qmark_placeholders(sql: str, args: List[Any] | None = None) → CursorWrapper[source]: As for get_executed_researchdb_cursor(), but assumes its SQL may contain question-mark parameter placeholders (?) and translates these to the ones we need internally.

crate_anon.crateweb.research.models.get_mpid(dbinfo: SingleResearchDatabase, trid: int | None = None, rid: str | None = None, mrid: str | None = None) → int[source]

Returns the MPID for a patient, looked up from one of the research IDs.

Parameters:

dbinfo – a crate_anon.crateweb.research.research_db_info.SingleResearchDatabase
trid – optional transient research identifier (TRID) value
rid – optional research identifier (RID) value
mrid – optional master research identifier (MRID) value

Returns:

the integer MPID, or None

Raises:

ValueError –

crate_anon.crateweb.research.models.get_pid(dbinfo: SingleResearchDatabase, trid: int | None = None, rid: str | None = None, mrid: str | None = None) → int[source]

Returns the PID for a patient, looked up from one of the research IDs.

Parameters:

dbinfo – a crate_anon.crateweb.research.research_db_info.SingleResearchDatabase
trid – optional transient research identifier (TRID) value
rid – optional research identifier (RID) value
mrid – optional master research identifier (MRID) value

Returns:

the integer PID, or None

Raises:

ValueError –

Looks up a patient in the secret lookup database associated with a database, from one of several possible identifiers.

Parameters:

dbinfo – a crate_anon.crateweb.research.research_db_info.SingleResearchDatabase
pid – optional patient identifier (PID) value
mpid – optional master patient identifier (MPID) value
trid – optional transient research identifier (TRID) value
rid – optional research identifier (RID) value
mrid – optional master research identifier (MRID) value

Returns:

a crate_anon.crateweb.research.models.PidLookup or None

Raises:

ValueError –

crate_anon.crateweb.research.models.hack_django_pyodbc_azure_cursorwrapper() → None[source]

Monkey-patch part of the sql_server.pyodbc library from django-pyodbc-azure. It replaces the fetchone() method with a version that doesn’t call cursor.nextset() automatically.

It looks like this becomes unnecessary in django-pyodbc-azure==2.0.6.1 or similar, because the call to ``cursor.nextset()`` is now only performed ``if not self.connection.supports_mars``.

Notes

I thought I wanted to modify an instance, not a class (https://tryolabs.com/blog/2013/07/05/run-time-method-patching-python/).
To modify a class, we do SomeClass.method = newmethod.
But to modify an instance, we use instance.method = types.MethodType(newmethod, instance).
However, it turned out the instance was actually part of a long chain of cursor wrappers, including the Django debug toolbar. Classes included debug_toolbar.panels.sql.tracking.NormalCursorWrapper; django.db.backends.utils.CursorDebugWrapper. And in any case, modifying the class is a sensible thing.

crate_anon.crateweb.research.models.replacement_sqlserver_pyodbc_cursorwrapper_fetchone(self) → List[Any][source]: A function to replace CursorWrapper.fetchone() in sql_server/pyodbc/base.py from django-pyodbc-azure. This replacement function does not call cursor.nextset().