14.3.141. crate_anon.crateweb.research.research_db_info

crate_anon/crateweb/research/research_db_info.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Represents information about the structure of our research database(s).

class crate_anon.crateweb.research.research_db_info.ColumnInfo(**kwargs)[source]

Represents information about a database column, reflected from the database.

See also querybuilder.js.

__init__(**kwargs) None[source]
property basetype: str

Returns the SQL base type, such as VARCHAR.

property column_id: crate_anon.common.sql.ColumnId

Returns a crate_anon.common.sql.ColumnId describing this column.

property querybuilder_type: str

Returns a string that is defined in querybuilder.js and that defines our field type, like "int" or "date". See source.

property table_id: crate_anon.common.sql.TableId

Returns a crate_anon.common.sql.TableId describing this column’s table.

class crate_anon.crateweb.research.research_db_info.PatientFieldPythonTypes[source]

Represents Python types for each type of patient ID field.

MPID

alias of int

MRID

alias of str

PID

alias of int

RID

alias of str

TRID

alias of int

class crate_anon.crateweb.research.research_db_info.ResearchDatabaseInfo[source]

Fetches schema information from the research databases. There can be several, and this class represents the whole collection.

Notes:

  • Class primarily exists to be able to use @cached_property.

    • … replaced by @lru_cache

    • … replaced by @django_cache_function

__init__() None[source]
property dbs_with_secret_map: List[crate_anon.crateweb.research.research_db_info.SingleResearchDatabase]

Which of our databases has an associated secret lookup database to patient IDs?

Returns

a list of SingleResearchDatabase objects

property first_dbinfo: crate_anon.crateweb.research.research_db_info.SingleResearchDatabase

Returns the first SingleResearchDatabase that we know about.

property first_dbinfo_with_secret_map: Optional[crate_anon.crateweb.research.research_db_info.SingleResearchDatabase]

Returns the first SingleResearchDatabase that has a secret map (an associated secret lookup database to patient IDs), or None if there isn’t one.

get_dbinfo_by_name(name: str) crate_anon.crateweb.research.research_db_info.SingleResearchDatabase[source]

Returns the research database whose name (from settings.RESEARCH_DB_INFO) is name.

Parameters

name – the name of a database, as per settings.RESEARCH_DB_INFO

Returns

a SingleResearchDatabase

Raises

ValueError

get_dbinfo_by_schema_id(schema_id: crate_anon.common.sql.SchemaId) crate_anon.crateweb.research.research_db_info.SingleResearchDatabase[source]

Returns the first database representing the specified schema.

Parameters

schema_id – a crate_anon.common.sql.SchemaId

Returns

a SingleResearchDatabase

Raises

StopIteration

  • This is probably a functional duplicate of _get_db_info()!

get_default_database_name() str[source]

Returns the default “database name” for our dialect.

  • For Microsoft SQL Server, this is settings.DATABASES['research']['NAME']

  • For MySQL, this is blank

  • For PostgreSQL, this is blank

get_default_date_column(table: crate_anon.common.sql.TableId) Optional[crate_anon.common.sql.ColumnId][source]

Returns the default date column in the specified table (which may or may not exist).

Parameters

table – a crate_anon.common.sql.TableId

Returns

a crate_anon.common.sql.ColumnId, which may be blank

get_default_schema_name() str[source]

Returns the default “schema name” for our dialect.

  • For Microsoft SQL Server, this is 'dbo'

  • For PostgreSQL, this is 'public'

  • For MySQL, this is settings.DATABASES['research']['NAME']

get_excel() bytes[source]

Returns an XLSX (Excel) file detailing of all columns in all research databases.

Returns

binary XLSX file

Return type

bytes

get_linked_mrid_column(table: crate_anon.common.sql.TableId) Optional[crate_anon.common.sql.ColumnId][source]

Returns either (a) the MRID column in the schema containing the table specified, or (b) one that can be linked to it automatically.

Parameters

table – a crate_anon.common.sql.TableId

Returns

a crate_anon.common.sql.ColumnId, or None

get_mrid_column_from_schema(schema: crate_anon.common.sql.SchemaId) crate_anon.common.sql.ColumnId[source]

Returns the MRID column in the MRID master table for a given schema/database.

Parameters

schema – a crate_anon.common.sql.SchemaId

Returns

a crate_anon.common.sql.ColumnId, which may be blank

get_mrid_column_from_table(table: crate_anon.common.sql.TableId) crate_anon.common.sql.ColumnId[source]

Returns the MRID column in the specified table (which may or may not exist).

Parameters

table – a crate_anon.common.sql.TableId; this should be the MRID master table

Returns

a crate_anon.common.sql.ColumnId, which may be blank

get_rid_column(table: crate_anon.common.sql.TableId) crate_anon.common.sql.ColumnId[source]

Returns the RID column in the specified table (which may or may not exist).

Parameters

table – a crate_anon.common.sql.TableId

Returns

a crate_anon.common.sql.ColumnId, which may be blank

get_trid_column(table: crate_anon.common.sql.TableId) crate_anon.common.sql.ColumnId[source]

Returns the TRID column in the specified table (which may or may not exist).

Parameters

table – a crate_anon.common.sql.TableId

Returns

a crate_anon.common.sql.ColumnId, which may be blank

get_tsv() str[source]

Returns a tab-separated value (TSV) file detailing of all columns in all research databases.

Returns

TSV

Return type

str

property researchdb_schemas: List[crate_anon.common.sql.SchemaId]

Returns all crate_anon.common.sql.SchemaId values for our databases (one per database).

property single_research_db: bool

Do we have only a single research database?

property single_research_db_with_secret_map: bool

Do we have only one database that has an associated secret lookup database to patient IDs?

table_contains(table_id: crate_anon.common.sql.TableId, column_id: crate_anon.common.sql.ColumnId) bool[source]

Does the specified table contain the specified column?

Parameters
tables_containing_field(fieldname: str) List[crate_anon.common.sql.TableId][source]

Returns all tables containing the column (field) with the specified name.

Parameters

fieldname – field (column) name

Returns

a list of crate_anon.common.sql.TableId objects

  • We won’t use a SELECT on INFORMATION_SCHEMA here, since we already have the information.

text_columns(table_id: crate_anon.common.sql.TableId, min_length: int = 1) List[crate_anon.crateweb.research.research_db_info.ColumnInfo][source]

Returns all text columns from the specified table.

Parameters
Returns

a list of crate_anon.common.sql.ColumnInfo objects

classmethod uses_database_level() bool[source]

Does the database simultaneously offer a “database” level above its “schema” level?

class crate_anon.crateweb.research.research_db_info.SingleResearchDatabase(index: int, grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar, rdb_info: crate_anon.crateweb.research.research_db_info.ResearchDatabaseInfo, connection: django.db.backends.base.base.BaseDatabaseWrapper, vendor: str)[source]

Represents, and adds information to, a single entry from the RESEARCH_DB_INFO list: that is, a research database. (It’s a list because it’s ordered.)

__init__(index: int, grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar, rdb_info: crate_anon.crateweb.research.research_db_info.ResearchDatabaseInfo, connection: django.db.backends.base.base.BaseDatabaseWrapper, vendor: str) None[source]

Instantiates, reading database information as follows:

Parameters
  • index – Python zero-based index to settings.RESEARCH_DB_INFO

  • grammar – a cardinal_pythonlib.sql.sql_grammar.SqlGrammar, for the SQL dialect we’re using

  • rdb_info – a ResearchDatabaseInfo, describing the structure of the research database

  • connection – a django.db.backends.base.base.BaseDatabaseWrapper, i.e. a Django database connection

  • vendor – the Django database vendor name; see e.g. https://docs.djangoproject.com/en/2.1/ref/models/options/

can_communicate_directly(other: crate_anon.crateweb.research.research_db_info.SingleResearchDatabase) bool[source]

Can this database “talk” (link, join) to another?

Parameters

other – the other SingleResearchDatabase

Returns

True if they are the same database or share a common RID type

property colinfolist: List[crate_anon.crateweb.research.research_db_info.ColumnInfo]

Returns a list of ColumnInfo objects for our research database.

column_present(column_id: crate_anon.common.sql.ColumnId) bool[source]

Is the named column present in this database?

Parameters

column_id – a crate_anon.common.sql.ColumnId

property eligible_for_query_builder: bool

Is this database eligible to participate in the CRATE assisted query builder?

This is True if it’s the first database, or if it can link directly to the first database (shares a common RID), or if it and the first database share a common MRID.

get_default_date_field(table_id: crate_anon.common.sql.TableId) Optional[crate_anon.common.sql.ColumnId][source]

Gets the default date column for the specified table, or None if none exists.

Parameters

table_id – a crate_anon.common.sql.TableId

Returns

a crate_anon.common.sql.ColumnId, or None

get_schema_infodictlist(connection: django.db.backends.base.base.BaseDatabaseWrapper, vendor: str, debug: bool = False) List[Dict[str, Any]][source]

Fetch structure information for a specific database, by asking the database.

Parameters
Returns

A list of dictionaries, each mapping column names to values. The dictionaries are suitable for use as **kwargs to ColumnInfo.

property has_mrid: bool

Does this database have a defined master research ID (MRID), i.e. a table with one row per patient and one column that is the MRID?

property schema_identifier: str

Returns the SQL schema identifier.

property schema_infodictlist: List[Dict[str, Any]]

Discovers the schema. Returns the results of get_schema_infodictlist() for our connection and vendor. Implements caching.

property talks_to_world: bool

Does this database “talk to the world”, i.e. have an MRID?