14.3.146. crate_anon.crateweb.research.research_db_info
crate_anon/crateweb/research/research_db_info.py
Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.
Represents information about the structure of our research database(s).
- class crate_anon.crateweb.research.research_db_info.ColumnInfo(**kwargs)[source]
Represents information about a database column, reflected from the database.
See also
querybuilder.js
.- property basetype: str
Returns the SQL base type, such as
VARCHAR
.
- property column_id: ColumnId
Returns a
crate_anon.common.sql.ColumnId
describing this column.
- property querybuilder_type: str
Returns a string that is defined in
querybuilder.js
and that defines our field type, like"int"
or"date"
. See source.
- property table_id: TableId
Returns a
crate_anon.common.sql.TableId
describing this column’s table.
- class crate_anon.crateweb.research.research_db_info.PatientFieldPythonTypes[source]
Represents Python types for each type of patient ID field.
- MPID
alias of
int
- MRID
alias of
str
- PID
alias of
int
- RID
alias of
str
- TRID
alias of
int
- class crate_anon.crateweb.research.research_db_info.ResearchDatabaseInfo[source]
Fetches schema information from the research databases. There can be several, and this class represents the whole collection.
Notes:
Class primarily exists to be able to use
@cached_property
.… replaced by
@lru_cache
… replaced by
@django_cache_function
- property dbs_with_secret_map: List[SingleResearchDatabase]
Which of our databases has an associated secret lookup database to patient IDs?
- Returns:
a list of
SingleResearchDatabase
objects
- property first_dbinfo: SingleResearchDatabase
Returns the first
SingleResearchDatabase
that we know about.
- property first_dbinfo_with_secret_map: SingleResearchDatabase | None
Returns the first
SingleResearchDatabase
that has a secret map (an associated secret lookup database to patient IDs), orNone
if there isn’t one.
- get_dbinfo_by_name(name: str) SingleResearchDatabase [source]
Returns the research database whose name (from
settings.RESEARCH_DB_INFO
) isname
.- Parameters:
name – the name of a database, as per
settings.RESEARCH_DB_INFO
- Returns:
- Raises:
ValueError –
- get_dbinfo_by_schema(schema_id: SchemaId) SingleResearchDatabase [source]
Returns the first database representing the specified schema.
- Parameters:
schema_id – a
crate_anon.common.sql.SchemaId
- Returns:
- Raises:
ValueError –
- get_default_database_name() str [source]
Returns the default “database name” for our dialect.
For Microsoft SQL Server, this is
settings.DATABASES['research']['NAME']
For MySQL, this is blank
For PostgreSQL, this is blank
- get_default_date_column(table: TableId) ColumnId | None [source]
Returns the default date column in the specified table (which may or may not exist).
- Parameters:
table – a
crate_anon.common.sql.TableId
- Returns:
a
crate_anon.common.sql.ColumnId
, which may be blank
- get_default_schema_name() str [source]
Returns the default “schema name” for our dialect.
For Microsoft SQL Server, this is
'dbo'
For PostgreSQL, this is
'public'
For MySQL, this is
settings.DATABASES['research']['NAME']
- get_excel() bytes [source]
Returns an XLSX (Excel) file detailing of all columns in all research databases.
- Returns:
binary XLSX file
- Return type:
bytes
- get_linked_mrid_column(table: TableId) ColumnId | None [source]
Returns either (a) the MRID column in the schema containing the table specified, or (b) one that can be linked to it automatically.
- Parameters:
table – a
crate_anon.common.sql.TableId
- Returns:
a
crate_anon.common.sql.ColumnId
, orNone
- get_mrid_column_from_schema(schema: SchemaId) ColumnId [source]
Returns the MRID column in the MRID master table for a given schema/database.
- Parameters:
schema – a
crate_anon.common.sql.SchemaId
- Returns:
a
crate_anon.common.sql.ColumnId
, which may be blank
- get_mrid_column_from_table(table: TableId) ColumnId [source]
Returns the MRID column in the specified table (which may or may not exist).
- Parameters:
table – a
crate_anon.common.sql.TableId
; this should be the MRID master table- Returns:
a
crate_anon.common.sql.ColumnId
, which may be blank
- get_rid_column(table: TableId) ColumnId [source]
Returns the RID column in the specified table (which may or may not exist).
- Parameters:
table – a
crate_anon.common.sql.TableId
- Returns:
a
crate_anon.common.sql.ColumnId
, which may be blank
- get_trid_column(table: TableId) ColumnId [source]
Returns the TRID column in the specified table (which may or may not exist).
- Parameters:
table – a
crate_anon.common.sql.TableId
- Returns:
a
crate_anon.common.sql.ColumnId
, which may be blank
- get_tsv() str [source]
Returns a tab-separated value (TSV) file detailing of all columns in all research databases.
- Returns:
TSV
- Return type:
str
- property researchdb_schemas: List[SchemaId]
Returns all
crate_anon.common.sql.SchemaId
values for our databases (one per database).
- property single_research_db: bool
Do we have only a single research database?
- property single_research_db_with_secret_map: bool
Do we have only one database that has an associated secret lookup database to patient IDs?
- table_contains(table_id: TableId, column_id: ColumnId) bool [source]
Does the specified table contain the specified column?
- Parameters:
table_id – a
crate_anon.common.sql.TableId
column_id – a
crate_anon.common.sql.ColumnId
- tables_containing_field(fieldname: str) List[TableId] [source]
Returns all tables containing the column (field) with the specified name.
- Parameters:
fieldname – field (column) name
- Returns:
a list of
crate_anon.common.sql.TableId
objects
We won’t use a
SELECT
onINFORMATION_SCHEMA
here, since we already have the information.
- text_columns(table_id: TableId, min_length: int = 1) List[ColumnInfo] [source]
Returns all text columns from the specified table.
- Parameters:
table_id – a
crate_anon.common.sql.TableId
min_length – the minimum SQL text length to include the column
- Returns:
a list of
crate_anon.common.sql.ColumnInfo
objects
- class crate_anon.crateweb.research.research_db_info.SingleResearchDatabase(index: int, grammar: SqlGrammar, rdb_info: ResearchDatabaseInfo, connection: BaseDatabaseWrapper, vendor: str)[source]
Represents, and adds information to, a single entry from the
RESEARCH_DB_INFO
list: that is, a research database. (It’s a list because it’s ordered.)- __init__(index: int, grammar: SqlGrammar, rdb_info: ResearchDatabaseInfo, connection: BaseDatabaseWrapper, vendor: str) None [source]
Instantiates, reading database information as follows:
the relevant dictionary is
settings.RESEARCH_DB_INFO[index]
the keys to that dictionary are in
crate_anon.crateweb.config.constants.ResearchDbInfoKeys
- Parameters:
index – Python zero-based index to
settings.RESEARCH_DB_INFO
grammar – a
cardinal_pythonlib.sql.sql_grammar.SqlGrammar
, for the SQL dialect we’re usingrdb_info – a
ResearchDatabaseInfo
, describing the structure of the research databaseconnection – a
django.db.backends.base.base.BaseDatabaseWrapper
, i.e. a Django database connectionvendor – the Django database vendor name; see e.g. https://docs.djangoproject.com/en/2.1/ref/models/options/
- can_communicate_directly(other: SingleResearchDatabase) bool [source]
Can this database “talk” (link, join) to another?
- Parameters:
other – the other
SingleResearchDatabase
- Returns:
True
if they are the same database or share a common RID type
- property colinfolist: List[ColumnInfo]
Returns a list of
ColumnInfo
objects for our research database.
- column_present(column_id: ColumnId) bool [source]
Is the named column present in this database?
- Parameters:
column_id – a
crate_anon.common.sql.ColumnId
- property eligible_for_query_builder: bool
Is this database eligible to participate in the CRATE assisted query builder?
This is
True
if it’s the first database, or if it can link directly to the first database (shares a common RID), or if it and the first database share a common MRID.
- get_default_date_field(table_id: TableId) ColumnId | None [source]
Gets the default date column for the specified table, or
None
if none exists.- Parameters:
table_id – a
crate_anon.common.sql.TableId
- Returns:
a
crate_anon.common.sql.ColumnId
, orNone
- get_schema_infodictlist(connection: BaseDatabaseWrapper, vendor: str, debug: bool = False) List[Dict[str, Any]] [source]
Fetch structure information for a specific database, by asking the database.
- Parameters:
connection – a
django.db.backends.base.base.BaseDatabaseWrapper
, i.e. a Django database connectionvendor – the Django database vendor name; see e.g. https://docs.djangoproject.com/en/2.1/ref/models/options/
debug – be verbose to the log?
- Returns:
A list of dictionaries, each mapping column names to values. The dictionaries are suitable for use as
**kwargs
toColumnInfo
.
- property has_mrid: bool
Does this database have a defined master research ID (MRID), i.e. a table with one row per patient and one column that is the MRID?
- property schema_identifier: str
Returns the SQL schema identifier.
- property schema_infodictlist: List[Dict[str, Any]]
Discovers the schema. Returns the results of
get_schema_infodictlist()
for our connection and vendor. Implements caching.
- property talks_to_world: bool
Does this database “talk to the world”, i.e. have an MRID?