14.1.24. crate_anon.anonymise.subset_db
crate_anon/anonymise/subset_db.py
Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.
Create a simple subset of a database.
- class crate_anon.anonymise.subset_db.DatabaseFilterSource(name: str, url: str, table: str, column: str, echo: bool = False)
- __init__(name: str, url: str, table: str, column: str, echo: bool = False) → None
- class crate_anon.anonymise.subset_db.SubsetConfig(src_db_url: str, dst_db_url: str, filter_column: str | None = None, filter_values: List[str] | None = None, filter_value_filenames: List[str] | None = None, filter_value_db_urls: List[str] | None = None, filter_value_tablecols: List[str] | None = None, include_rows_filtercol_null: bool = False, include_tables_without_filtercol: bool = True, include_tables: List[str] | None = None, include_table_filenames: List[str] | None = None, exclude_tables: List[str] | None = None, exclude_table_filenames: List[str] | None = None, echo: bool = False)
Simple configuration class for subsetting databases.
- __init__(src_db_url: str, dst_db_url: str, filter_column: str | None = None, filter_values: List[str] | None = None, filter_value_filenames: List[str] | None = None, filter_value_db_urls: List[str] | None = None, filter_value_tablecols: List[str] | None = None, include_rows_filtercol_null: bool = False, include_tables_without_filtercol: bool = True, include_tables: List[str] | None = None, include_table_filenames: List[str] | None = None, exclude_tables: List[str] | None = None, exclude_table_filenames: List[str] | None = None, echo: bool = False) → None
- Parameters:
src_db_url – SQLAlchemy URL for the source database.
dst_db_url – SQLAlchemy URL for the destination database.
filter_column – Name of the column to filter on (e.g. “patient_id”). If blank, no column filter applies, so everything might be copied.
filter_values – Values, treated as strings, to accept.
filter_value_filenames – Filename(s), containing values, treated as strings, to accept.
include_rows_filtercol_null – Allow the filter column to be NULL as well?
include_tables_without_filtercol – Include tables that don’t possess the filter column (e.g. system/lookup tables)?
include_tables – Specific named tables to include.
include_table_filenames – Filename(s), containing specific named tables to include.
exclude_tables – Specific named tables to exclude.
exclude_table_filenames – Filename(s), containing specific named tables to exclude.
echo – Echo SQL (debugging only)?
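As an illustration of how the parameters above fit together, the following is a minimal sketch that assembles keyword arguments for SubsetConfig and checks them against the documented parameter names. All values (database URLs, patient IDs, the "audit_log" table) are hypothetical, and the sketch deliberately does not import CRATE, so it only demonstrates the shape of a configuration.

```python
# Keyword names taken from the SubsetConfig signature documented above.
DOCUMENTED_PARAMS = {
    "src_db_url", "dst_db_url", "filter_column", "filter_values",
    "filter_value_filenames", "filter_value_db_urls",
    "filter_value_tablecols", "include_rows_filtercol_null",
    "include_tables_without_filtercol", "include_tables",
    "include_table_filenames", "exclude_tables",
    "exclude_table_filenames", "echo",
}

# Hypothetical values, for illustration only.
subset_kwargs = {
    "src_db_url": "sqlite:///source.db",
    "dst_db_url": "sqlite:///subset.db",
    "filter_column": "patient_id",
    "filter_values": ["1001", "1002"],  # values are treated as strings
    "include_rows_filtercol_null": False,
    "include_tables_without_filtercol": True,  # e.g. keep lookup tables
    "exclude_tables": ["audit_log"],
    "echo": False,
}

# Catch misspelled keyword names before they reach the constructor.
unknown = set(subset_kwargs) - DOCUMENTED_PARAMS
if unknown:
    raise ValueError(f"Unknown SubsetConfig parameters: {sorted(unknown)}")

# With CRATE installed, the configuration could then be built as:
#     from crate_anon.anonymise.subset_db import SubsetConfig
#     cfg = SubsetConfig(**subset_kwargs)
```

Parameters left out of the dictionary fall back to the defaults shown in the signature.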
- permit_table_name(table_name: str) → bool
Should this table be permitted (judging only by its name)?
- property safe_dst_db_url: str
Password-obscured version of the destination database URL.
- property safe_src_db_url: str
Password-obscured version of the source database URL.
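To show what a password-obscured URL looks like, here is a minimal standard-library sketch. It is not CRATE's implementation: the helper name obscure_password, the "***" mask, and the example URL are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def obscure_password(url: str, mask: str = "***") -> str:
    """Return the URL with any password replaced by a mask (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.password is None:
        # Nothing to hide (e.g. a file-based SQLite URL); return unchanged.
        return url
    # netloc has the form "user:password@host[:port]".
    userinfo, _, hostinfo = parts.netloc.rpartition("@")
    username = userinfo.split(":", 1)[0]
    netloc = f"{username}:{mask}@{hostinfo}"
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical URL, for illustration only.
safe = obscure_password("mysql+mysqldb://alice:s3cret@localhost:3306/research")
print(safe)
```

A masked URL of this kind is suitable for log messages, where the real URL would leak credentials.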
- class crate_anon.anonymise.subset_db.Subsetter(cfg: SubsetConfig)
Class to take a subset of data from one database to another.
- __init__(cfg: SubsetConfig) → None
- contains_filter_col(table_name: str) → bool
Does this table contain our target filter column?
- drop_dst_table_if_exists(table_name: str) → None
Drop a table on the destination side. Also remove it from the destination metadata, so we can recreate it (if necessary) without complaint.
- dst_sqla_table(table_name: str) → Table
Returns the SQLAlchemy Table from the destination database.
- gen_filtered_rows(table_name: str) → Generator[Row, None, None]
Generate filtered source rows from the database.
- gen_src_rows(table_name: str) → Generator[Row, None, None]
Generate unfiltered source rows from the database.
- permit_table(table_name: str) → bool
Is this table name permitted to go through to the destination?
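The row-filtering idea behind the Subsetter methods can be sketched with the standard-library sqlite3 module. This is an illustration under stated assumptions, not CRATE's code (which works through SQLAlchemy): the helper names sketch_has_column and sketch_filtered_rows, the "note" table, and its contents are all hypothetical.

```python
import sqlite3
from typing import Iterator, List, Set


def column_names(conn: sqlite3.Connection, table: str) -> List[str]:
    """List a table's column names via SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def sketch_has_column(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Does the table contain the filter column?"""
    return column in column_names(conn, table)


def sketch_filtered_rows(
    conn: sqlite3.Connection,
    table: str,
    filter_column: str,
    accepted: Set[str],
    include_null: bool = False,
) -> Iterator[tuple]:
    """Yield rows whose filter column, compared as a string, is accepted."""
    idx = column_names(conn, table).index(filter_column)
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY rowid'):
        value = row[idx]
        if value is None:
            if include_null:
                yield row
        elif str(value) in accepted:
            yield row


# Hypothetical source data, for illustration only.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE note (patient_id INTEGER, text TEXT)")
src.executemany(
    "INSERT INTO note VALUES (?, ?)",
    [(1, "a"), (2, "b"), (None, "c")],
)

strict = list(sketch_filtered_rows(src, "note", "patient_id", {"1"}))
with_nulls = list(
    sketch_filtered_rows(src, "note", "patient_id", {"1"}, include_null=True)
)
```

Comparing values as strings mirrors the documented behaviour of filter_values, and the include_null flag plays the role of include_rows_filtercol_null.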