12.5.15. crate_anon.nlp_manager.input_field_config
crate_anon/nlp_manager/input_field_config.py
Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.
Class to define input fields for NLP.
- class crate_anon.nlp_manager.input_field_config.InputFieldConfig(nlpdef: NlpDefinition, cfg_input_name: str)[source]
Class defining an input field for NLP (containing text).
See the documentation for the NLP config file.
- __init__(nlpdef: NlpDefinition, cfg_input_name: str) None[source]
Read config from a configparser section, and also associate with a specific NLP definition.
- Parameters:
nlpdef – crate_anon.nlp_manager.nlp_definition.NlpDefinition, the master NLP definition, referring to the master config file etc.
cfg_input_name – config section name for the input field definition
- delete_all_progress_records() None[source]
Deletes all records from the progress database for this NLP definition (across all source tables/columns).
- delete_progress_records_where_srcpk_not(temptable: Table | None) None[source]
If temptable is None, deletes all progress records for this input field/NLP definition.
If temptable is a table, deletes records from the progress database (from this input field/NLP definition) whose source PK is not in the temporary table. (Used for deleting NLP records when the source has subsequently been deleted.)
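The two branches above can be illustrated with a minimal, self-contained sketch. It uses the standard-library sqlite3 module rather than SQLAlchemy, and the table names (progress, live_pks), the column name (srcpkval), and the function name are hypothetical stand-ins chosen for illustration; they are not CRATE's actual schema or implementation.

```python
import sqlite3
from typing import List, Optional


def delete_progress_where_srcpk_not(
    conn: sqlite3.Connection, temptable: Optional[str]
) -> None:
    """Hypothetical stand-in for the method described above."""
    if temptable is None:
        # No temporary table: remove every progress record.
        conn.execute("DELETE FROM progress")
    else:
        # Remove progress records whose source PK is absent from temptable.
        # The table name is interpolated for brevity; it is a trusted,
        # hard-coded identifier in this sketch.
        conn.execute(
            f'DELETE FROM progress WHERE srcpkval NOT IN '
            f'(SELECT srcpkval FROM "{temptable}")'
        )
    conn.commit()


def remaining_pks(conn: sqlite3.Connection) -> List[int]:
    """Return the source PKs still present in the progress table."""
    cursor = conn.execute("SELECT srcpkval FROM progress ORDER BY srcpkval")
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE progress (srcpkval INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO progress VALUES (?)", [(1,), (2,), (3,)])

# Source PKs that still exist in the source table (record 2 was deleted).
conn.execute("CREATE TEMP TABLE live_pks (srcpkval INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO live_pks VALUES (?)", [(1,), (3,)])

delete_progress_where_srcpk_not(conn, "live_pks")
after_partial = remaining_pks(conn)

delete_progress_where_srcpk_not(conn, None)
after_full = remaining_pks(conn)
```

After the first call only the records for source PKs 1 and 3 remain; after the second call the progress table is empty.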
- gen_src_pks() Generator[Tuple[int, str | None], None, None][source]
Generate integer PKs from the source table.
For tables with an integer PK, yields tuples: pk_value, None.
For tables with a string PK, yields tuples: pk_hash, pk_value.
Timing is subsumed under the timer named TIMING_DELETE_WHERE_NO_SOURCE.
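The shape of the yielded tuples can be sketched as follows. This is an illustration only: the function names are hypothetical, and the hash shown (the first 8 bytes of SHA-256, read as a signed 64-bit integer) is an assumption made so the example runs with the standard library. It is not claimed to be the hash CRATE uses.

```python
import hashlib
from typing import Generator, Iterable, Optional, Tuple, Union


def hash_to_int64(value: str) -> int:
    """Hypothetical hash: map a string to a signed 64-bit integer."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


def gen_src_pks_sketch(
    pks: Iterable[Union[int, str]]
) -> Generator[Tuple[int, Optional[str]], None, None]:
    """Yield (integer, optional string) pairs in the documented shape."""
    for pk in pks:
        if isinstance(pk, int):
            # Integer PK: the value itself, with no string component.
            yield pk, None
        else:
            # String PK: an integer hash plus the original string.
            yield hash_to_int64(pk), pk


int_results = list(gen_src_pks_sketch([10, 11]))
str_results = list(gen_src_pks_sketch(["A123", "B456"]))
```

In both cases the first element is an integer, so a consumer can index on it uniformly. The string is carried alongside so that hash collisions can be disambiguated.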
- gen_text(tasknum: int = 0, ntasks: int = 1) Generator[Tuple[str, Dict[str, Any]], None, None][source]
Generate text strings from the source database, for NLP. Text fields that are NULL, empty, or contain only whitespace, are skipped.
- Yields:
tuple – text, dict, where text is the source text and dict is a column-to-value mapping for all other fields (source reference fields, copy fields).
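A minimal sketch of the behaviour described above, working on in-memory rows rather than a database. The function name, the row layout, and the field names are hypothetical. The way work is divided between tasks (source PK modulo ntasks) is an assumption made for illustration; the source only states that tasknum and ntasks are accepted.

```python
from typing import Any, Dict, Generator, Iterable, Tuple


def gen_text_sketch(
    rows: Iterable[Dict[str, Any]],
    textfield: str,
    pkfield: str,
    tasknum: int = 0,
    ntasks: int = 1,
) -> Generator[Tuple[str, Dict[str, Any]], None, None]:
    """Yield (text, other_fields) pairs, skipping blank text."""
    for row in rows:
        # Assumed partitioning: each task handles PKs in its own residue class.
        if ntasks > 1 and row[pkfield] % ntasks != tasknum:
            continue
        text = row[textfield]
        # Skip NULL, empty, and whitespace-only text.
        if text is None or not text.strip():
            continue
        other_fields = {k: v for k, v in row.items() if k != textfield}
        yield text, other_fields


rows = [
    {"pk": 1, "note": "Hello"},
    {"pk": 2, "note": None},
    {"pk": 3, "note": "   "},
    {"pk": 4, "note": "CRP 5 mg/L"},
    {"pk": 5, "note": ""},
]

all_text = list(gen_text_sketch(rows, "note", "pk"))
task0 = list(gen_text_sketch(rows, "note", "pk", tasknum=0, ntasks=2))
task1 = list(gen_text_sketch(rows, "note", "pk", tasknum=1, ntasks=2))
```

Under this assumed scheme, the tasks cover disjoint sets of rows, and together they yield the same items as a single-task run.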
- get_copy_columns() List[Column][source]
Returns the columns that the user has requested to be copied from the source table to the NLP destination table. The columns are ordered by name so that lists of columns from several tables can be compared.
- Returns:
a list of SQLAlchemy Column objects, ordered by name.
- get_copy_indexes() List[Index][source]
Returns indexes that should be made in the destination table for columns that the user has requested to be copied from the source. The indexes are ordered by name so that lists of indexes from several tables can be compared.
- Returns:
a list of SQLAlchemy Index objects, ordered by name.
- static get_core_columns_for_dest() List[Column][source]
Returns the columns used in NLP destination tables, primarily describing the source. See Standard NLP output columns.
- Returns:
a list of SQLAlchemy Column objects
- static get_core_indexes_for_dest(tablename: str, engine: Engine) List[Index][source]
Returns the core indexes to be applied to the destination tables. Primarily, these are for columns that refer to the source.
- Parameters:
tablename – The name of the table to be used in the destination.
engine – The destination database SQLAlchemy Engine.
- Returns:
a list of SQLAlchemy Index objects
See https://stackoverflow.com/questions/179085/multiple-indexes-vs-multi-column-indexes
- get_progress_record(srcpkval: int, srcpkstr: str | None = None) NlpRecord | None[source]
Fetch a progress record for the given source record, if one exists.
- Returns:
an NlpRecord, or None if no progress record exists for the given source record
- property source_session: Session
Returns the SQLAlchemy ORM Session for the source database.
- property srcdatetimefield: str
Returns the name of the field (column) in the source table that defines the date/time of the source text.
- property srcdb: str
Returns the name of the source database.
- property srcfield: str
Returns the name of the text field (column) in the source table.
- property srcpkfield: str
Returns the name of the primary key (PK) field (column) in the source table.
- property srctable: str
Returns the name of the source table.