14.5.10. crate_anon.nlp_manager.cloud_parser

crate_anon/nlp_manager/cloud_parser.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Send text to a cloud-based NLPRP server for processing.

Todo

cloud_parser: handle new tabular_schema info from server

class crate_anon.nlp_manager.cloud_parser.Cloud(nlpdef: NlpDefinition | None, cfg_processor_name: str | None, commit: bool = False)

EXTERNAL.

Abstract NLP processor that passes information to a remote (cloud-based) NLP system via the NLPRP protocol. The processor at the other end might be of any kind.

__init__(nlpdef: NlpDefinition | None, cfg_processor_name: str | None, commit: bool = False) → None
Parameters:
  • nlpdef – crate_anon.nlp_manager.nlp_definition.NlpDefinition

  • cfg_processor_name – the config section for the processor

  • commit – force a COMMIT whenever we insert data? You should specify this in multiprocess mode, or you may get database deadlocks.

dest_tables_columns() → Dict[str, List[Column]]

Describes the destination table(s) that this NLP processor wants to write to.

Returns:

a dictionary of {tablename: destination_columns}, where destination_columns is a list of SQLAlchemy Column objects.

Return type:

dict

dest_tables_indexes() → Dict[str, List[Index]]

Describes indexes that this NLP processor suggests for its destination table(s).

Returns:

a dictionary of {tablename: indexes}, where indexes is a list of SQLAlchemy Index objects.

Return type:

dict

static get_coltype_parts(coltype_str: str) → List[str]

Get the root column type and its parameter. For example, for VARCHAR(50) the root column type is VARCHAR and the parameter is 50.
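
The split described above can be illustrated with a minimal sketch. This is not CRATE's implementation: the function name split_coltype, the regular expression, and the choice to omit the parameter when none is given are all assumptions made for illustration.

```python
import re
from typing import List


def split_coltype(coltype_str: str) -> List[str]:
    """
    Hypothetical helper (not part of CRATE): split a column type string
    such as "VARCHAR(50)" into its root type and, if present, its parameter.
    """
    # Root type: a run of word characters. Parameter: whatever sits between
    # the optional parentheses, with surrounding whitespace trimmed.
    match = re.fullmatch(
        r"\s*(\w+)\s*(?:\(\s*([^)]*?)\s*\))?\s*", coltype_str
    )
    if match is None:
        raise ValueError(f"Unrecognised column type: {coltype_str!r}")
    root, param = match.group(1), match.group(2)
    # Assumption: a missing or empty parameter is simply left out of the list.
    if not param:
        return [root]
    return [root, param]


if __name__ == "__main__":
    print(split_coltype("VARCHAR(50)"))
    print(split_coltype("TEXT"))
```

Returning the parameter as a string keeps the sketch agnostic about whether it is a length, a precision/scale pair, or something else; the caller decides how to interpret it.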

is_tabular() → bool

Is the format of the schema information given by the remote processor tabular?

set_procinfo_if_correct(remote_processor: ServerProcessor) → None

Checks whether a processor dictionary, containing all the information that the NLPRP specifies for a processor, belongs to this processor. If it does, we add the information from the processor dictionary.
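
The "match, then adopt" pattern described above, together with the tabular check from is_tabular(), might look roughly like the following sketch. Everything here is hypothetical: the classes RemoteInfo and LocalProcessor, their attribute names, the rule of matching on name plus optional version, and the "tabular" schema type value are assumptions for illustration, not CRATE's actual classes or logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteInfo:
    """Hypothetical stand-in for the server's description of one processor."""
    name: str
    version: str
    schema_type: str = "unknown"  # assumed values: "unknown" or "tabular"


@dataclass
class LocalProcessor:
    """Hypothetical stand-in for the local (client-side) processor object."""
    wanted_name: str
    wanted_version: Optional[str] = None  # None: accept any version (assumption)
    remote_info: Optional[RemoteInfo] = None

    def set_info_if_match(self, remote: RemoteInfo) -> None:
        # Ignore descriptions of other processors.
        if remote.name != self.wanted_name:
            return
        # If a specific version was requested, it must match too.
        if self.wanted_version is not None and remote.version != self.wanted_version:
            return
        # The description belongs to this processor: adopt it.
        self.remote_info = remote

    def is_tabular(self) -> bool:
        # Only meaningful once matching server information has been stored.
        return (
            self.remote_info is not None
            and self.remote_info.schema_type == "tabular"
        )


if __name__ == "__main__":
    proc = LocalProcessor(wanted_name="example_proc", wanted_version="1.0")
    for info in [
        RemoteInfo("other_proc", "1.0", "tabular"),
        RemoteInfo("example_proc", "1.0", "tabular"),
    ]:
        proc.set_info_if_match(info)
    print(proc.is_tabular())
```

In this sketch a caller would loop over every processor description returned by the server and offer each one to the local processor, which silently ignores those that do not match.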

static str_to_coltype_general(coltype_str: str) → Type[TypeEngine]

Get the SQLAlchemy column type class that fits the given column type string.