14.5.28. crate_anon.nlp_manager.parse_substance_misuse

crate_anon/nlp_manager/parse_substance_misuse.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Python regex-based NLP processors for substance misuse.

class crate_anon.nlp_manager.parse_substance_misuse.AlcoholUnits(nlpdef: Optional[crate_anon.nlp_manager.nlp_definition.NlpDefinition], cfg_processor_name: Optional[str], commit: bool = False)

SUBSTANCE MISUSE.

Alcohol consumption, specified explicitly as (UK) units per day or per week, or via non-numeric references to not drinking any.

  • Output is in UK units per week. A UK unit is 10 ml of ethanol [1] [2]. UK NHS guidelines used to be “per week” and remain broadly week-based [1].

  • The processor does not attempt to interpret other alcohol descriptions (e.g. “pints of beer”, “glasses of wine”, “bottles of vodka”), so it is expected to apply where a clinician has already converted a (potentially mixed) alcohol description to a units-per-week calculation.

[1] https://www.nhs.uk/live-well/alcohol-advice/calculating-alcohol-units/, accessed 2023-01-18.

[2] https://en.wikipedia.org/wiki/Unit_of_alcohol
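As a rough illustration of the normalization described above, the following sketch converts an explicit "units per day" or "units per week" phrase to UK units per week. The regex, the constant and the function name `units_per_week` are hypothetical and much simpler than whatever CRATE actually uses; treat this as a sketch under those assumptions, not as the library's implementation.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration only; not CRATE's actual regex.
UNITS_RE = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*units?\s*(?:per|a|/)\s*(?P<period>day|week)",
    re.IGNORECASE,
)

DAYS_PER_WEEK = 7


def units_per_week(text: str) -> Optional[float]:
    """Return the first explicit alcohol quantity in text, in units per week."""
    m = UNITS_RE.search(text)
    if m is None:
        # Descriptions such as "pints of beer" are deliberately not handled.
        return None
    value = float(m.group("value"))
    if m.group("period").lower() == "day":
        value *= DAYS_PER_WEEK  # convert daily intake to weekly
    return value
```

With this sketch, a daily figure is multiplied by seven, a weekly figure passes through unchanged, and text without an explicit units expression yields `None`.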

__init__(nlpdef: Optional[crate_anon.nlp_manager.nlp_definition.NlpDefinition], cfg_processor_name: Optional[str], commit: bool = False) → None

Init function for NumericalResultParser.

Parameters
  • nlpdef – A crate_anon.nlp_manager.nlp_definition.NlpDefinition.

  • cfg_processor_name – Config section name in the NLP config file.

  • variable – Used by subclasses as the record value for variable_name.

  • target_unit – Fieldname used for the primary output quantity.

  • regex_str_for_debugging – String form of regex, for debugging.

  • commit – Force a COMMIT whenever we insert data? You should specify this in multiprocess mode, or you may get database deadlocks.

Subclasses will extend this method.

parse(text: str, debug: bool = False) → Generator[Tuple[str, Dict[str, Any]], None, None]

Parse for two regexes which operate slightly differently.
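One way to combine two differently behaving regex parsers behind a single generator is to delegate to each in turn. The sketch below assumes that approach; the helper names, the table name, the dictionary keys and the order of delegation are all hypothetical, and the toy regexes stand in for the real ones.

```python
import re
from typing import Any, Dict, Generator, Tuple

Row = Tuple[str, Dict[str, Any]]  # (table name, column-to-value mapping)
TABLE = "alcohol_units_sketch"  # hypothetical table name


def parse_none_sketch(text: str) -> Generator[Row, None, None]:
    """Toy stand-in: non-numeric references to not drinking."""
    for m in re.finditer(r"\bteetotal\b", text, re.IGNORECASE):
        yield TABLE, {"matched": m.group(0), "units_per_week": 0.0}


def parse_units_sketch(text: str) -> Generator[Row, None, None]:
    """Toy stand-in: explicit numeric units per week."""
    for m in re.finditer(r"(\d+)\s*units\s+per\s+week", text, re.IGNORECASE):
        yield TABLE, {"matched": m.group(0),
                      "units_per_week": float(m.group(1))}


def parse_sketch(text: str) -> Generator[Row, None, None]:
    """Yield rows from both sub-parsers (the order here is an assumption)."""
    yield from parse_none_sketch(text)
    yield from parse_units_sketch(text)
```

Because each sub-parser is itself a generator with the same return type, the caller sees one stream of (table, row) pairs and does not need to know which regex produced a given row.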

parse_alcohol_none(text: str, debug: bool = False) → Generator[Tuple[str, Dict[str, Any]], None, None]

Deal with references to not drinking any alcohol (except those referred to as e.g. “0 units per week”, which will be picked up by the units-per-week function – that will be rare!).
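A minimal sketch of how non-numeric references to abstinence might be detected follows. The phrase list is an assumption chosen for illustration and is certainly narrower than clinical language in practice; `abstinence_mentions` is a hypothetical helper, not a CRATE function.

```python
import re
from typing import List

# Hypothetical phrase list; real clinical text will need more variants.
NONE_RE = re.compile(
    r"\b(?:teetotal(?:ler)?|abstinent|non-?drinker|"
    r"(?:does\s+not|doesn't|never)\s+drinks?(?:\s+(?:any\s+)?alcohol)?)\b",
    re.IGNORECASE,
)


def abstinence_mentions(text: str) -> List[str]:
    """Return each phrase in text that suggests the person drinks no alcohol."""
    return [m.group(0) for m in NONE_RE.finditer(text)]
```

Each phrase found this way would correspond to a value of zero units per week. Explicit numeric statements such as "0 units per week" are left to the numeric parser, as the description above notes.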

parse_alcohol_units(text: str, debug: bool = False) → Generator[Tuple[str, Dict[str, Any]], None, None]

We amend SimpleNumericalResultParser.parse() to deal with tense a bit better (e.g. “used to drink”). Comments from that version are not repeated here. This version is also shortened a bit, since we guarantee some aspects of the flags.
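To illustrate what tense-aware matching could look like, the sketch below captures an optional verb phrase in front of the quantity and labels the result as past or present. The verb lists, group names and the function `units_with_tense` are assumptions made for this example; the actual method builds on SimpleNumericalResultParser and will differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: an optional tense-bearing verb, then a quantity.
TENSE_UNITS_RE = re.compile(
    r"(?:(?:(?P<past>used\s+to\s+drink|drank|was\s+drinking)"
    r"|(?P<present>drinks|is\s+drinking))"
    r"\s+(?:about\s+)?)?"
    r"(?P<value>\d+(?:\.\d+)?)\s*units?\s*(?:per|a|/)\s*(?P<period>day|week)",
    re.IGNORECASE,
)


def units_with_tense(text: str) -> Optional[Tuple[float, Optional[str]]]:
    """Return (units per week, tense) for the first match, or None."""
    m = TENSE_UNITS_RE.search(text)
    if m is None:
        return None
    value = float(m.group("value"))
    if m.group("period").lower() == "day":
        value *= 7  # normalize daily intake to weekly
    if m.group("past"):
        tense: Optional[str] = "past"
    elif m.group("present"):
        tense = "present"
    else:
        tense = None  # no recognized verb before the quantity
    return value, tense
```

Recording the tense alongside the value lets downstream queries separate historical consumption ("used to drink 40 units per week") from current consumption.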

test(verbose: bool = False) → None

Performs a self-test on the NLP processor.

Parameters
  • verbose – Be verbose?

This is an abstract method that is subclassed.
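A self-test of this kind typically runs the parser over known inputs and compares the results with expected values. The following is a generic sketch of that pattern; `run_self_test` and `toy_parser` are hypothetical names and do not reflect how CRATE structures its own tests.

```python
from typing import Callable, List, Optional, Tuple


def toy_parser(text: str) -> Optional[float]:
    """Hypothetical parser: 0.0 units per week if 'teetotal' appears."""
    return 0.0 if "teetotal" in text.lower() else None


def run_self_test(
    parser: Callable[[str], Optional[float]],
    cases: List[Tuple[str, Optional[float]]],
    verbose: bool = False,
) -> None:
    """Raise AssertionError if any case does not give its expected value."""
    for text, expected in cases:
        actual = parser(text)
        if verbose:
            print(f"{text!r} -> {actual!r} (expected {expected!r})")
        if actual != expected:
            raise AssertionError(
                f"{text!r}: expected {expected!r}, got {actual!r}"
            )
```

A call such as `run_self_test(toy_parser, [("Teetotal.", 0.0), ("Drinks wine.", None)])` returns silently when every case passes.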

class crate_anon.nlp_manager.parse_substance_misuse.AlcoholUnitsValidator(nlpdef: Optional[crate_anon.nlp_manager.nlp_definition.NlpDefinition], cfg_processor_name: Optional[str], commit: bool = False)

Validator for AlcoholUnits (see help for explanation).