14.5.33. crate_anon.nlp_manager.regex_read_codes

crate_anon/nlp_manager/regex_read_codes.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Regular expressions to detect some Read codes (CTV3).

See https://en.wikipedia.org/wiki/Read_code.

class crate_anon.nlp_manager.regex_read_codes.ReadCode(read_code: str, phrases: List[str] | None = None)

Represents how a quantity is encoded as a Read code: the code itself plus the phrases that may accompany it.

NOTE: Read codes are case-sensitive. (See https://www.gp-training.net/it/read-codes/.)

It would be desirable to mark the Read code as case-sensitive, within a regex that is case-insensitive overall. Apparently Tcl supports this via the (?c) flag: https://www.regular-expressions.info/modifiers.html.

However, others just support the “locally case-insensitive” flag, (?i).

Python (via the regex package) fails to parse the test regex (?i)te(?-i)st, taken from https://www.regular-expressions.info/modifiers.html. It raises the error regex._regex_core.error: bad inline flags: cannot turn flags off at position 11. Neither https://pypi.org/project/regex/ nor https://docs.python.org/3/library/re.html indicates that this syntax is supported.

Since we absolutely want case-insensitive matching for the most part, I think we’ll live with this limitation.
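As a side note, and not something this module is stated to do: since Python 3.6 the standard re module documents scoped inline flag groups of the form (?-i:...), which switch a flag off for one group only. The sketch below shows how such a group could keep a code case-sensitive inside an otherwise case-insensitive pattern. The phrase and the code "XaBcD" are invented for illustration and are not taken from CRATE.

```python
import re

# Hypothetical phrase and Read code, invented purely for illustration.
PHRASE = "example phrase"
CODE = "XaBcD"

# The unscoped form (switch the flag off for the rest of the pattern)
# is rejected by the standard re module too.
try:
    re.compile(r"(?i)te(?-i)st")
    unscoped_compiles = True
except re.error:
    unscoped_compiles = False

# Scoped form: the pattern as a whole is case-insensitive, but the group
# (?-i:...) matches the code with exact case only.
scoped = re.compile(
    rf"{re.escape(PHRASE)}\s*\((?-i:{re.escape(CODE)})\)",
    re.IGNORECASE,
)

matches_exact_case = bool(scoped.search("Example Phrase (XaBcD)"))
matches_wrong_case = bool(scoped.search("Example Phrase (xabcd)"))
```

Here the phrase matches regardless of case, while the code in parentheses must match exactly. Whether this would suit the regexes built by this module is an open question; the documented behaviour remains fully case-insensitive matching.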

__init__(read_code: str, phrases: List[str] | None = None) → None
Parameters:
  • read_code – The Read (CTV3) code, a string of length 5.

  • phrases – The associated possible phrases.

component_regex_strings() → List[str]

A list of regular expression strings representing this quantity.

Provides regexes for:

    phrase (readcode)
    phrase

regex_str() → str

A single composite regex string representing this quantity.
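
The relationship between the component regexes and the composite regex can be illustrated with a minimal sketch. This is not CRATE's implementation: the class ReadCodeSketch, the exact regex text it builds, and the code "XaBcD" are assumptions made for illustration. Only the method names and the two forms, "phrase (readcode)" and "phrase", come from the documentation above.

```python
import re
from typing import List, Optional


class ReadCodeSketch:
    """Hypothetical stand-in for ReadCode; not CRATE's actual code."""

    def __init__(self, read_code: str,
                 phrases: Optional[List[str]] = None) -> None:
        # The documentation describes the code as a string of length 5.
        assert len(read_code) == 5
        self.read_code = read_code
        self.phrases = phrases or []

    def component_regex_strings(self) -> List[str]:
        code = re.escape(self.read_code)
        components = []  # type: List[str]
        for phrase in self.phrases:
            p = re.escape(phrase)
            # "phrase (readcode)" goes first so that the longer form is
            # tried before the bare phrase in an alternation.
            components.append(rf"{p}\s*\(\s*{code}\s*\)")
            components.append(p)
        return components

    def regex_str(self) -> str:
        # Join the components into one non-capturing alternation.
        return "(?:" + "|".join(self.component_regex_strings()) + ")"


rc = ReadCodeSketch("XaBcD", ["example phrase"])
rx = re.compile(rc.regex_str(), re.IGNORECASE)
with_code = rx.search("Result: Example Phrase (XaBcD) 5.2")
without_code = rx.search("example phrase 5.2")
```

In this sketch the first search matches the phrase together with its bracketed code, and the second matches the bare phrase. Because the pattern is compiled with re.IGNORECASE, the code is also matched case-insensitively, which mirrors the limitation described in the class notes.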

class crate_anon.nlp_manager.regex_read_codes.ReadCodes

Some known Read codes.

From v3ReadCode_PBCL.xlsx.

crate_anon.nlp_manager.regex_read_codes.any_read_code_of(*read_codes: ReadCode) → str

Returns a regex allowing any of the specified Read codes.

crate_anon.nlp_manager.regex_read_codes.regex_components_from_read_codes(*read_codes: ReadCode) → List[str]

Returns all components from the specified Read code objects.
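
One plausible way to think about the two module-level functions is as "flatten" and "alternate" steps over several Read code objects. The sketch below assumes exactly that; ReadCodeStub, components_from and any_of are hypothetical names, the codes "AAAAA" and "BBBBB" are invented, and the real functions may build their regexes differently.

```python
import re
from typing import List


class ReadCodeStub:
    """Hypothetical minimal object exposing component_regex_strings()."""

    def __init__(self, read_code: str, phrases: List[str]) -> None:
        self.read_code = read_code
        self.phrases = phrases

    def component_regex_strings(self) -> List[str]:
        code = re.escape(self.read_code)
        out = []  # type: List[str]
        for p in map(re.escape, self.phrases):
            # "phrase (readcode)" first, then the bare phrase.
            out += [rf"{p}\s*\({code}\)", p]
        return out


def components_from(*read_codes: ReadCodeStub) -> List[str]:
    # Flatten the component regexes of every object into one list.
    components = []  # type: List[str]
    for rc in read_codes:
        components.extend(rc.component_regex_strings())
    return components


def any_of(*read_codes: ReadCodeStub) -> str:
    # A single alternation that accepts any component of any object.
    return "(?:" + "|".join(components_from(*read_codes)) + ")"


alpha = ReadCodeStub("AAAAA", ["alpha"])
beta = ReadCodeStub("BBBBB", ["beta"])
combined = re.compile(any_of(alpha, beta), re.IGNORECASE)
```

With these stubs, combined matches text such as "Beta (BBBBB)" or "ALPHA" and rejects text that contains neither phrase.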