14.5.30. crate_anon.nlp_manager.regex_func
crate_anon/nlp_manager/regex_func.py
Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.
Functions to assist in building regular expressions.
2019-01-01: RM notes Ragel (https://en.wikipedia.org/wiki/Ragel) for embedding actions within a regex parser. Not immediately applicable here, I don’t think, but bear in mind.
- crate_anon.nlp_manager.regex_func.compile_regex(regex_str: str) → Pattern
Compiles a regular expression with our standard flags.
- crate_anon.nlp_manager.regex_func.compile_regex_dict(regexstr_to_value_dict: Dict[str, Any]) → Dict[Pattern, Any]
Converts a dictionary {regex_str: value} to a dictionary {compiled_regex: value}.
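The conversion described above can be sketched as follows. This is an illustrative re-implementation, not CRATE's source: it uses the standard-library re module, and ASSUMED_FLAGS and compile_regex_dict_sketch are hypothetical names. The flags in particular only stand in for the "standard flags", which this page does not list.

```python
import re
from typing import Any, Dict

# Assumption: placeholder for CRATE's "standard flags", which are not
# documented on this page.
ASSUMED_FLAGS = re.IGNORECASE | re.MULTILINE


def compile_regex_dict_sketch(
    regexstr_to_value_dict: Dict[str, Any]
) -> Dict[re.Pattern, Any]:
    """Build {compiled_regex: value} from {regex_str: value}."""
    return {
        re.compile(regex_str, ASSUMED_FLAGS): value
        for regex_str, value in regexstr_to_value_dict.items()
    }


# Hypothetical example data: unit abbreviations mapped to full names.
compiled = compile_regex_dict_sketch({r"mg": "milligram", r"kg": "kilogram"})
compiled_values = list(compiled.values())
```

Compiling once up front means the patterns can be reused for many texts without recompiling on each lookup.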
- crate_anon.nlp_manager.regex_func.get_regex_dict_match(text: str | None, regex_to_value_dict: Dict[Pattern, Any], default: Any | None = None) → Tuple[bool, Any]
Checks text against a set of regular expressions. Returns whether there is a match, and if there was a match, the value that was associated (in the dictionary) with the matching regex.
(Note: “match”, as usual, means “match at the beginning of the string”.)
- Parameters:
text – text to test
regex_to_value_dict – dictionary mapping {compiled_regex: value}
default – value to return if there is no match
- Returns:
matched, associated_value_or_default
- Return type:
tuple
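A minimal sketch of the behaviour described above, assuming that empty or None text counts as "no match" and that the first matching regex in dictionary order wins. Neither detail is stated on this page. regex_dict_match_sketch is a hypothetical stand-in written with the standard-library re module, not CRATE's implementation.

```python
import re
from typing import Any, Dict, Optional, Tuple


def regex_dict_match_sketch(
    text: Optional[str],
    regex_to_value_dict: Dict[re.Pattern, Any],
    default: Any = None,
) -> Tuple[bool, Any]:
    """Return (matched, associated_value_or_default)."""
    # Assumption: None or empty text is treated as "no match".
    if text:
        for regex, value in regex_to_value_dict.items():
            # Pattern.match() only succeeds at the start of the string.
            if regex.match(text):
                return True, value
    return False, default


# Hypothetical example data.
units = {
    re.compile(r"mg", re.IGNORECASE): "milligram",
    re.compile(r"kg", re.IGNORECASE): "kilogram",
}

hit = regex_dict_match_sketch("mg daily", units)
miss = regex_dict_match_sketch("5 mg daily", units, default="unknown")
empty = regex_dict_match_sketch(None, units)
```

Here "5 mg daily" does not match because the text starts with "5", not with a unit. Returning the boolean separately lets callers distinguish "no match" from a match whose associated value happens to equal the default.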
- crate_anon.nlp_manager.regex_func.get_regex_dict_search(text: str | None, regex_to_value_dict: Dict[Pattern, Any], default: Any | None = None) → Tuple[bool, Any]
As for get_regex_dict_match(), but performs a search (find anywhere in the string) rather than a match.
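The difference between searching and matching can be illustrated with the standard-library re module. regex_dict_search_sketch is a hypothetical stand-in, not CRATE's code, and it assumes that empty or None text yields no match.

```python
import re
from typing import Any, Dict, Optional, Tuple


def regex_dict_search_sketch(
    text: Optional[str],
    regex_to_value_dict: Dict[re.Pattern, Any],
    default: Any = None,
) -> Tuple[bool, Any]:
    """Return (found, associated_value_or_default), searching anywhere."""
    # Assumption: None or empty text is treated as "not found".
    if text:
        for regex, value in regex_to_value_dict.items():
            # Pattern.search() scans the whole string for a hit.
            if regex.search(text):
                return True, value
    return False, default


# Hypothetical example data.
pattern = re.compile(r"mg", re.IGNORECASE)
units = {pattern: "milligram"}
text = "Sertraline 50 mg daily"

anchored = pattern.match(text)    # None: the text does not start with "mg"
anywhere = pattern.search(text)   # a match object for the "mg" inside the text
found = regex_dict_search_sketch(text, units)
```

Searching suits free text where the target may appear mid-sentence; matching suits fields that must begin with the pattern.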