14.5.55. crate_anon.nlp_manager.tests.regex_test_helperfunc

crate_anon/nlp_manager/tests/regex_test_helperfunc.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Functions for testing regular expressions.

crate_anon.nlp_manager.tests.regex_test_helperfunc.assert_text_regex(name: str, regex_text: str, test_expected_list: List[Tuple[str, List[str]]], verbose: bool = False) → None

Test a regex upon some text.

Parameters:
  • name – regex name (for display purposes only)

  • regex_text – text that should be compiled to give our regex

  • test_expected_list – list of tuples (teststring, expected_results), where teststring is some text and expected_results is the list of hits expected from the regex within teststring

  • verbose – be verbose?
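A minimal sketch of how a check of this shape could work, using only the standard-library re module. The function name assert_text_regex_sketch, the compile flags, the message format, and the example regex are assumptions made for illustration; this is not CRATE's implementation.

```python
import re
from typing import List, Tuple


def assert_text_regex_sketch(
    name: str,
    regex_text: str,
    test_expected_list: List[Tuple[str, List[str]]],
    verbose: bool = False,
) -> None:
    """Hypothetical stand-in: compile regex_text, then check each test string."""
    # The flags used here are an assumption, not taken from CRATE.
    compiled = re.compile(regex_text, re.IGNORECASE)
    for test_string, expected in test_expected_list:
        # Collect the entire text of every hit, ignoring capture groups.
        actual = [m.group(0) for m in compiled.finditer(test_string)]
        if verbose:
            print(f"{name}: {test_string!r} -> {actual}")
        assert actual == expected, (
            f"Regex {name!r}: for {test_string!r}, "
            f"expected {expected}, got {actual}"
        )


# Each tuple pairs a test string with the hits we expect from it.
assert_text_regex_sketch(
    "dose_mg",  # hypothetical regex name, used only in messages
    r"\d+\s*mg",
    [
        ("took 10 mg then 20mg", ["10 mg", "20mg"]),
        ("no dose recorded", []),
    ],
)
```

In this sketch a mismatch raises AssertionError, which lets the helper be called directly from a test runner.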

crate_anon.nlp_manager.tests.regex_test_helperfunc.f_score(precision: float, recall: float, beta: float = 1) → float

Calculates an F score (e.g. an F1 score for beta == 1). See https://en.wikipedia.org/wiki/F1_score.

Parameters:
  • precision – precision of the test, P(really positive | test positive)

  • recall – recall of the test, P(test positive | really positive)

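  • beta – controls the type of the F score by setting the relative emphasis on precision versus recall: beta > 1 weights recall more heavily, beta < 1 weights precision more heavily, and beta == 1 gives the F1 score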

Returns:

the F score
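For reference, the standard F-beta formula is (1 + beta²) · precision · recall / (beta² · precision + recall). The sketch below implements that textbook formula; the name f_beta_sketch is hypothetical, and the code is not copied from CRATE.

```python
def f_beta_sketch(precision: float, recall: float, beta: float = 1) -> float:
    """Textbook F-beta score (hypothetical helper, not CRATE's code).

    Raises ZeroDivisionError if precision and recall are both zero.
    """
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# With beta == 1 this is the harmonic mean of precision and recall.
f1 = f_beta_sketch(precision=0.5, recall=0.5)

# With beta == 2, recall counts for more, so a low recall lowers the score
# further than it does for F1.
f2 = f_beta_sketch(precision=1.0, recall=0.5, beta=2)
```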

crate_anon.nlp_manager.tests.regex_test_helperfunc.get_compiled_regex_results(compiled_regex: Pattern, text: str) → List[str]

Finds all the hits for a regex when applied to text.

Parameters:
  • compiled_regex – a compiled regular expression

  • text – text to parse

Returns:

a list of all the (entire) hits for this regex in text
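"Entire" hits matter when the regex contains capture groups: the whole matched span is wanted, not the groups. The sketch below shows one way to get that behaviour with the standard-library re module; the name whole_hits_sketch is hypothetical, and whether CRATE uses re or another regex package is not stated here.

```python
import re
from typing import List


def whole_hits_sketch(compiled_regex: re.Pattern, text: str) -> List[str]:
    """Return the full text of every match (group 0), ignoring capture groups."""
    return [m.group(0) for m in compiled_regex.finditer(text)]


pattern = re.compile(r"(\d+)\s*(mg|g)")
text = "10 mg, then 2g"

print(whole_hits_sketch(pattern, text))  # ['10 mg', '2g']

# By contrast, findall() returns the capture groups when any are present:
print(pattern.findall(text))  # [('10', 'mg'), ('2', 'g')]
```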

crate_anon.nlp_manager.tests.regex_test_helperfunc.print_compiled_regex_results(compiled_regex: Pattern, text: str, prefix_spaces: int = 4) → None

Applies a regex to text and prints (to stdout) all its hits.

Parameters:
  • compiled_regex – a compiled regular expression

  • text – text to parse

  • prefix_spaces – number of spaces to begin each answer with
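A minimal sketch of the printing behaviour described above, assuming one hit per line with the indent applied to each line. The name print_hits_sketch and the exact output layout are assumptions; the real function's format may differ.

```python
import re


def print_hits_sketch(
    compiled_regex: re.Pattern, text: str, prefix_spaces: int = 4
) -> None:
    """Print each whole match on its own line, indented by prefix_spaces."""
    prefix = " " * prefix_spaces
    for m in compiled_regex.finditer(text):
        print(f"{prefix}{m.group(0)}")


# Prints one indented line per hit found in the text.
print_hits_sketch(re.compile(r"\d+\s*mg"), "took 10 mg then 20mg", prefix_spaces=2)
```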

crate_anon.nlp_manager.tests.regex_test_helperfunc.run_tests_nlp_and_validator_classes(all_nlp_and_validators: List[Tuple[BaseNlpParser, ValidatorBase]]) → None

Tests multiple pairs of NLP classes and their associated validators.