18. Things to do

Todo

Check minimal anonymiser config example works.

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/anonymisation/anon_config.rst, line 1820.)

Todo

Check minimal data dictionary example works.

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/anonymisation/data_dictionary.rst, line 634.)

Todo

pe_one_table

Might it be better to feed the resulting query back into the main Query system, allowing users to turn columns on/off, etc.?

At present it forces query_id to None and this is detected by query_result.html.

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/crateweb/research/views.py:docstring of crate_anon.crateweb.research.views.pe_one_table, line 14.)

Todo

cloud_parser: handle new tabular_schema info from server

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/nlp_manager/cloud_parser.py:docstring of crate_anon.nlp_manager.cloud_parser, line 27.)

Todo

preprocess_rio: specific supposed PK failing (non-unique) on incremental

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/preprocess/preprocess_rio.py:docstring of crate_anon.preprocess.preprocess_rio, line 32.)

Todo

preprocess_rio: Imperfectly tested: Audit_Created_Date, Audit_Updated_Date … some data for Audit_Created_Date, but incomplete audit table

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/preprocess/preprocess_rio.py:docstring of crate_anon.preprocess.preprocess_rio, line 35.)

Todo

preprocess_rio: Similarly, all cross-checks to RCEP output (currently limited by data availability)

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/preprocess/preprocess_rio.py:docstring of crate_anon.preprocess.preprocess_rio, line 39.)

Todo

TestPatient column missing in CPFT copy. [A/w NP 2022-03-21.]

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/envs/latest/lib/python3.8/site-packages/crate_anon/preprocess/systmone_ddgen.py:docstring of crate_anon.preprocess.systmone_ddgen, line 376.)

Todo

Check whether DISABLE_DJANGO_PYODBC_AZURE_CURSOR_FETCHONE_NEXTSET is still needed with more recent versions of django-pyodbc-azure (and if it is not necessary, document a version that works without it).

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/installation/database_drivers.rst, line 467.)

Todo

fix below here; see CamCOPS help

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/installation/docker.rst, line 391.)

Todo

fuzzy_id_match: expand on method

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/linkage/fuzzy_id_match.rst, line 35.)

Todo

fuzzy_id_match: cite paper when published

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/linkage/fuzzy_id_match.rst, line 37.)

Todo

NLPRP: consider supra-document processing requirements

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/nlp/nlprp.rst, line 1490.)

Todo

add screenshots

(The original entry is located in /home/docs/checkouts/readthedocs.org/user_builds/crateanon/checkouts/latest/docs/source/website_using/research_queries.rst, line 40.)

- Fix bug (reported by JL 6/11/2018) where the RiO preprocessor tries to put the same column into the same index more than once (see RNC email 6/11/2018)
- BENCHMARK name denial (with forenames + surnames – English words – eponyms): speed, precision, recall. Share results with MB.
- Personal configurable highlight colours (with default set if none configured)? Or just more colours? Look at a standard highlighter pack – e.g.
  - https://www.jetpens.com/Stabilo-Boss-Original-Highlighter-9-Color-Bundle/pd/21976.
  - https://www.rapidtables.com/web/css/css-color.html
  - Yellow, https://hexcolor.co/hex/ffff00
  - Blue, maybe cornflowerblue, https://hexcolor.co/hex/6495ed
  - Green, maybe https://hexcolor.co/hex/72ff66 (“Stabilo Boss 1”)
  - Lavender, e.g. https://hexcolor.co/hex/967bb6
  - Lilac pink, maybe magenta-ish, e.g. https://hexcolor.co/hex/ff66e5 (“Stabilo Boss 2”)
  - Orange, e.g. https://hexcolor.co/hex/ffa500
  - Pink, e.g. https://hexcolor.co/hex/f24c7c
  - Red, e.g. https://hexcolor.co/hex/ff0000
  - Turquoise blue, no idea which one that is, but consider https://hexcolor.co/hex/0ac768 (“Stabilo:)”)
  - Note default browser Ctrl-F colours; see base.css.
- More of JL’s ideas from 8 Jan 2018:
  - A series of functions like fn_age(rid), fn_is_alive(rid), fn_open_referral(rid)
  - Friendly names for the top 10 most used tables, which might appear at the top of the tables listing.
- When the Windows service stops, it is still failing to kill child processes. See crate_anon/tools/winservice.py.
- NLP protocol revision whereby processors describe their output fields, saying which SQL dialect they’re using; and (automatic) implementation for our built-in NLP.
- There’s some placeholder junk in consent_lookup_result.html.
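
As a rough illustration of the highlight-colour idea above ("default set if none configured"), the following is a minimal sketch only and not part of CRATE. The hex values come from the candidate list above; the dictionary keys, the name DEFAULT_HIGHLIGHT_COLOURS and the function highlight_colours() are hypothetical names chosen for this example.

```python
import re
from typing import Dict, Optional

# Candidate colours from the list above. The key names are illustrative only.
DEFAULT_HIGHLIGHT_COLOURS: Dict[str, str] = {
    "yellow": "#ffff00",
    "blue": "#6495ed",  # cornflowerblue
    "green": "#72ff66",
    "lavender": "#967bb6",
    "lilac_pink": "#ff66e5",
    "orange": "#ffa500",
    "pink": "#f24c7c",
    "red": "#ff0000",
    "turquoise": "#0ac768",
}

# Accept only six-digit hex colours such as "#ffa500".
_HEX_COLOUR = re.compile(r"^#[0-9a-fA-F]{6}$")


def highlight_colours(
    user_colours: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """
    Return the user's colours if any are configured; otherwise return a copy
    of the default set. Hypothetical helper, not an existing CRATE function.
    """
    if not user_colours:
        return dict(DEFAULT_HIGHLIGHT_COLOURS)
    for name, value in user_colours.items():
        if not _HEX_COLOUR.match(value):
            raise ValueError(f"Bad colour for {name!r}: {value!r}")
    return dict(user_colours)
```

A caller with no stored preference would get the nine defaults; a caller who had configured, say, a single orange highlight would get only that one. Whether per-user settings should replace or extend the defaults is a design choice left open here.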