query_build.html

{# crate_anon/crateweb/research/templates/query_build.html #}

{% extends "base.html" %}

{% block collapsejs %}{% endblock %}
{% block extrahead %}
    {% include "querybuilder_extrahead.html" %}
{% endblock %}
{% block onload %}onload="populate()"{% endblock %}

{% block content %}

    {% comment %}
        {% include "query_nav.html" %}
    {% endcomment %}

    <h1>Build a simple query</h1>
    <p>For more advanced queries, see the
        <a href="{% url 'query' %}">SQL view</a>.</p>

    <h2>Current query</h2>
    {% if parse_error %}
        <div class="warning">
            SQL FAILED TO PARSE. Please clear the query or edit it manually in
            the SQL view. Error was: {{ parse_error }}
        </div>
    {% endif %}
    {% if sql %}
        <div class="sql">{{ sql|safe }}</div>

        <form action="{% url 'build_query' %}" method="post">
            {% csrf_token %}
            <input type="submit" name="global_clear" value="Clear" />
            {% if not parse_error %}
                <input type="submit" name="global_toggle_distinct"
                       value="Toggle DISTINCT" />
                <input type="submit" name="global_save" value="Save as query" />
                <input type="submit" name="global_run" value="Run" />
            {% endif %}
        </form>
    {% else %}
        {# single space to show the area visually #}
        <pre class="sql"> </pre>
    {% endif %}

    <h2>Build your query!</h2>

    {% include "querybuilder_form.html" with form_submit_url='build_query' %}

    <h2>Tips</h2>
    <ul>
        {% if dialect_mysql %}
            <li>This server is running MySQL. (A database and a schema
            are synonymous under MySQL, so the heirarchy is
            schema, table, column.)</li>
        {% elif dialect_mssql %}
            <li>This server is running Microsoft SQL Server. (The hierarchy
            is: database, schema, table, column.)</li>
        {% else %}
            <li><b>BUG: UNKNOWN SQL DIALECT.</b></li>
        {% endif %}
        <li>Inexact string comparison can be done in several ways:
            <ul>
                {% if dialect_mysql %}
                    <li><a href="https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html">LIKE</a>,
                        in which ‘%’ stands for any number of characters, and ‘_’
                        for one unknown character. For example, use
                        <span class="code">%schizophreni%</span> to find text
                        containing ‘schizophrenia’, ‘schizophrenic’, or
                        ‘schizophreniform’.</li>
                    <li><a href="https://dev.mysql.com/doc/refman/5.7/en/fulltext-search.html">MATCH</a>,
                        which is much faster than LIKE. MATCH can be used if there
                        is a full-text index on the column in question. You can
                        MATCH AGAINST words or phrases.</li>
                    <li><a href="https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html">REGEXP</a>,
                        using <a href="https://en.wikipedia.org/wiki/Regular_expression">regular expressions</a>.
                        This is complicated; see a <a href="https://regexone.com/">tutorial</a>
                        and use a <a href="https://regexr.com/">regular expression tester</a>.</li>
                {% endif %}

                {% if dialect_mssql %}
                    <li><a href="https://msdn.microsoft.com/en-us/library/ms179859.aspx">LIKE</a>,
                        in which ‘%’ stands for any number of characters, and ‘_’
                        for one unknown character. For example, use
                        <span class="code">%schizophreni%</span> to find text
                        containing ‘schizophrenia’, ‘schizophrenic’, or
                        ‘schizophreniform’. You can also match (or anti-match)
                        character groups; see the
                        <a href="https://msdn.microsoft.com/en-us/library/ms179859.aspx">help</a>.</li>
                    <li><a href="https://msdn.microsoft.com/en-us/library/ms187787.aspx">CONTAINS</a>,
                        which is much faster than LIKE. CONTAINS can be used if there
                        is a full-text index on the column in question. You can
                        use CONTAINS with words or phrases.</li>
                {% endif %}
            </ul></li>

        <li>This query builder parses the SQL you start with, and then
            adds more pieces. It is less powerful than arbitrary SQL.</li>

        <li>
            Things that are are better done with raw SQL include:
            <ul>
                <li>Combining expressions with OR, or in complex ways.</li>
                <li>Restricting to a set of values with IN.</li>
                <li>Finding information about patients P that “have” something
                    (e.g. a diagnosis) using EXISTS.</li>
            </ul>
        </li>

        <li>If you use a file (for IN or NOT IN clauses), this file should be
            in ASCII or UTF-8 format, with one value per line. Lines will
            be stripped of whitespace (left + right). Lines starting with
            a # character will be treated as comments and ignored.</li>

        <li>Browsers: Firefox doesn’t support date pickers (as of
        version 50.0.2, Nov 2016). Chrome does. See
        <a href="https://caniuse.com/#feat=input-datetime">https://caniuse.com/#feat=input-datetime</a>.
        This page hacks in date picker support for browsers as required.</li>

    </ul>

    <h2>Explanation</h2>
    <ul>
        <li>This query builder <b>only looks at tables that contain patient
            information</b> (as judged by the presence of the field used
            to auto-join tables, as defined by your database manager).
            It ignores “system” tables with no connection to a patient.
            If you want to explore all the tables, explore the database
            structure using the options on the
            <a href="{% url 'home' %}">Main menu</a>, and use the
            <a href="{% url 'query' %}">SQL view</a> to create a more complex
            query.</li>

        <li>The query builder will automatically JOIN tables based on a shared
            field (i.e. records for the same patient).
            This field is typically the “transient research ID” (TRID).
            The TRID is a numerical value that identifies a patient.
            It’s quicker for queries than the “research ID” (RID), but the RID
            will stay constant if/when the research database is rebuilt, and
            the TRID might change, so you should make a note of the RID if you
            want to save data.
            The RID is typically an encrypted patient identifier from the
            source database; the TRID is taken from a one-time pad.</li>

        <li>The query builder will also automatically join tables across
            schemas/databases. If the administrator has said that patient
            tables in the databases use a common research ID (RID), that field
            will be used. [Example: both databases use the same patient
            numbers, and they have been hashed in the same way.]
            Otherwise, if they both have tables containing a master research
            ID (MRID), that is used to bridge the databases.
            [Example: two databases use different patient numbers, but they
            both contain a table with a hashed UK NHS number.]</li>

        <li><b>NOTE, though, that JOINs are many-to-many unless otherwise
            restricted.</b> So if a patient has 10 values for C-reactive
            protein, and 20 progress notes mentioning depression, a simple
            JOIN will give 10 * 20 = 200 rows. You can easily create very
            large queries by accident. If you want to do something more
            sophisticated, like “show me the progress notes THAT mention
            depression for patients WHO HAVE a CRP of >20 at some point”,
            you want a query more like “SELECT rid, progress_note FROM
            my_progress_note_table
            WHERE CONTAINS(progress_note, 'depression') AND rid IN
            (SELECT DISTINCT rid FROM my_crp_table WHERE crp > 20)”.
            Use the SQL editor to create such queries.</li>

    </ul>

{% endblock %}

{% block helpurl %}{{ HelpUrl.querybuilder }}{% endblock %}