14.2.17. crate_anon.common.sql

crate_anon/common/sql.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Low-level SQL manipulation functions.

class crate_anon.common.sql.ColumnId(db: str = '', schema: str = '', table: str = '', column: str = '')[source]

Represents a database column.

__init__(db: str = '', schema: str = '', table: str = '', column: str = '') None[source]
Parameters
  • db – database name

  • schema – schema name

  • table – table name

  • column – column name

property column: str

Returns the column part.

property db: str

Returns the database part.

property has_table_and_column: bool

Do we know about a table and a column?

property is_valid: bool

Do we know about a table and a column, at least?

property schema: str

Returns the schema part.

property schema_id: crate_anon.common.sql.SchemaId

Returns a SchemaId for the schema of our column.

property table: str

Returns the table part.

property table_id: crate_anon.common.sql.TableId

Returns a TableId for our table.

class crate_anon.common.sql.IndexCreationInfo(index_name: str, column: Union[str, List[str]], unique: bool = False)[source]
__init__(index_name: str, column: Union[str, List[str]], unique: bool = False) None
column: Union[str, List[str]]

Column name(s) to index

index_name: str

Name of the index

unique: bool = False

Make a unique index?

class crate_anon.common.sql.JoinInfo(table: str, join_type: str = 'INNER JOIN', join_condition: str = '')[source]

Object to represent a SQL join condition in a simple way.

__init__(table: str, join_type: str = 'INNER JOIN', join_condition: str = '') None[source]
Parameters
  • table – table to be joined in

  • join_type – join method, e.g. "INNER JOIN"

  • join_condition – join condition, e.g. "ON x = y"

class crate_anon.common.sql.SchemaId(db: str = '', schema: str = '')[source]

Represents a database schema. This is a bit complex:

__init__(db: str = '', schema: str = '') None[source]
Parameters
  • db – database name

  • schema – schema name

column_id(table: str, column: str) crate_anon.common.sql.ColumnId[source]

Returns a ColumnId combining this schema and the specified table/column.

Parameters
  • table – name of the table

  • column – name of the column

property db: str

Returns the database part.

classmethod from_schema_tag(tag: str) crate_anon.common.sql.SchemaId[source]

Returns a SchemaId from a tag of the form db.schema.

identifier(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns an SQL identifier for this schema using the specified SQL grammar, quoting it if need be.

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

property schema: str

Returns the schema part.

property schema_tag: str

String suitable for encoding the SchemaId e.g. in a single HTML form. Takes the format database.schema.

The __init__() function has already checked the assumption of no '.' characters in either part.

table_id(table: str) crate_anon.common.sql.TableId[source]

Returns a TableId combining this schema and the specified table.

Parameters

table – name of the table

class crate_anon.common.sql.TableId(db: str = '', schema: str = '', table: str = '')[source]

Represents a database table.

__init__(db: str = '', schema: str = '', table: str = '') None[source]
Parameters
  • db – database name

  • schema – schema name

  • table – table name

column_id(column: str) crate_anon.common.sql.ColumnId[source]

Returns a ColumnId combining this table and the specified column.

Parameters

column – name of the column

database_schema_part(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns an SQL identifier for this table’s database/schema (without the table part) using the specified SQL grammar, quoting it if need be.

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

property db: str

Returns the database part.

identifier(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns an SQL identifier for this table using the specified SQL grammar, quoting it if need be.

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

property schema: str

Returns the schema part.

property schema_id: crate_anon.common.sql.SchemaId

Returns a SchemaId for the schema of our table.

property table: str

Returns the table part.

table_part(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns an SQL identifier for this table’s table name (only) using the specified SQL grammar, quoting it if need be.

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

class crate_anon.common.sql.TransactionSizeLimiter(session: sqlalchemy.orm.session.Session, max_rows_before_commit: Optional[int] = None, max_bytes_before_commit: Optional[int] = None)[source]

Class to allow us to limit the size of database transactions.

__init__(session: sqlalchemy.orm.session.Session, max_rows_before_commit: Optional[int] = None, max_bytes_before_commit: Optional[int] = None) None[source]
Parameters
  • session – SQLAlchemy database Session

  • max_rows_before_commit – how many rows should we insert before triggering a COMMIT? None for no limit.

  • max_bytes_before_commit – how many bytes should we insert before triggering a COMMIT? None for no limit.

commit() None[source]

Performs a database COMMIT and resets our counters.

(Measures some timing information, too.)

notify(n_rows: int, n_bytes: int, force_commit: bool = False) None[source]

Use this function to notify the limiter of data that you’ve inserted into the database. If the total number of rows or bytes exceeds a limit that we’ve set, this will trigger a COMMIT.

Parameters
  • n_rows – number of rows inserted

  • n_bytes – number of bytes inserted

  • force_commit – force a COMMIT?

class crate_anon.common.sql.ViewMaker(viewname: str, engine: sqlalchemy.engine.base.Engine, basetable: str, existing_to_lower: bool = False, rename: Optional[Dict[str, str]] = None, userobj: Optional[Any] = None, enforce_same_n_rows_as_base: bool = True, insert_basetable_columns: bool = True)[source]

View-building assistance class.

__init__(viewname: str, engine: sqlalchemy.engine.base.Engine, basetable: str, existing_to_lower: bool = False, rename: Optional[Dict[str, str]] = None, userobj: Optional[Any] = None, enforce_same_n_rows_as_base: bool = True, insert_basetable_columns: bool = True) None[source]
Parameters
  • viewname – name of the view

  • engine – SQLAlchemy database Engine

  • basetable – name of the single base table that this view draws from

  • existing_to_lower – translate column names to lower case in the view?

  • rename – optional dictionary mapping from_name: to_name to translate column names in the view

  • userobj – optional object (e.g. argparse.Namespace, dictionary…), not used by this class, and purely to store information for others’ benefit

  • enforce_same_n_rows_as_base – ensure that the view produces the same number of rows as its base table?

  • insert_basetable_columns – start drafting the view by including all columns from the base table?

add_from(element: str) None[source]

Add an element to the FROM clause of the draft view’s SQL statement.

add_select(element: str) None[source]

Add an element to the SELECT clause of the the draft view’s SQL (meaning: add e.g. a result column).

add_where(element: str) None[source]

Add an element to the WHERE clause of the draft view’s SQL statement.

create_view(engine: sqlalchemy.engine.base.Engine) None[source]

Creates the view.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

If enforce_same_n_rows_as_base is set, check the number of rows returned matches the base table.

Parameters

engine – SQLAlchemy database Engine

drop_view(engine: sqlalchemy.engine.base.Engine) None[source]

Drops the view.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters

engine – SQLAlchemy database Engine

get_index_request_dict() Dict[str, List[str]][source]

Returns all our recorded index requests, as a dictionary mapping each table name to a list of column names to be indexed. See request_index().

get_lookup_tables() List[str][source]

Returns all lookup tables that we have recorded. See record_lookup_table().

get_sql() str[source]

Returns the view-creation SQL.

record_lookup_table(table: str) None[source]

Keep a record of a lookup table. The framework may wish to suppress these from a data dictionary later (e.g. create a view, suppress the messier raw data). See get_lookup_tables().

Parameters

table – table name

record_lookup_table_keyfield(table: str, keyfield: Union[str, Iterable[str]]) None[source]

Makes a note that a table is a lookup table, and its key field(s) should be indexed. See get_lookup_tables(), get_index_request_dict().

Parameters
  • table – table name

  • keyfield – field name, or iterable (e.g. list) of them

record_lookup_table_keyfields(table_keyfield_tuples: Iterable[Tuple[str, Union[str, Iterable[str]]]]) None[source]

Make a note of a whole set of lookup table / key field groups. See record_lookup_table_keyfield().

Parameters

table_keyfield_tuples – iterable (e.g. list) of tuples of the format tablename, keyfield. Each will be passed to record_lookup_table_keyfield().

request_index(table: str, column: str) None[source]

Note a request that a specific column be indexed. The framework can use the ViewMaker to keep a note of these requests, and then add index hints to a data dictionary if it wishes. See get_index_request_dict().

Parameters
  • table – table name

  • column – column name

class crate_anon.common.sql.WhereCondition(column_id: Optional[crate_anon.common.sql.ColumnId] = None, op: str = '', datatype: str = '', value_or_values: Optional[Any] = None, raw_sql: str = '', from_table_for_raw_sql: Optional[crate_anon.common.sql.TableId] = None)[source]

Ancillary class for building SQL WHERE expressions from our web forms.

The essence of it is WHERE column op value_or_values.

__init__(column_id: Optional[crate_anon.common.sql.ColumnId] = None, op: str = '', datatype: str = '', value_or_values: Optional[Any] = None, raw_sql: str = '', from_table_for_raw_sql: Optional[crate_anon.common.sql.TableId] = None) None[source]
Parameters
  • column_idColumnId for the column

  • op – operation (e.g. =, <, <=, etc.)

  • datatype – data type string that must match values in our querybuilder.js; see source code. We use this to know how to build SQL literal values. (Not terribly elegant, but it works; SQL injection isn’t a particular concern because we let our users run any SQL they want and ensure the connection is made read-only.)

  • value_or_valuesNone, single value, or list of values. Which is appropriate depends on the operation. For example, IS NULL takes no value; = takes one; IN takes many.

  • raw_sql – override any thinking we might wish to do, and just return this raw SQL

  • from_table_for_raw_sql – if we are using raw SQL, provide a TableId for the relevant table here

property column_id: crate_anon.common.sql.ColumnId

Returns the ColumnId provided at creation.

sql(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns the WHERE clause (without WHERE itself!) for our condition, in the specified SQL grammar. Some examples might be:

  • somecol = 3

  • othercol IN (6, 7, 8)

  • thirdcol IS NOT NULL

  • textcol LIKE '%paracetamol%'

  • MATCH (fulltextcol AGAINST 'paracetamol') (MySQL)

  • CONTAINS(fulltextcol, 'paracetamol') (SQL Server)

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

property table_id: crate_anon.common.sql.TableId

Returns a TableId:

  • for raw SQL, our from_table_for_raw_sql attribute

  • otherwise, the table ID extracted from our column_id attribute

table_str(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) str[source]

Returns the table identifier in the specified SQL grammar.

Parameters

grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

crate_anon.common.sql.add_columns(engine: sqlalchemy.engine.base.Engine, table: sqlalchemy.sql.schema.Table, columns: List[sqlalchemy.sql.schema.Column]) None[source]

Adds columns to a table.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • table – SQLAlchemy Table object

  • columns – SQLAlchemy Column objects to add to the table

Behaviour of different database systems:

This function therefore operates one at a time.

SQLAlchemy doesn’t provide a shortcut for this.

crate_anon.common.sql.add_indexes(engine: sqlalchemy.engine.base.Engine, table: sqlalchemy.sql.schema.Table, index_info_list: Iterable[crate_anon.common.sql.IndexCreationInfo]) None[source]

Adds indexes to a table.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • table – SQLAlchemy Table object

  • index_info_list – Index(es) to create: list of IndexCreationInfo objects.

crate_anon.common.sql.assert_view_has_same_num_rows(engine: sqlalchemy.engine.base.Engine, basetable: str, viewname: str) None[source]

Ensures that a view gives the same number of rows as a table. (For use in situations where this should hold; views don’t have to do this in general!)

Parameters
  • engine – SQLAlchemy database Engine

  • basetable – name of the table that this view should have a 1:1 relationship to

  • viewname – view name

Raises

AssertionError

crate_anon.common.sql.coltype_length_if_text(column_type: str, dialect: str) Optional[int][source]

Find the length of an sql text column type.

Parameters
  • column_type – SQL column type as a string, e.g. "VARCHAR(50)"

  • dialect – the sql dialect the column type is from

Returns

length of the column or None if it’s not a text column.

crate_anon.common.sql.columns_to_table_column_hierarchy(columns: List[crate_anon.common.sql.ColumnId], sort: bool = True) List[Tuple[crate_anon.common.sql.TableId, List[crate_anon.common.sql.ColumnId]]][source]

Converts a list of column IDs :param columns: list of ColumnId objects :param sort: sort by table, and column within table?

Returns

a list of tuples, each table, columns, where table is a TableId and columns is a list of ColumnId

crate_anon.common.sql.create_view(engine: sqlalchemy.engine.base.Engine, viewname: str, select_sql: str) None[source]

Creates a view.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • viewname – view name

  • select_sql – SQL SELECT statement for this view

crate_anon.common.sql.drop_columns(engine: sqlalchemy.engine.base.Engine, table: sqlalchemy.sql.schema.Table, column_names: Iterable[str]) None[source]

Drops columns from a table.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • table – SQLAlchemy Table object

  • column_names – names of columns to drop

Columns are dropped one by one.

crate_anon.common.sql.drop_indexes(engine: sqlalchemy.engine.base.Engine, table: sqlalchemy.sql.schema.Table, index_names: Iterable[str]) None[source]

Drops indexes from a table.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • table – SQLAlchemy Table object

  • index_names – names of indexes to drop

crate_anon.common.sql.drop_view(engine: sqlalchemy.engine.base.Engine, viewname: str, quiet: bool = False) None[source]

Drops a view.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • viewname – view name

  • quiet – don’t announce this to the Python log

crate_anon.common.sql.dumb_make_identifier(database: Optional[str] = None, schema: Optional[str] = None, table: Optional[str] = None, column: Optional[str] = None) str[source]

Makes an SQL-style identifier by joining all the parts with ., without bothering to quote them.

Parameters
  • database – database name

  • schema – schema name

  • table – table name

  • column – column name

Returns

a string as above in the order “database, schema, table, column”, but omitting any missing parts

crate_anon.common.sql.ensure_columns_present(engine: sqlalchemy.engine.base.Engine, tablename: str, column_names: Iterable[str]) None[source]

Ensure all these columns are present in a table, or raise an exception.

Operates in case-insensitive fashion.

Parameters
  • engine – SQLAlchemy database Engine

  • tablename – name of the table

  • column_names – names of required columns

Raises

ValueError

crate_anon.common.sql.escape_percent_for_python_dbapi(sql: str) str[source]

Escapes % by converting it to %%. Use this for SQL within Python where % characters are used for argument placeholders.

crate_anon.common.sql.escape_percent_in_literal(sql: str) str[source]

Escapes % by converting it to \%. Use this for LIKE clauses.

crate_anon.common.sql.escape_quote_in_literal(s: str) str[source]

Escape '. We could use '' or \'. Let’s use \. for consistency with percent escaping.

crate_anon.common.sql.escape_sql_string_literal(s: str) str[source]

Escapes SQL string literal fragments against quotes and parameter substitution.

crate_anon.common.sql.escape_sql_string_or_int_literal(s: Union[str, int]) str[source]

Converts an integer or a string into an SQL literal (with single quotes and escaping in the case of a string).

crate_anon.common.sql.execute(engine: sqlalchemy.engine.base.Engine, sql: str) None[source]

Executes SQL.

Whether we act or just print is conditional on previous calls to set_print_not_execute().

Parameters
  • engine – SQLAlchemy database Engine

  • sql – raw SQL to execute (or print)

crate_anon.common.sql.format_sql_for_print(sql: str) str[source]

Very simple SQL formatting.

Remove blank lines and trailing spaces from an SQL statement. Converts tabs to spaces.

crate_anon.common.sql.get_column_names(engine: sqlalchemy.engine.base.Engine, tablename: str, to_lower: bool = False, sort: bool = False) List[str][source]

Reads columns names afresh from the database, for a specific table (in case metadata is out of date).

Parameters
  • engine – SQLAlchemy database Engine

  • tablename – name of the table

  • to_lower – convert view names to lower case?

  • sort – sort view names?

Returns

list of column names

crate_anon.common.sql.get_first_from_table(parsed: pyparsing.ParseResults, match_db: str = '', match_schema: str = '', match_table: str = '') crate_anon.common.sql.TableId[source]

Given a set of parsed results from a SELECT statement, returns the db, schema, table tuple representing the first table in the FROM clause.

Optionally, the match may be constrained with the match* parameters.

Parameters
  • parsed – a pyparsing.ParseResults result

  • match_db – optional database name to constrain the result to

  • match_schema – optional schema name to constrain the result to

  • match_table – optional table name to constrain the result to

Returns

a TableId, which will be empty in case of failure

crate_anon.common.sql.get_index_names(engine: sqlalchemy.engine.base.Engine, tablename: str, to_lower: bool = False, sort: bool = False) List[str][source]

Reads index names from the database, for a specific table.

Parameters
  • engine – SQLAlchemy database Engine

  • tablename – name of the table

  • to_lower – convert index names to lower case?

  • sort – sort index names?

Returns

list of index names

crate_anon.common.sql.get_table_names(engine: sqlalchemy.engine.base.Engine, to_lower: bool = False, sort: bool = False) List[str][source]

Returns all table names for the database.

Parameters
  • engine – SQLAlchemy database Engine

  • to_lower – convert table names to lower case?

  • sort – sort table names?

Returns

list of table names

crate_anon.common.sql.get_view_names(engine: sqlalchemy.engine.base.Engine, to_lower: bool = False, sort: bool = False) List[str][source]

Returns all view names for the database.

Parameters
  • engine – SQLAlchemy database Engine

  • to_lower – convert view names to lower case?

  • sort – sort view names?

Returns

list of view names

crate_anon.common.sql.is_sql_column_type_textual(column_type: str, min_length: int = 1) bool[source]

Does an SQL column type look textual?

Parameters
  • column_type – SQL column type as a string, e.g. "VARCHAR(50)"

  • min_length – what’s the minimum string length we’ll say “yes” to?

Returns

is it a textual column (of the minimum length or more)?

Note:

  • For SQL Server’s NVARCHAR(MAX), crate_anon.crateweb.research.research_db_info._schema_query_microsoft() returns “NVARCHAR(-1)”

crate_anon.common.sql.make_identifier(grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar, database: Optional[str] = None, schema: Optional[str] = None, table: Optional[str] = None, column: Optional[str] = None) str[source]

Makes an SQL identifier by quoting its elements according to the style of the specific SQL grammar, and then joining them with ..

Parameters
  • grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

  • database – database name

  • schema – schema name

  • table – table name

  • column – column name

Returns

a string as above in the order “database, schema, table, column”, but omitting any missing parts

crate_anon.common.sql.make_string_literal(s: str) str[source]

Converts a Python string into an SQL single-quoted (and escaped) string literal.

crate_anon.common.sql.matches_fielddef(table: str, field: str, fielddef: Union[str, List[str]]) bool[source]

Does the table/field name match the wildcard-based field definition?

Parameters
  • table – table name

  • field – fieldname

  • fielddeffnmatch-style pattern (e.g. "system_table.*" or "*.nhs_number"), or list of them

crate_anon.common.sql.matches_tabledef(table: str, tabledef: Union[str, List[str]]) bool[source]

Does the table name match the wildcard-based table definition?

Parameters
  • table – table name

  • tabledeffnmatch-style pattern (e.g. "patient_address_table_*"), or list of them

crate_anon.common.sql.parser_add_from_tables(parsed: pyparsing.ParseResults, join_info_list: List[crate_anon.common.sql.JoinInfo], grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) pyparsing.ParseResults[source]

Takes a parsed SQL statement of the form

SELECT a, b, c
FROM sometable
WHERE conditions;

and adds one or more join columns, e.g. JoinInfo("othertable", "INNER JOIN", "ON table.key = othertable.key"), to give

SELECT a, b, c
FROM sometable
INNER JOIN othertable ON table.key = othertable.key
WHERE conditions;

Presupposes that there at least one table already in the FROM clause.

Parameters
  • parsed – a pyparsing.ParseResults result

  • join_info_list – list of JoinInfo objects

  • grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

Returns

a pyparsing.ParseResults result

crate_anon.common.sql.parser_add_result_column(parsed: pyparsing.ParseResults, column: str, grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar) pyparsing.ParseResults[source]

Takes a parsed SQL statement of the form

SELECT a, b, c
FROM sometable
WHERE conditions;

and adds a result column, e.g. d, to give

SELECT a, b, c, d
FROM sometable
WHERE conditions;

Presupposes that there is at least one column already in the SELECT statement.

Parameters
  • parsed – a pyparsing.ParseResults result

  • column – column name

  • grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

Returns

a pyparsing.ParseResults result

crate_anon.common.sql.set_distinct(sql: str, grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar, action: str = 'set', formatted: bool = True, debug: bool = False, debug_verbose: bool = False) str[source]

Takes an SQL statement (as a string) and modifies its DISTINCT status.

Parameters
  • sql – SQL statment as text

  • grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

  • action – one of "set", "clear", "toggle"; see set_distinct_within_parsed()

  • formatted – pretty-format the result?

  • debug – show debugging information to the Python log

  • debug_verbose – be verbose when debugging

Returns

the modified SQL statment, as a string

crate_anon.common.sql.set_distinct_within_parsed(p: pyparsing.ParseResults, action: str = 'set') None[source]

Modifies (in place) the DISTINCT status of a parsed SQL statement.

Parameters
  • p – a pyparsing.ParseResults result

  • action"set" to turn DISTINCT on; "clear" to turn it off; or "toggle" to toggle it.

crate_anon.common.sql.set_print_not_execute(print_not_execute: bool) None[source]

Sets a nasty global flag: should we print DDL, rather than executing it, when we issue DDL commands from this module?

Parameters

print_not_execute – print (not execute)?

crate_anon.common.sql.split_db_schema_table(db_schema_table: str) crate_anon.common.sql.TableId[source]

Converts a simple SQL-style identifier string into a TableId.

Parameters

db_schema_table – one of: database.schema.table, schema.table, table

Returns

a TableId

Raises

ValueError

crate_anon.common.sql.split_db_schema_table_column(db_schema_table_col: str) crate_anon.common.sql.ColumnId[source]

Converts a simple SQL-style identifier string into a ColumnId.

Parameters

db_schema_table_col – one of: database.schema.table.column, schema.table.column, table.column, column

Returns

a ColumnId

Raises

ValueError

crate_anon.common.sql.sql_fragment_cast_to_int(expr: str, big: bool = True, dialect: Optional[sqlalchemy.engine.interfaces.Dialect] = None, viewmaker: Optional[crate_anon.common.sql.ViewMaker] = None) str[source]

Takes an SQL expression and coerces it to an integer. For Microsoft SQL Server.

Parameters
  • expr – starting SQL expression

  • big – use BIGINT, not INTEGER?

  • dialect – optional sqlalchemy.engine.interfaces.Dialect. If None and we have a viewmaker, use the viewmaker’s dialect. Otherwise, assume SQL Server.

  • viewmaker – optional ViewMaker

Returns

modified SQL expression

Notes

Conversion to INT:

Note that the regex-like expression supported by LIKE is extremely limited.

SQL Server does not support a REGEXP command directly.

So the best bet is to have the LIKE clause check for a non-integer:

CASE
    WHEN something LIKE '%[^0-9]%' THEN NULL
    ELSE CAST(something AS BIGINT)
END

… which doesn’t deal with spaces properly, but there you go. Could also strip whitespace left/right:

CASE
    WHEN LTRIM(RTRIM(something)) LIKE '%[^0-9]%' THEN NULL
    ELSE CAST(something AS BIGINT)
END

That only works for positive integers.

LTRIM/RTRIM are not ANSI SQL. Nor are unusual LIKE clauses; see https://stackoverflow.com/questions/712580/list-of-special-characters-for-sql-like-clause

The other, for SQL Server 2012 or higher, is TRY_CAST:

TRY_CAST(something AS BIGINT)

… which returns NULL upon failure; see https://msdn.microsoft.com/en-us/library/hh974669.aspx

Therefore, our method is as follows:

  • If the database supports TRY_CAST, use that.

  • Otherwise if we’re using SQL Server, use a CASE/CAST construct.

  • Otherwise, raise ValueError as we don’t know what to do.

crate_anon.common.sql.toggle_distinct(sql: str, grammar: cardinal_pythonlib.sql.sql_grammar.SqlGrammar, formatted: bool = True, debug: bool = False, debug_verbose: bool = False) str[source]

Takes an SQL statement and toggles its DISTINCT status.

Parameters
  • sql – SQL statment as text

  • grammarcardinal_pythonlib.sql.sql_grammar.SqlGrammar

  • formatted – pretty-format the result?

  • debug – show debugging information to the Python log

  • debug_verbose – be verbose when debugging

Returns

the modified SQL statment, as a string

crate_anon.common.sql.translate_sql_qmark_to_percent(sql: str) str[source]

This function translates SQL using ? placeholders to SQL using %s placeholders, without breaking literal '?' or '%', e.g. inside string literals.

Notes

I prefer ?, because % is used in LIKE clauses, and the databases we’re using like it.

So:

  • We use %s when using cursor.execute() directly, via Django.

  • We use ? when talking to users, and cardinal_pythonlib.sql.sql_grammar.SqlGrammar objects, so that the visual appearance matches what they expect from their database.