14.7.4. crate_anon.preprocess.postcodes

crate_anon/preprocess/postcodes.py


Copyright (C) 2015, University of Cambridge, Department of Psychiatry. Created by Rudolf Cardinal (rnc1001@cam.ac.uk).

This file is part of CRATE.

CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRATE. If not, see <https://www.gnu.org/licenses/>.


Fetches UK postcode information and creates a database.

Code-Point Open, CSV, GB

Office for National Statistics Postcode Database (ONSPD):

Background:

class crate_anon.preprocess.postcodes.BUA(**kwargs)[source]

Represents England & Wales 2013 built-up area (BUA) codes/names.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.
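The behaviour described above can be imitated without SQLAlchemy. The following is a minimal sketch, not the library's own code: KwargsInitSketch and DemoLookup are hypothetical names, and the plain class attributes stand in for mapped columns.

```python
class KwargsInitSketch:
    """Imitates a kwargs-only constructor of the kind described above."""

    def __init__(self, **kwargs):
        cls = type(self)
        for key, value in kwargs.items():
            # Reject any key that is not already an attribute of the class.
            if not hasattr(cls, key):
                raise TypeError(
                    f"{key!r} is an invalid keyword argument for {cls.__name__}"
                )
            setattr(self, key, value)


class DemoLookup(KwargsInitSketch):
    # Hypothetical stand-ins for mapped columns.
    code = None
    name = None


# Known attribute names are accepted and set on the instance.
row = DemoLookup(code="X00000001", name="Example area")

# An unknown attribute name is refused.
try:
    DemoLookup(nonexistent="oops")
    rejected = False
except TypeError:
    rejected = True
```

The same pattern applies to every lookup class in this module that documents this constructor.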

class crate_anon.preprocess.postcodes.BUASD(**kwargs)[source]

Represents built-up area subdivisions (BUASD) in England & Wales 2013.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.CASWard(**kwargs)[source]

Represents census area statistics (CAS) wards in the UK, 2003.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.CCG(**kwargs)[source]

Represents clinical commissioning groups (CCGs), UK 2019.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.Country(**kwargs)[source]

Represents UK countries, 2012.

This is not a long table.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.County2019(**kwargs)[source]

Represents counties, UK 2019.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.EER(**kwargs)[source]

Represents European electoral regions (EERs), UK 2010.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.ExtendedBase[source]

Mixin to extend the SQLAlchemy ORM Base class by specifying table creation parameters (specifically, for MySQL, to set the character set and MySQL engine).

Only used in the creation of Base; everything else then inherits from Base as usual.

See https://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/mixins.html

class crate_anon.preprocess.postcodes.GOR(**kwargs)[source]

Represents Government Office Regions (GORs), England 2010.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.GenericLookupClassMeta(name, bases, namespace, **kwargs)[source]

To avoid: “TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases”.

We want a class that’s a subclass of Base and ABC. So we can work out their metaclasses:

from abc import ABC
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql.schema import MetaData

class ExtendedBase:
    __table_args__ = {'mysql_charset': 'utf8', 'mysql_engine': 'InnoDB'}

metadata = MetaData()
Base = declarative_base(metadata=metadata, cls=ExtendedBase)

type(Base)  # metaclass of Base: <class 'sqlalchemy.ext.declarative.api.DeclarativeMeta'>
type(ABC)  # metaclass of ABC: <class 'abc.ABCMeta'>

and thus define this class to inherit from those two metaclasses, so it can be the metaclass we want.
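The conflict and its resolution can be reproduced with the standard library alone. This is a sketch under stated assumptions: DeclarativeMetaStandIn is a hypothetical stand-in for the metaclass of the declarative Base, and CombinedMeta, GenericLookup and Concrete are illustrative names rather than the module's actual classes.

```python
from abc import ABC, ABCMeta, abstractmethod


class DeclarativeMetaStandIn(type):
    """Hypothetical stand-in for the metaclass of a declarative Base."""


class Base(metaclass=DeclarativeMetaStandIn):
    pass


# Mixing the two bases directly fails: neither metaclass derives from the other.
try:
    class Broken(Base, ABC):
        pass
    conflict = False
except TypeError:
    conflict = True


class CombinedMeta(DeclarativeMetaStandIn, ABCMeta):
    """Inherits from both metaclasses, so it is acceptable to both bases."""


class GenericLookup(Base, ABC, metaclass=CombinedMeta):
    @abstractmethod
    def describe(self) -> str:
        ...


class Concrete(GenericLookup):
    def describe(self) -> str:
        return "concrete lookup"


description = Concrete().describe()
```

Because CombinedMeta is a subclass of both metaclasses, Python can pick it as the single most-derived metaclass, and abstract-method checking from ABCMeta still applies.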

class crate_anon.preprocess.postcodes.GenericLookupClassType(**kwargs)[source]

Type hint for our various simple lookup classes.

Alternatives that don’t work: Type[Base], Type[BASETYPE], type(Base).

class crate_anon.preprocess.postcodes.IMDLookupEN(**kwargs)[source]

Represents the Index of Multiple Deprivation (IMD), England 2015.

This is quite an important one to us! IMDs are mapped to LSOAs; see e.g. LSOA2011.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.IMDLookupSC(**kwargs)[source]

Represents the Index of Multiple Deprivation (IMD), Scotland 2016.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.IMDLookupWA(**kwargs)[source]

Represents the Index of Multiple Deprivation (IMD), Wales 2014.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.LAD(**kwargs)[source]

Represents local authority districts (LADs), UK 2019.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.LAU(**kwargs)[source]

Represents European Union Local Administrative Units (LAUs), UK 2019.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.LEP(**kwargs)[source]

Represents Local Enterprise Partnerships (LEPs), England 2017.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.LSOA2011(**kwargs)[source]

Represents lower layer super output areas (LSOAs), UK 2011.

This is quite an important one. LSOAs map to IMDs; see IMDLookupEN.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.MSOA2011(**kwargs)[source]

Represents middle layer super output areas (MSOAs), UK 2011.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.NationalPark(**kwargs)[source]

Represents national parks, Great Britain 2016.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.OAClassification(**kwargs)[source]

Represents 2011 Census Output Area (OA) classification names/codes.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.PCT2019(**kwargs)[source]

Represents Primary Care Trust (PCT) organizations, UK 2019.

The forerunner of CCGs (q.v.).

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.PFA(**kwargs)[source]

Represents police force areas (PFAs), Great Britain 2015.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.Parish(**kwargs)[source]

Represents parishes, England & Wales 2014.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.Postcode(**kwargs)[source]

Maps individual postcodes to… lots of things. Large table.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.SSR(**kwargs)[source]

Represents Standard Statistical Regions (SSRs), UK 2005.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.TTWA(**kwargs)[source]

Represents travel-to-work areas (TTWAs), UK 2011.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.Ward2019(**kwargs)[source]

Represents electoral wards, UK 2016.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

class crate_anon.preprocess.postcodes.WestminsterConstituency(**kwargs)[source]

Represents Westminster parliamentary constituencies, UK 2014.

__init__(**kwargs)

A simple constructor that allows initialization from kwargs.

Sets attributes on the constructed instance using the names and values in kwargs.

Only keys that are present as attributes of the instance’s class are allowed. These could be, for example, any mapped columns or relationships.

crate_anon.preprocess.postcodes.commit_and_announce(session: Session) -> None[source]

Commits an SQLAlchemy ORM session and says so.

crate_anon.preprocess.postcodes.convert_date(d: Dict[str, Any], key: str) -> None[source]

Modifies d[key], if it exists, to convert it to a datetime.datetime or None.

Parameters:
  • d – dictionary

  • key – key
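An in-place conversion of this kind might look like the sketch below. It is not the module's implementation: convert_date_sketch is a hypothetical name, the "%Y%m" format is an assumption made only for illustration, and the dictionary keys are example names.

```python
import datetime
from typing import Any, Dict


def convert_date_sketch(d: Dict[str, Any], key: str, fmt: str = "%Y%m") -> None:
    """Replace d[key] in place with a datetime.datetime, or None if it is blank.

    The "%Y%m" default is an assumption for illustration only.
    """
    if key not in d:
        return  # nothing to do if the key is absent
    value = d[key]
    d[key] = datetime.datetime.strptime(value, fmt) if value else None


# Example keys and values are made up for this sketch.
record = {"dointr": "198001", "doterm": ""}
convert_date_sketch(record, "dointr")   # becomes a datetime
convert_date_sketch(record, "doterm")   # blank becomes None
convert_date_sketch(record, "missing")  # absent key: dictionary is unchanged
```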

crate_anon.preprocess.postcodes.convert_float(d: Dict[str, Any], key: str) -> None[source]

Modifies d[key], if it exists, to convert it to a float or None.

Parameters:
  • d – dictionary

  • key – key

crate_anon.preprocess.postcodes.convert_int(d: Dict[str, Any], key: str) -> None[source]

Modifies d[key], if it exists, to convert it to an int or None.

Parameters:
  • d – dictionary

  • key – key
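The numeric converters share one shape: look up the key, and replace its value with a converted value or None. A minimal sketch follows, assuming that blank or unparseable values become None (the documentation does not say how bad input is handled); convert_in_place and the example keys are hypothetical.

```python
from typing import Any, Callable, Dict


def convert_in_place(
    d: Dict[str, Any], key: str, converter: Callable[[Any], Any]
) -> None:
    """Replace d[key], if present, with converter(d[key]) or None."""
    if key not in d:
        return
    value = d[key]
    # Treat None and blank strings as missing data.
    if value is None or (isinstance(value, str) and not value.strip()):
        d[key] = None
        return
    try:
        d[key] = converter(value)
    except (TypeError, ValueError):
        # Assumption for this sketch: unparseable input maps to None.
        d[key] = None


# Example keys and values are made up for this sketch.
row = {"easting": "545000", "lat": "52.2", "blank": "", "bad": "n/a"}
convert_in_place(row, "easting", int)
convert_in_place(row, "lat", float)
convert_in_place(row, "blank", int)
convert_in_place(row, "bad", float)
```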

crate_anon.preprocess.postcodes.main() -> None[source]

Command-line entry point. See command-line help.

crate_anon.preprocess.postcodes.populate_generic_lookup_table(sa_class: GenericLookupClassType, datadir: str, session: Session, replace: bool = False, commit: bool = True, commitevery: int = 10000) -> None[source]

Populates one of many generic lookup tables with ONSPD data.

We find the data filename from the __filename__ property of the specific class, hunting for it within datadir and its subdirectories.

The .TXT files look at first glance like tab-separated values files, but in some cases have inconsistent numbers of tabs (e.g. “2011 Census Output Area Classification Names and Codes UK.txt”). So we’ll use the .XLSX files.

If the headings parameter is passed, those headings are used. Otherwise, the first row is used for headings.

Parameters:
  • sa_class – SQLAlchemy ORM class

  • datadir – root directory of ONSPD data

  • session – SQLAlchemy ORM database session

  • replace – replace tables even if they exist? (Otherwise, skip existing tables.)

  • commit – COMMIT the session once we’ve inserted the data?

  • commitevery – if committing: commit every n rows inserted
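The step of hunting for a named file within datadir and its subdirectories could be sketched as below. This is an illustration using os.walk, not the module's own search code; find_first_sketch and the file names are hypothetical.

```python
import os
import tempfile
from typing import Optional


def find_first_sketch(filename: str, rootdir: str) -> Optional[str]:
    """Return the path of the first file called `filename` under `rootdir`,
    searching subdirectories too, or None if there is no such file."""
    for dirpath, _dirnames, filenames in os.walk(rootdir):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "Documents", "nested")
    os.makedirs(nested)
    target = os.path.join(nested, "lookup.xlsx")
    with open(target, "w"):
        pass  # create an empty placeholder file
    found_matches = find_first_sketch("lookup.xlsx", root) == target
    missing = find_first_sketch("absent.xlsx", root)
```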

crate_anon.preprocess.postcodes.populate_postcode_table(filename: str, session: Session, replace: bool = False, startswith: List[str] | None = None, reportevery: int = 1000, commit: bool = True, commitevery: int = 10000) -> None[source]

Populates the Postcode table, which is very big, from an Office for National Statistics Postcode Database (ONSPD) file that you have downloaded.

Parameters:
  • filename – CSV file to read

  • session – SQLAlchemy ORM database session

  • replace – replace tables even if they exist? (Otherwise, skip existing tables.)

  • startswith – if specified, restrict to postcodes that start with one of these strings

  • reportevery – report to the Python log every n rows

  • commit – COMMIT the session once we’ve inserted the data?

  • commitevery – if committing: commit every n rows inserted
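How startswith and commitevery interact can be shown with a small stand-alone sketch. It uses only the csv module and records where commits would happen instead of touching a database; the function names, the "pcds" column name and the sample rows are assumptions for illustration.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, Optional, TextIO


def filtered_rows(
    csv_file: TextIO,
    startswith: Optional[List[str]] = None,
    postcode_column: str = "pcds",  # assumed column name
) -> Iterator[Dict[str, str]]:
    """Yield CSV rows, optionally restricted to postcodes with given prefixes."""
    prefixes = tuple(startswith) if startswith else None
    for row in csv.DictReader(csv_file):
        if prefixes and not row[postcode_column].startswith(prefixes):
            continue
        yield row


def commit_points_for(rows: Iterable[Dict[str, str]], commitevery: int) -> List[int]:
    """Return the row counts at which a commit would be issued."""
    points = []
    n = 0
    for n, _row in enumerate(rows, start=1):
        # A real loader would add an ORM object to the session here.
        if n % commitevery == 0:
            points.append(n)
    if n and n % commitevery != 0:
        points.append(n)  # final commit for the remainder
    return points


# Made-up sample data.
sample = io.StringIO(
    "pcds,label\nCB2 0QQ,a\nAB1 0AA,b\nCB1 1AA,c\nCB3 9XX,d\n"
)
kept = list(filtered_rows(sample, startswith=["CB"]))
commit_points = commit_points_for(kept, commitevery=2)
```

Committing in batches keeps each transaction small, which matters for a table of this size.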

crate_anon.preprocess.postcodes.show_docs() -> None[source]

Print the column doc attributes from the Postcode class, in tabular form, to stdout.

crate_anon.preprocess.postcodes.values_from_row(row: Iterable[Cell]) -> List[Any][source]

Returns all values from a spreadsheet row.

For the openpyxl interface to XLSX files.
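Extracting values from a row amounts to reading each cell's value attribute. The sketch below avoids a dependency on openpyxl by using a hypothetical CellStandIn class that models only that attribute; values_from_row_sketch is likewise an illustrative name.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class CellStandIn:
    """Hypothetical stand-in for a spreadsheet cell; only .value is modelled."""
    value: Any


def values_from_row_sketch(row: Iterable[CellStandIn]) -> List[Any]:
    """Return the value of every cell in the row, in order."""
    return [cell.value for cell in row]


# Empty cells carry None, which is preserved in the output.
header = values_from_row_sketch(
    [CellStandIn("code"), CellStandIn("name"), CellStandIn(None)]
)
```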