james-westwood/README.md

Senior Data Scientist/Engineer building production reproducible analytical pipelines (RAP) and ML/GenAI systems at the Food Standards Agency. Previously led open-source data engineering at the Office for National Statistics.

Current Work (Private Repos)

  • LangChain Agent - Intelligent data standardization for 360+ local authority data sources with extreme format variance
  • NLP Classification - DistilBERT transformer model (82% accuracy, 240-class classification) on Azure
  • Platform Engineering - Migrating enterprise data to Databricks Medallion architecture (Azure/Databricks)
  • ML Production Systems - Full lifecycle deployment, monitoring, MLOps best practices

Tech Stack

ML/GenAI: LangChain • Transformers (DistilBERT) • FastText • Scikit-learn • PyTorch • MLflow
Data Engineering: Databricks • PySpark • Apache Spark • Python • SQL
Cloud & DevOps: Azure • GCP • GitHub Actions • CI/CD
Databases: PostgreSQL • BigQuery • DuckDB

Featured Open Source Work

Unfortunately, the vast majority of my work is closed-source, but I pioneered an open-from-the-start approach that adheres to government guidelines.

Production system calculating UK national accounts R&D expenditure statistics. Pioneered an open-source approach within ONS, establishing a pattern for government data science transparency.

Scale & Impact:

  • 4,680+ commits across 20 contributors
  • 94 production releases serving national statistics
  • Comprehensive CI/CD pipeline, testing (61% coverage), and excellent documentation for technical and non-technical users
  • Set precedent for open-source government data projects at ONS

Role: Technical lead and open-source advocate - negotiated stakeholder approval for public release from project inception.

🎯 Professional Focus

Sustainability • Food Systems • Production ML systems • Data engineering best practices • Practical AI applications • Open government software

Popular repositories

  1. julia-algorithms Public

    Julia 1

  2. optimal_laundry Public

    A program to check if weather conditions are good enough to dry your laundry outdoors

    Python

  3. govt_pesticide_test_data_downloader Public

    Jupyter Notebook 4

  4. sdg-csv-data-filler Public

    Forked from ONSdigital/sdg-csv-data-filler

    Jupyter Notebook

  5. labrpt Public

    Generate reports from lab data in CSV format.

    Python

  6. tensor_flow Public

    TensorFlow sections from the FCC ML course

    Jupyter Notebook