Commit a860501

Working on library cleaning dataset post
1 parent 779c037 commit a860501

2 files changed

+111
-35
lines changed

_posts/2024-10-28-public-library-websites.md

Lines changed: 20 additions & 35 deletions
@@ -8,64 +8,49 @@ tags:
88
- Design
99
published: false
1010
---
11-
I first started working on local government websites in 2010, at Reading Borough Council. Whilst there, the crumbling library website content got pulled back in to the main Council site. Not the online catalogue - that still tends to be be separate. But the pages that provide information about fines and charges, opening hours, events, etc.
11+
I first started working on local government websites in 2011, at a Borough Council. Whilst there, the content from the crumbling library website got pulled back in to the main Council site. Not the online catalogue - that still tends to be separate. But the pages providing information about fines and charges, opening hours, events, etc.
1212

13-
It's a rotating trend in local government to have library content as a separate entity, and then part of the Council site, then back again. The process goes something like:
13+
It's a rotating trend in local government to have library content as a separate site, and then part of the Council site, then back again. The process goes something like:
1414

15-
- Library service is disenchanted with the Council site
16-
- Library service gets granted bit of money and spins out a new site
17-
- The library website stagnates and becomes a danger to itself and others
15+
- Library service is disenchanted with the Council site and would like more control
16+
- Library service gets granted some money and spins out a new site
17+
- The library website then stagnates, goes out of date, and becomes a danger to itself and others
1818
- Library service content gets pulled back in to the Council site
1919

2020
And round again. It's a simplification, but it's not far off.
2121

22-
I'm fortunate to have worked closely with a few local government website content and development teams, alongside many others in the private sector. It's easy to say that as a sector, government are unsurpassed in capability. You only need to look at the [GOV.UK design system](https://design-system.service.gov.uk/) to see that. But not just design: content, accessibility, security, performance, user research, the service design, APIs, standards, etc.
22+
Why, firstly, do libraries feel the need for a site outside of the Council site? Being able to control the content is one reason. Centrally managed websites don't allow just anyone to edit content. That could mean that updates to information, news stories, etc. need to go via a central content team, introducing friction to the process. The attitude towards council digital services generally seems to be a negative one: reports like Libraries Connected's **Libraries in Lockdown: Connecting communities in crisis** are full of examples of dynamic library services being hindered by their corporate organisations.
2323

24-
The question is then I suppose, why are library services so frequently. Maybe it's just that natural desire . Maybe that library
24+
> Councils need to remove barriers which prevent libraries from delivering a high-quality digital offer including corporate limitations on web platforms and use of social media
25+
> **Libraries in Lockdown: Connecting communities in crisis**
2526
26-
I'd argue that one aspect though is library services simply aren't informed enough about what it takes to create and maintain a website, the things. And uninformed organisations are often disatisfied by the service they receive, especially when they don't quite understand what it is.
27+
You will struggle to find reports highlighting the benefit that libraries get from being part of organisations that have professional digital services.
2728

29+
It's easy to be in two minds about this. I'm fortunate to have worked closely with a few local government website content and development teams, as well as having experience in the private sector. I can easily say that as a sector, government web teams are unsurpassed in capability. You only need to look at the [GOV.UK design system](https://design-system.service.gov.uk/) to see that. And not just design: content, accessibility, security, hosting, performance, user research, service design, APIs, standards, etc. I'm sure there are differences in quality between Councils, but being part of a Council website should be something where everyone recognises and appreciates the benefit.
2830

31+
Conversely, you can't really say the same about library services. They are exceptional at providing library services, but not websites. So it's fair to view the idea of creating a new library website with some reservations.
2932

30-
Do I have any evidence for this? Not much to hand, but I quickly looked up a report on li
33+
But why would a data blog primarily concerned with open data and creating digital prototypes be stifling the creativity of those services who do want to do their own thing? It's good for public library services to want to both have their own websites, and to make those sites an experiential place for users to go. But getting the basics right (that old 'transactional' stuff) is also really hard, and it is the most important part. Public websites are a lot of work, a lot of expertise, and a lot of money.
3134

32-
It's good for public library services to want to both have their own websites, and to make it an experiential place for users to go. But getting the basics right (those old 'transactional' stuff) is also really hard and the most important. Public websites are a lot of work, a lot of expertise, and a lot of money.
3335

34-
My last public sector web role was at Bristol City Council, in the digital transformation team, which included a few teams essentially creating and maintaining the website content, day to day running, and digital developments. I don't know the wage bill for the entire team, or the infrastructure and licensing costs, but it would probably be in the region of £1m per year. That's a lot of money, but it's a lot of website and a lot of expertise.
3536

36-
CHecklist for a new website:
37-
38-
- Accessibility. You will need to do an original audit of the website, and then regular checks. For a library website that includes content as well as interactive experiences and possibly catalogue integration, you should budget in the region of £10,000. You will also need someone who is able to handle ongoing accessibility queries and be able to make technical updates to the website in response to these.
3937

4038
Does your library website:
4139

40+
- have a contact email for accessibility queries?
4241
- have a regular penetration test?
4342
- have a regular accessibility audit?
4443
- receive regular software updates?
45-
- get security patches applied?
46-
- have a support contract?
47-
48-
I'm slightly sceptical of library services that want to go it alone with their own website. It doesn't mean discouraging it, but again, be aware of the investment and skills required.
49-
50-
There is no right or wrong model of digital autonomy for library services, but it's important to understand all the implications. How many people does a local Council employ who are experts on information governance for example? Maybe a small team of 2 or 3. Important expertise to have in a local authority, but stretched to have any true oversight of individual services.
51-
52-
With a spun-out librry service how many people are there? In reality none. However, the person tasked with information governance for the service potentially has
53-
54-
55-
56-
- Does is have an accessibility statement? This should be a relatively easy check. An accessibility statement is a legal requirement and should be
57-
- How efficient is the website? For this I'll check the rating given by the
58-
59-
### Leeds libraries
60-
61-
62-
63-
### London libraries
44+
- have a support contract and agreed uptime?
45+
- receive regular platform security patches?
46+
- form part of the council's security monitoring for new vulnerabilities?
47+
- have content review processes?
6448

6549

50+
A lot of these things are the reason Council websites can seem like big 'corporate'
6651

67-
### Gloucestershire libraries
6852

53+
Money
6954

7055

71-
This is a blog post without any definitive answers or conclusions. But there are gaps in current thinking around library websites that need to be adresssed.
56+
My last public sector web role was at Bristol City Council, in the digital transformation team, which included a few teams essentially creating and maintaining the website content, day-to-day running, and digital developments. I don't know the wage bill for the entire team, or the infrastructure and licensing costs, but it would probably be in the region of £1m per year. That's a lot of money, but it's a lot of website and a lot of expertise.
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
---
2+
title: Cleaning the basic libraries dataset
3+
excerpt: Enhancing data on library locations
4+
categories:
5+
- Data
6+
tags:
7+
- Data
8+
published: false
9+
---
10+
A recent public library dataset that has been getting some. This dataset:
11+
12+
-
13+
- Is used, at least in checking for updated data, in the [LibraryOn library finder]
14+
15+
It really is the most basic of basic data - the locations of our libraries - but it has been a challenge. In around 2012
16+
17+
This dataset started in 2014. The newly formed DCMS Libraries Taskforce found they were visiting libraries around the country and documenting their travels ([and a photo collection]()). As there wasn't much data to begin with, this seemed like the beginning of a dataset of libraries.
18+
19+
Fast forward to 2016 and I spent an afternoon in a meeting room at the DCMS discussing the fields that would formally be collected in a spreadsheet.
20+
21+
22+
23+
24+
- Trimmed whitespace either end of all data entries
25+
- Corrected mismatches between the 'Reporting Service' and 'Upper Tier Local Authority'. On a few occasions these are legitimately different, but generally not.
26+
- Suffolk reported that the Prison Library HMP Bure was in Norwich upper tier local authority. The authority should be Norfolk, but it is correct that Suffolk libraries operate the prison library, and are therefore the reporting service.
27+
    - Westminster reported that Paddington Children's Library is in London UTLA - this should just have been set to Westminster.
28+
- Standardised 10 of the names used in the 'Reporting service' column to more easily match them to unique identifiers
29+
- Standardised 10 of the names used in the 'Upper tier local authority' column to more easily match them to unique identifiers
30+
- Cleared non-postcode text from the postcode column e.g. 'No registered public address'
31+
- Ensured the closed field has an entry for libraries that have otherwise been set to closed
32+
- Updated postcode entries to be uppercase
33+
- Updated invalid postcodes from closed libraries
34+
- Updated invalid postcodes from open libraries
35+
- Updated valid but incorrect postcodes
36+
- Removed the leading zeros from unique property reference numbers (UPRNs). UPRNs are commonly stored with leading zeros, but they are not necessary.
37+
- Removed UPRNs that are not numbers
38+
- Removed UPRNs that are over 5 miles away from the postcode location (and very likely wrong)
39+
- Standardised the Type column to go from 10 to 5 distinct variations
40+
- Removed entries that didn't seem to fit with any definitions
41+
- Ensured statutory fields are Yes or No
42+
- Ensured closed year is set for entries that have closed in the operation field
43+
- Ensured operation fields are one of 'LA', 'LAU', 'C', 'CR', 'ICL' or empty
44+
- Ensured that if the closed year was completed it was a 4-digit year
45+
- Cleared unnecessary text from the operating organisation column (e.g. 'N/A')
46+
- Standardised the 'No' entry for the new build question
47+
- Standardised the 'No' entry for the co-located question
48+
- Ensured that the indicator under each co-located column is only ever set to 'X'
49+
- Standardised the opening times fields to only the 19 possible entries as documented in the ACE guidance
50+
- Ensured the hours and staffed hours fields are numeric only
51+
- Standardised the 'No' entry for the automated system question
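
A few of the steps in the list above lend themselves to small, repeatable functions. The following is a minimal sketch in Python rather than the code actually used for this dataset: the function names are hypothetical, the postcode pattern is a loose shape check that I'm assuming is good enough for illustration (it does not confirm a postcode exists), and the operation codes are the ones listed above.

```python
import re

# Loose UK postcode shape check (assumption for illustration only):
# outward code, optional space, inward code. It does not prove the
# postcode is real, only that it looks like one.
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")

# Operation codes taken from the cleaning list above.
OPERATION_CODES = {"LA", "LAU", "C", "CR", "ICL"}


def clean_postcode(value: str) -> str:
    """Trim and uppercase; clear text that is not postcode-shaped."""
    candidate = value.strip().upper()
    return candidate if POSTCODE_PATTERN.match(candidate) else ""


def clean_uprn(value: str) -> str:
    """Clear UPRNs that are not numbers; strip leading zeros from the rest."""
    candidate = value.strip()
    if not re.fullmatch(r"[0-9]+", candidate):
        return ""
    return candidate.lstrip("0")


def clean_closed_year(value: str) -> str:
    """Keep the closed year only if it is a 4-digit year."""
    candidate = value.strip()
    return candidate if re.fullmatch(r"[0-9]{4}", candidate) else ""


def clean_operation(value: str) -> str:
    """Keep the operation field only if it is one of the known codes."""
    candidate = value.strip().upper()
    return candidate if candidate in OPERATION_CODES else ""
```

Each function returns an empty string when the value can't be trusted, which mirrors the 'cleared' and 'removed' wording in the list. Anything that needs a human decision, such as a valid but incorrect postcode, is deliberately left out of the sketch.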
52+
53+
54+
### Coordinates
55+
56+
There are also a few areas where
57+
58+
There are no coordinates in the data - although there are address fields, postcodes, and unique property reference numbers.
59+
60+
There are two open data sources, from the Office for National Statistics and Ordnance Survey, that can help here:
61+
62+
- [ONS Postcode Directory]()
63+
- Open UPRN - The exact coordinates for
64+
65+
I have added 4 columns
66+
67+
| Column name | Description |
68+
| ----------- | ----------- |
69+
| Easting | |
70+
| Northing | |
71+
| Longitude | |
72+
| Latitude | |
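
As a rough illustration of how the four columns could be filled in by postcode lookup, here is a minimal Python sketch. It is not the process used for this dataset. The lookup column names (`postcode`, `easting`, `northing`, `longitude`, `latitude`) and the `Postcode` field on the library row are assumptions for the example; the real ONS Postcode Directory uses its own field names, so check its user guide before adapting this.

```python
def normalise_postcode(value):
    """Remove spaces and uppercase so 'bs1 5tr' and 'BS15TR' compare equal."""
    return value.replace(" ", "").upper()


def build_postcode_lookup(rows):
    """Index postcode directory rows by normalised postcode.

    Each row is a dict with the hypothetical keys 'postcode', 'easting',
    'northing', 'longitude' and 'latitude'.
    """
    lookup = {}
    for row in rows:
        lookup[normalise_postcode(row["postcode"])] = (
            row["easting"],
            row["northing"],
            row["longitude"],
            row["latitude"],
        )
    return lookup


def add_coordinates(library, lookup):
    """Return a copy of the library row with the four coordinate columns added.

    Columns are left empty when the postcode is missing or not in the lookup.
    """
    key = normalise_postcode(library.get("Postcode", ""))
    easting, northing, longitude, latitude = lookup.get(key, ("", "", "", ""))
    return {
        **library,
        "Easting": easting,
        "Northing": northing,
        "Longitude": longitude,
        "Latitude": latitude,
    }
```

Leaving the columns empty for unmatched postcodes, rather than guessing, keeps it obvious which libraries still need a location.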
73+
74+
This does change the data attribution statement. The licence remains Open (the Open Government Licence), but requires the two attribution statements:
75+
76+
77+
78+
79+
### Geographic Intelligence
80+
81+
This sounds a bit fancy but really 'interesting stuff about the area' is about it. Having a properly defined location for things gives us so much additional information: the population of the area, how rural/urban it is, deprivation levels, etc. There is too much to include in one dataset but a few key ones would be useful. I've added the following:
82+
83+
84+
| Column | Description |
85+
| ------ | ----------- |
86+
| | |
87+
88+
89+
Note - these are directly taken from the [ONS Postcode Directory](https://geoportal.statistics.gov.uk/datasets/265778cd85754b7e97f404a1c63aea04) by simple postcode lookup. Because they are postcodes and inexact locations, these are 'best-fit' lookups. Using the UPRN coordinates would be more accurate but I couldn't really be bothered. Plus we don't have half the UPRNs anyway.
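
Having both a postcode location and a UPRN location is what makes the earlier '5 miles away' check possible. A minimal sketch of that comparison in Python, assuming both locations are available as British National Grid eastings and northings in metres (so a straight-line distance is reasonable); the function names and the tuple layout are hypothetical:

```python
import math

METRES_PER_MILE = 1609.344


def distance_miles(easting_a, northing_a, easting_b, northing_b):
    """Straight-line distance in miles between two grid references in metres."""
    metres = math.hypot(easting_a - easting_b, northing_a - northing_b)
    return metres / METRES_PER_MILE


def uprn_is_plausible(uprn_coords, postcode_coords, max_miles=5.0):
    """True if the UPRN location is within max_miles of the postcode location.

    Both arguments are (easting, northing) tuples. The 5 mile default is
    the threshold mentioned in the cleaning steps.
    """
    return distance_miles(*uprn_coords, *postcode_coords) <= max_miles
```

A UPRN that fails the check is not necessarily wrong (the postcode could be the mistake), so treating it as a flag for review is the safer reading.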
90+
91+
Enjoy!
