Commit 064d82b

freyfogle and claude committed
Reformat README.md for improved readability
- Fix heading hierarchy (h1 title, h2 sections, h3 subsections)
- Convert implementations list and coverage stats to tables
- Use fenced code blocks with syntax highlighting
- Reorganize content into logical sections
- Add alt text to logo image

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 2a99b24 commit 064d82b

1 file changed: +115 -114 lines changed

README.md

Lines changed: 115 additions & 114 deletions
@@ -1,173 +1,174 @@
-# address formatting
-
-### Overview
-
-This project contains templates and test cases for address formats used in territories around the world. The templates can then be processed in any programming language ([see below for list of processors](#processing-logic)).
-
-### Build Status
+# Address Formatting
 
 [![Build Status](https://github.com/OpenCageData/address-formatting/actions/workflows/ci.yml/badge.svg)](https://github.com/OpenCageData/address-formatting/actions/workflows/ci.yml)
 
-### An example:
+Templates and test cases for address formats used in territories around the world. The templates can be processed in any programming language ([see list of processors](#processing-logic)).
 
-Given a set of address parts like
+## Example
 
-house_number: 17
-road: Rue du Médecin-Colonel Calbairac
-neighbourhood: Lafourguette
-suburb: Toulouse Ouest
-postcode: 31000
-city: Toulouse
-county: Toulouse
-state: Midi-Pyrénées
-country: France
-country_code: FR
+Given a set of address parts:
 
-we want to write logic to compile an address in the format consumers expect
+```yaml
+house_number: 17
+road: Rue du Médecin-Colonel Calbairac
+neighbourhood: Lafourguette
+suburb: Toulouse Ouest
+postcode: 31000
+city: Toulouse
+county: Toulouse
+state: Midi-Pyrénées
+country: France
+country_code: FR
+```
 
-17 Rue du Médecin-Colonel Calbairac
-31000 Toulouse
-France
+We want to compile an address in the format consumers expect:
 
-### Why would you want to do this?
+```
+17 Rue du Médecin-Colonel Calbairac
+31000 Toulouse
+France
+```
 
-The intended use case is database or geocoding systems (forward, reverse, autocomplete) where we know both the country of the address and the language of the user/reader. The address is displayed to a consumer (for example in an app) and not used to print on an envelope for actual postal delivery. We use it to format output from the [OpenCage Geocoding API](https://opencagedata.com/api).
+## Why Use This?
 
-### Which addresses are we talking about?
+The intended use case is database or geocoding systems (forward, reverse, autocomplete) where we know both the country of the address and the language of the user/reader. The address is displayed to a consumer (for example in an app) and not used to print on an envelope for actual postal delivery. We use it to format output from the [OpenCage Geocoding API](https://opencagedata.com/api).
 
-We have to deal with
+## Scope
 
-* incomplete data
-* anything with a name (peaks, bridges, bus stops)
+**What we handle:**
+- Incomplete data
+- Anything with a name (peaks, bridges, bus stops)
 
-Unlike [physical post (office) mail](http://www.bitboost.com/ref/international-address-formats.html) we don't have to deal with
+**What we don't handle** (unlike [physical postal mail](http://www.bitboost.com/ref/international-address-formats.html)):
+- Apartment/flat numbers, floor numbers
+- PO boxes
+- Translating the destination address language (whatever language is input is output)
 
-* apartment/flat number, floor numbers
-* PO boxes
-* translating the language of the (destination) address. Whatever language is input is output.
-
-### Processing logic
+## Processing Logic
 
-Our goal with this repository is a series of (programming) language independent templates. Those templates can then be processed by whatever software you like.
+Our goal is a series of programming language-independent templates that can be processed by whatever software you like.
 
-There are open-source implementations in
+### Open-Source Implementations
 
-* [Android library](https://github.com/woheller69/AndroidAddressFormatter)
-* [Elixir](https://github.com/dkuku/ex_address_formatting)
-* [Go](https://github.com/timonmasberg/address-formatter)
-* [Java](https://github.com/placemarkt/address-formatter-java)
-* [Javascript](https://github.com/fragaria/address-formatter)
-* [Kotlin](https://github.com/bettermile/address-formatter-kotlin)
-* [Perl](https://metacpan.org/release/Geo-Address-Formatter)
-* [PHP](https://github.com/predicthq/address-formatter-php)
-* [PowerShell](https://github.com/GruberMarkus/AddressFormatter) cross-platform
-* [Python (no longer maintained)](https://github.com/pudo/addressformatting/tree/master)
-* [Ruby](https://github.com/mirubiri/address_composer)
-* [Rust (no longer maintained)](https://github.com/antoine-de/address-formatter-rs)
-* [Scala](https://github.com/ben-willis/address-formatter)
+| Language | Repository | Notes |
+|----------|------------|-------|
+| Android | [AndroidAddressFormatter](https://github.com/woheller69/AndroidAddressFormatter) | |
+| Elixir | [ex_address_formatting](https://github.com/dkuku/ex_address_formatting) | |
+| Go | [address-formatter](https://github.com/timonmasberg/address-formatter) | |
+| Java | [address-formatter-java](https://github.com/placemarkt/address-formatter-java) | |
+| JavaScript | [address-formatter](https://github.com/fragaria/address-formatter) | |
+| Kotlin | [address-formatter-kotlin](https://github.com/bettermile/address-formatter-kotlin) | |
+| Perl | [Geo-Address-Formatter](https://metacpan.org/release/Geo-Address-Formatter) | |
+| PHP | [address-formatter-php](https://github.com/predicthq/address-formatter-php) | |
+| PowerShell | [AddressFormatter](https://github.com/GruberMarkus/AddressFormatter) | Cross-platform |
+| Python | [addressformatting](https://github.com/pudo/addressformatting/tree/master) | No longer maintained |
+| Ruby | [address_composer](https://github.com/mirubiri/address_composer) | |
+| Rust | [address-formatter-rs](https://github.com/antoine-de/address-formatter-rs) | No longer maintained |
+| Scala | [address-formatter](https://github.com/ben-willis/address-formatter) | |
 
-We would love more language implementations. The more people who use the templates, the more likely bugs will be reported.
+We welcome more language implementations. The more people who use the templates, the more likely bugs will be reported.
 
-If you write a processor, please submit a pull request adding your processor to the list.
+**If you write a processor**, please submit a pull request adding it to the list. Include this repo as a [git submodule](https://git-scm.com/book/en/v2/Git-Tools-Submodules) so we all use the same templates/configuration and stay in sync. See [how we do it in the Perl parser](https://github.com/OpenCageData/perl-Geo-Address-Formatter/blob/master/README.md#installation) for an example.
 
-One key point: please include this repo as a [git submodule](https://git-scm.com/book/en/v2/Git-Tools-Submodules), so we all use the same templates/configuration and don't get out of sync. if you are unfamiliar with git submodules, please have a look at [how we do it in the Perl parser](https://github.com/OpenCageData/perl-Geo-Address-Formatter/blob/master/README.md#installation).
+## International Coverage
 
-Thanks!
+As of March 2024:
 
-### International coverage
+| Metric | Count |
+|--------|-------|
+| Known territories | 251 |
+| Territories with tests | 251 (100%) |
+| Territories with rules | 251 (100%) |
+| Territories without rules or tests | 0 (0%) |
 
-As of March 2024 coverage is:
+This output is generated by `bin/coverage.pl`. Run `bin/coverage.pl -d` for a detailed breakdown.
 
-We are aware of 251 territories
-We have at least one test for 251 (100%) territories
-We have rules for 251 (100%) territories
-0 (0%) territories have neither rules nor tests
-
-This output is generated by `bin/coverage.pl`
+The list of all known territories is in `conf/country_codes.yaml`.
 
-We need more language specific abbreviations. Please see `conf/abbreviations`. Pull requests gladly received.
+> **Note:** The list contains all officially assigned [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Officially_assigned_code_elements). This is not a political statement about the status of any territory.
 
-A detailed breakdown of test and configuration coverage can be found by running `bin/coverage.pl -d`. A list of all known territories is in `conf/country_codes.yaml`
+**We need more language-specific abbreviations.** See `conf/abbreviations`. Pull requests welcome!
 
-_Please note: the list is simple all officially assigned [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Officially_assigned_code_elements), and is not a political statement on whether or not these territories are or are not or should or should not be political states._
+## File Format
 
-### File format
+- **Configuration:** [YAML](http://yaml.org/) format
+- **Templates:** [Mustache](http://mustache.github.io/) with one variation: `{#first}` sections take the first alternative for which a variable could be interpolated
 
-The files are in [YAML](http://yaml.org/) format. The templates are written in [Mustache](http://mustache.github.io/) with a minor variation: the `{#first}` sections will take the first alternative for which a variable could be interpolated. Both formats are human readable, strict, solve escaping and support comments. YAML allows references (called "ankers") to avoid copy&paste, Mustache allows sub-templates (called "partials").
+Both formats are human-readable, strict, handle escaping, and support comments. YAML allows references ("anchors") to avoid duplication; Mustache allows sub-templates ("partials").
 
-### How to add your country/territory
+## How to Add Your Country/Territory
 
-1. edit the .yaml testcase for the country/territory in `testcases/countries`. The file names correspond to the appropriate ISO 3166-1 alpha-2 code - see `conf/country_codes.yaml`
-* a good way to get sample data is:
-* find an addressed location (house, business, etc) in your
-target territory in OpenStreetMap
-* get the coordinates (lat, long) of the location
-* put the coordinates into the [OpenCage Geocoding API demo page](https://opencagedata.com/demo)
-* look at the resulting JSON in the *Raw Response* tab
+### Step 1: Create Test Cases
 
-2. edit `conf/countries/worldwide.yaml`
-* Possibly your country/territory uses an existing generic format as
-defined at the top of the file. If so, great! Just map your
-country_code to the generic template. You may still want to add
-clean up code (see the entry for `DE` as an example).
-* If not, you need to define a new rule set (may or may not be generic)
-* You may also need to define new state/region mappings in `conf/state_codes.yaml`
+Edit the `.yaml` testcase for the country/territory in `testcases/countries`. File names correspond to [ISO 3166-1 alpha-2 codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) (see `conf/country_codes.yaml`).
 
-3. to test you will now need to process the .yaml test via a processor
-(see above) and ensure the input leads to the desired output.
-We also run these checks automatically against pull requests to ensure against regressions.
+**To get sample data:**
+1. Find an addressed location (house, business, etc.) in your target territory on [OpenStreetMap](https://www.openstreetmap.org)
+2. Get the coordinates (lat, long)
+3. Enter the coordinates into the [OpenCage Geocoding API demo](https://opencagedata.com/demo)
+4. Check the resulting JSON in the *Raw Response* tab
 
-If in doubt, please get in touch by submitting an issue.
+### Step 2: Define Formatting Rules
 
-### Formatting rules
+Edit `conf/countries/worldwide.yaml`:
 
-Currently we support the following formatting rules:
+- **If your territory uses an existing generic format** (defined at the top of the file): map your `country_code` to the generic template. You may still want to add cleanup code (see the `DE` entry as an example).
+- **If not**: define a new rule set (which may or may not be generic). You may also need to define new state/region mappings in `conf/state_codes.yaml`.
 
-* `replace:` regex that operates on the input values, useful for removing bureaucratic cruft like "London Borough of ". Note if you define the regex starting with format _X=_, for example _city=_ it should operate only on values with that key
-* `postformat_replace:` regex that operates on the final output
-* `add_component:` with a value of the form `component=XXXX`
-* `change_country:` change the country value of the input, useful for dependent territories. Can include a substitution like `$state` so that that component value is then inserted into the new country value. See `testcases/countries/sh.yaml` for an example.
-* `use_country:` use the formating configuration of another country, useful for dependent territories to avoid duplicating configuration
+### Step 3: Test
 
-### The future
+Process the `.yaml` test via a processor (see above) and verify the input produces the desired output. We run these checks automatically against pull requests to prevent regressions.
 
-More tests! For every rule about addresses there are exceptions and edge cases to consider. More test cases are always needed.
+**Questions?** Submit an issue.
 
-Planned features:
+## Formatting Rules
 
-* basic error checking, for example ignore things which obviously can not be postcodes
-* define rules for postcode format specifically
+| Rule | Description |
+|------|-------------|
+| `replace:` | Regex operating on input values. Useful for removing bureaucratic cruft like "London Borough of". Prefix with `key=` (e.g., `city=`) to operate only on that key. |
+| `postformat_replace:` | Regex operating on the final output. |
+| `add_component:` | Add a component with format `component=XXXX`. |
+| `change_country:` | Change the country value of the input. Useful for dependent territories. Supports substitutions like `$state`. See `testcases/countries/sh.yaml` for an example. |
+| `use_country:` | Use the formatting configuration of another country. Useful for dependent territories to avoid duplicating configuration. |
 
-We welcome your pull requests. Together we can address the world!
+## Roadmap
 
-### License
+More tests are always needed. For every rule about addresses there are exceptions and edge cases.
 
-This project is licensed under the MIT License - see the [LICENSE.txt](LICENSE.txt) file for details
+**Planned features:**
+- Basic error checking (e.g., ignore values that obviously cannot be postcodes)
+- Rules for postcode format validation
 
-### Additional resources
+We welcome your pull requests. Together we can address the world!
 
-If you are working with addresses you may need [lists of random addresses/postcodes/coordinates](https://opencagedata.com/tools/address-lists) (either in general or for specific countries) for testing.
+## License
 
-### Further reading on the challenge of address
+MIT License - see [LICENSE.txt](LICENSE.txt) for details.
 
-Here's [our blog post anouncing this project](https://blog.opencagedata.com/post/99059889253/good-looking-addresses-solving-the-berlin-berlin) and the motivations behind it.
+## Resources
 
-You may enjoy Michael Tandy's [Falsehoods Programmers Believe about Addresses](http://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/).
+### Testing Data
 
-If it's actual address data you're after, check out [OpenStreetMap](https://www.openstreetmap.org) and [OpenAddresses](http://openaddresses.io/).
+[Lists of random addresses/postcodes/coordinates](https://opencagedata.com/tools/address-lists) for testing (general or country-specific).
 
-If you want to turn longitude, latitude into well formatted addresses or placenames, well that's what a geocoder does. Check out ours: [OpenCage Geocoder](https://opencagedata.com).
+### Further Reading
 
-If all this convinces you that address are evil, please check out [what3words](http://what3words.com) which allows you to dispense with them entirely.
+- [Our blog post announcing this project](https://blog.opencagedata.com/post/99059889253/good-looking-addresses-solving-the-berlin-berlin) and the motivations behind it
+- [Falsehoods Programmers Believe about Addresses](http://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/) by Michael Tandy
 
-### Who is OpenCage GmbH?
+### Related Projects
 
-<a href="https://opencagedata.com"><img src="opencage_logo_300_150.png"></a>
+- [OpenStreetMap](https://www.openstreetmap.org) - Open address data
+- [OpenAddresses](http://openaddresses.io/) - Open address data
+- [OpenCage Geocoder](https://opencagedata.com) - Convert coordinates to formatted addresses
+- [what3words](http://what3words.com) - An alternative to traditional addresses
 
-We run a worldwide [geocoding API](https://opencagedata.com/api) and [geosearch](https://opencagedata.com/geosearch) service based on open data.
-Learn more [about us](https://opencagedata.com/about).
+---
 
-We also organize [Geomob](https://thegeomob.com), a series of regular meetups for location based service creators, where we do our best to highlight geoinnovation. If you like geo stuff, you will probably enjoy [the Geomob podcast](https://thegeomob.com/podcast/).
+## About OpenCage GmbH
 
+<a href="https://opencagedata.com"><img src="opencage_logo_300_150.png" alt="OpenCage logo"></a>
 
+We run a worldwide [geocoding API](https://opencagedata.com/api) and [geosearch](https://opencagedata.com/geosearch) service based on open data. [Learn more about us](https://opencagedata.com/about).
 
+We also organize [Geomob](https://thegeomob.com), a series of regular meetups for location-based service creators. If you like geo stuff, check out [the Geomob podcast](https://thegeomob.com/podcast/).
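
The README's description of the `replace:` rule (a regex applied to input values, restricted to one key when written with a `key=` prefix such as `city=`) can be illustrated with a minimal Python sketch. It is not taken from any of the listed processors: the function name `apply_replace`, the `(pattern, replacement)` pair layout, and the way a `key=` prefix is detected are all assumptions made for this example.

```python
import re

# Assumption: a rule is "keyed" when it starts with a word followed by "=".
KEYED_RULE = re.compile(r"^(\w+)=(.*)$", re.DOTALL)


def apply_replace(components, rules):
    """Apply (pattern, replacement) pairs to address component values.

    Hypothetical helper: a pattern written as "key=regex" is applied only
    to the component named `key`; any other pattern is applied to every
    value. Values are converted to strings before matching.
    """
    result = {name: str(value) for name, value in components.items()}
    for pattern, replacement in rules:
        keyed = KEYED_RULE.match(pattern)
        if keyed:
            key, regex = keyed.groups()
            if key in result:
                result[key] = re.sub(regex, replacement, result[key])
        else:
            for name in list(result):
                result[name] = re.sub(pattern, replacement, result[name])
    return result


components = {
    "city": "London Borough of Camden",
    "state": "London Borough of Camden",
}
# The `city=` prefix limits the rule to the `city` value; `state` is untouched.
cleaned = apply_replace(components, [("city=^London Borough of ", "")])
print(cleaned)
```

A real processor may parse the rule differently (for example, a regex that itself begins with `word=` would be misread here), so treat the prefix detection as a simplification.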
