Libpostal: international street address parsing in C trained on OpenStreetMap

  • An amazing effort, many congrats to all involved.

    I work on the address-formatting project, one small piece of the many used here. We currently have formatting rules for 93% of the world's 249 territories (as defined by ISO 3166-1 alpha-2 codes), but we need help finishing the rest, especially from native speakers and people with local knowledge. Even for the countries we've "finished", more tests are always useful.

    Here's the repo if you'd like to get involved: https://github.com/OpenCageData/address-formatting

    Here's a post I wrote a week ago on the regions we need help with, though since then we've started making good progress on Arabic-speaking countries: http://blog.opencagedata.com/post/138991962708/an-update-on-...

    Feel free to ping me if you'd like to get involved. Thanks.

  • Another open training dataset: https://openaddresses.io/