Skip to content

Use numbered administrative level instead of named one #398

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Feb 13, 2015

Conversation

giosh94mhz
Copy link
Contributor

PR for for ticket #392. To sum up: replace every occurrence of region and county (no "r"), with a generic admin level array.

TODOs:

  • delete Region and County from Model namespace;
  • add Admin to Model namespace;
  • update failing test for Providers;
  • fix all providers code;
  • update string formatter;
  • update documentation;
  • squash all commit :)
  • update all tests (USE_CACHED_RESPONSES=false)

I'll like to hear you feedback about:

  1. [minor] In AbstractProvider::getDefaults the admins value defaults to [] to simplify a bit the code. Is it ok?
  2. Is Admin a clear class name or it is better to call it AdminLevel (as in [RFC] Use numbered administrative level instead of named one #392)?
  3. I'm using a simple array for admin levels, where the index is the "depth" in the hierarchy. Is this enough for you?
  4. string formatter is now using %R and %P for Region and County, but this no longer make sense. What should be used? %A1, %A2,%A3,%A4? %A0, %A1, %A2,%A3 (as the array index, not the admin level)? %X, %Y,%Z,%W (as in math :)? Something else?

@willdurand
Copy link
Member

[minor] In AbstractProvider::getDefaults the admins value defaults to [] to simplify a bit the code. Is it ok?

Yes.

Is Admin a clear class name or it is better to call it AdminLevel (as in #392)?

AdminLevel

I'm using a simple array for admin levels, where the index is the "depth" in the hierarchy. Is this enough for you?

I am fine with it. I guess. Can we restrict this depth or it really depends on the location?

string formatter is now using %R and %P for Region and County, but this no longer make sense. What should be used? %A1, %A2,%A3,%A4? %A0, %A1, %A2,%A3 (as the array index, not the admin level)? %X, %Y,%Z,%W (as in math :)? Something else?

%A1, %A2 sounds good to me.

poke @toin0u @Baachi

@giosh94mhz giosh94mhz force-pushed the numbered_admin_levels branch from 890bed9 to 5b0e0db Compare January 12, 2015 11:12
@giosh94mhz
Copy link
Contributor Author

As far as I know (i.e. the Geoname database), the depth is at most 4, and in some rare case is 5. Geonames doesn't support admin level 5 very well, so I suspect is very rare. For now, I'm just migrating the providers code so the maximum level for is 2. I may throw an exception if the admin depth is above 5 and we are safe, though.

I'll rename the class to AdminLevel ASAP.

(PS: I've fixed the mail address of my commit in the message above)

@@ -34,9 +34,9 @@ public function testCreateFromArray()
foreach ($addresses as $address) {
$this->assertInstanceOf('Geocoder\Model\Address', $address);
$this->assertNull($address->getCoordinates());
$this->assertInstanceOf('Geocoder\Model\County', $address->getCounty());
$this->assertInstanceOf('Geocoder\Model\County', $address->getAdmins()[1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assertion is wrong, it should assert than it's an instance of Geocoder\Model\Admin (and AdminLevel after you change it) 😉

Edit: the same change will be needed in a lot of test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've started backward alphabetical order when fixing tests :) I have failing tests from A-GeoIP.

@pyrech
Copy link
Contributor

pyrech commented Jan 12, 2015

Small proposition: What about adding a $level property to Admin?

Also, currently the Address#admins array is 0-based and it looks like admin levels usually take values from 1 to 10 (eg on OpenStreetMap). Should the implementation match this behaviour?

@giosh94mhz
Copy link
Contributor Author

@pyrech I think that the idea of having $level property is good!

I;ve looked into OpenStreetMap docs you supplied, but the value they expose is not the "administrative level" strictly speaking. As you can see, table data are sparse; in particular:

  • level 1 and 2 are to be stripped from admins, and set to Address::$country
  • level n and n-1 (ignoring null values) are City and SubUrbs, which must be set to Address::$city and Address::$subLocality
  • 3..(n-2): the remaining are at most 6 administrative level, but still seems very sparse and can be reduced to 4-5 (need to dig in the docs more).

IMO, the OpenStreetMap approach is too sparse, and must be aligned to something like Geonames or GMap (e.g maps.api/?country=Spain&postalCode=08001).

@pyrech
Copy link
Contributor

pyrech commented Jan 12, 2015

@giosh94mhz Yes, you're completely right about their structure. I was more refering to the fact they start to the level 1 instead of 0 and the GMap example is indeed far better.

Do you think that Address#admins should use real level as key? IMO $address->getAdmins()[X-1] doesn't look nice to get the administrative area level X.

Edit: I just saw that you were already requesting feedback on this aspect, so let's see what the core team think about that! 😄 In addition, maybe a method Address#getAdmin(int $level) could be added..?

@giosh94mhz
Copy link
Contributor Author

@pyrech Ok, now I got it :)

I though that it was a bit ugly the first time I wrote getAdmins()[0] :) . Anyway, using $address->getAdmins()[X] instead of X-1 seems ok to me, probably is even better.

An alternative may be to have a syntactic sugar like:

public function getAdmins()
{
    return $this->admins;
}
public function getAdmin($level)
{
    return isset($this->admins[$level-1]) ? $this->admins[$level-1] : null /* or maybe null obj if $level <= 4 or 5 */;
}

@pyrech
Copy link
Contributor

pyrech commented Jan 12, 2015

@giosh94mhz Hehe, we came to the same conclusion (I edited my previous comment in the same time)! :)

@giosh94mhz
Copy link
Contributor Author

Ok, I've renamed Admin to AdminLevel everywhere... it took me a while.

I've also used 1 based arrays, and they feel more natural when accessing specific level, but they are sometimes boring to write in provider (e.g. [1 => ['name' => 'admin level']]).

I'm trying to forcing a limit to admin level [1,5] wherever possible. I'm also checking the api docs of some provider, to see if there is support for multiple admin levels (2+).

Along the way I find some minor bugs as in commit 91b2e0d. @willdurand may I add it here (and not squashing in the end), or you prefer me to create another PR?

I'll wait for your feedback about getAdminLevel($level), in the meanwhile I keep up the work with the providers.

);
}

ksort($adminLevels, SORT_NUMERIC);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? I see no reason for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've added a test which expose this, but here the PHP magic to solve:

$admins = [];
$admins[2] = 'admin 2';
$admins[1] = 'admin 1';

echo implode(', ', $admins) . "\n";
foreach ($admins as $level => $admin) {
    echo $level . ': ' . $admin . "\n";
}

expected:

admin1, admin2
1: admin 1
2: admin 2

actual:

admin2, admin1
2: admin 2
1: admin 1

Since Providers deal with external data, I think is much safer to sort it out explicitely. Anyway, all of this is solved by introducing the AdminLevelCollection class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah cool, didn't see this test. Anyway i agree with you, this is more logical 👍

@Baachi
Copy link
Member

Baachi commented Jan 13, 2015

Really great work @giosh94mhz.

[minor] In AbstractProvider::getDefaults the admins value defaults to [] to simplify a bit the code. Is it ok?

Yes

I'm using a simple array for admin levels, where the index is the "depth" in the hierarchy. Is this enough for you?

Its enough, but i would enforce a string cast, so PHP does not made any magic with this array.

string formatter is now using %R and %P for Region and County, but this no longer make sense. What should be used? %A1, %A2,%A3,%A4? %A0, %A1, %A2,%A3 (as the array index, not the admin level)? %X, %Y,%Z,%W (as in math :)? Something else?

I would tend to %A1, %A2, %A3, ...

@toin0u
Copy link
Member

toin0u commented Jan 13, 2015

Nice work @giosh94mhz !!

I'm using a simple array for admin levels [..]

We can maybe use a AdminLevelCollection like https://github.com/geocoder-php/Geocoder/blob/master/src/Geocoder/Model/AddressCollection.php ?

string formatter is now using %R and %P for Region and County, but this no longer make sense. What should be used? %A1, %A2,%A3,%A4? %A0, %A1, %A2,%A3 (as the array index, not the admin level)? %X, %Y,%Z,%W (as in math :)? Something else?

I agree with @Baachi

@willdurand
Copy link
Member

+1 for the collection

+1 for adding commits to this PR, we will see how to deal with all the commits later

@willdurand
Copy link
Member

thank you for making this real!

@giosh94mhz
Copy link
Contributor Author

Thanks @Baachi, @toin0u and @willdurand for your feedback and support!

Its enough, but I would enforce a string cast, so PHP does not made any magic with this array.

PHP is full of magic, which one in particular? :) String are not so comfortable to use, since I'm actually dealing with numbers and have some checks for array bounds.

We can maybe use a AdminLevelCollection like https://github.com/geocoder-php/Geocoder/blob/master/src/Geocoder/Model/AddressCollection.php ?

I'm also 👍 for this, now that I think of it.
The AdminLevelCollection should be:

  • sorted by depth, so that the results can be iterated easily;
  • key (i.e. level) are within [1,5], throw OutOfBound otherwise;
  • if two different AdminLevel instance are added for the same level (during contruction at least) an InvalidArgument must be thrown.
    This way, some checks are removed from the factory, which didn't fit well.

I'll keep index-based array for the internal result data.

@Baachi
Copy link
Member

Baachi commented Jan 13, 2015

PHP is full of magic, which one in particular? :) String are not so comfortable to use, since I'm actually dealing with numbers and have some checks for array bounds.

I now, but i have some strange issues with it (i don't remember exactly which one). But if we decide to introduce an AdminLevelCollection it will be safer for the user land anyways.

@Baachi
Copy link
Member

Baachi commented Jan 13, 2015

I would tend to use the SplFixedArray which already throw an exception if more than x elements are given.

@giosh94mhz
Copy link
Contributor Author

@Baachi I didn't know about SplFixedArray.

Here's my thought

  • PRO:
    • array is index-sorted as in C;
    • class definition is almost a 1-line alias of SplFixedArray
  • CONS:
    • since it's fixed, count is always 5, but countries may have less admin level (e.g. Italy got 2-3 level)
    • null value are interleaved with object value, so it can be unconfortable when looping
    • I need to catch and re-throw RuntimeException exceptins, if we want to have all exception in Geocoder namespace

So I think I'll create a new class and eventually try to use SplFixedArray as the inner instance.

What do you think?

@Baachi
Copy link
Member

Baachi commented Jan 15, 2015

The performance should be irrelevant. I would it only use if you have less code to write and its easier to maintain. If its more work, build your own :)

@giosh94mhz
Copy link
Contributor Author

From what I've read, SplFixedArray is meant to be (and seems to be) faster than regular array... anyway, for 5 element it doesn't matter.

This evening I'll work a bit more... I'll keep you up to date.

@giosh94mhz
Copy link
Contributor Author

Here I am! :) I've finished testing the providers, but I may need some help with testing which use api-keys... Hopefully travis will help.

About the SplFixedArray. I've tried integrating that class but is not very handy, because:

  • it's read/write, not read-only so I need to wrap it anyway;
  • count is always 5, no matter what;
  • foreach always loop all 5 elements;
  • index start from zero, so $i == $level - 1;
  • interface is way different from AddressCollection;

Along the way, I have:

  • improved Geonames and GoogleMap providers which now support all five admin levels. I'll try to do the same with the others.
  • fixed the type hint for `Geocoder::geocode' method.

I think there is a more room for improvements. We may:

  1. rename *Collection::all as *Collection::toArray (which is "forward compatible" with DoctrineCommon and other collections, if you ever decide to switch), since it's more clear and the class itself is Traversable;
  2. implement ArrayAccess interface in both collections (keeping the get method as an alias of offsetGet is good IMO);
  3. if not implementing (2) add an has method, and throw when getting non existing keys.

@willdurand
Copy link
Member

rename *Collection::all as *Collection::toArray (which is "forward compatible" with DoctrineCommon and other collections, if you ever decide to switch), since it's more clear and the class itself is Traversable;

👎 I like Symfony "ParameterBag" API.

implement ArrayAccess interface in both collections (keeping the get method as an alias of offsetGet is good IMO);

👎 I'd like to keep collections read-only as much as possible

if not implementing (2) add an has method, and throw when getting non existing keys.

What is the current impl of get?

@giosh94mhz
Copy link
Contributor Author

I like Symfony "ParameterBag" API.

Ok, let's keep it like that.

I'd like to keep collections read-only as much as possible

Me too. I've avoided read-write interface for this patch.

What is the current impl of get?

AddressCollection::get return an address object or throw when index is greater than count.
But for AdminLevelCollection this may not be enough, since in (rare) situation there may be admin1 and admin3 level, without admin2 (i.e. count = 2, but admin3 is available). So I now return OutOfBounds when index is greater than 5 and null when the admin level is not available. The has method will simplify this implementation, and always throw when a level is not set.

@giosh94mhz giosh94mhz force-pushed the numbered_admin_levels branch from e62a4b5 to ef2cad0 Compare January 30, 2015 08:48
@giosh94mhz
Copy link
Contributor Author

Ok, I've finally got all the api keys and tested all the providers with real data. I've fixed the commit history (all commit have passing tests), so that fixes I've fond along the way have a separate commit from the new admin feature.

Testing are now passing, but if I set USE_CACHED_RESPONSES=false some tests are failing. This is probably not related to my changes, but I'll try to fix them anyway.

@willdurand I've fixed the collection interfaces as in ParametersBag, please give it a quick review when you have the time.

@toin0u
Copy link
Member

toin0u commented Jan 30, 2015

Wow ! Amazing job ! I'll review it as well asap :)

@toin0u
Copy link
Member

toin0u commented Feb 3, 2015

👍

@giosh94mhz giosh94mhz force-pushed the numbered_admin_levels branch from f560078 to 990ab98 Compare February 11, 2015 09:01
@giosh94mhz
Copy link
Contributor Author

Rebased on master :)

));
];

for ($level = 1; $level <= AdminLevelCollection::MAX_LEVEL_DEPTH; ++ $level) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to write $level++

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. But don't tell C++ developers I did it :P

@willdurand
Copy link
Member

Last nitpicks. Time to get this merged!

@giosh94mhz
Copy link
Contributor Author

👍 done!

willdurand added a commit that referenced this pull request Feb 13, 2015
Use numbered administrative level instead of named one
@willdurand willdurand merged commit 6a7fbc4 into geocoder-php:master Feb 13, 2015
@willdurand
Copy link
Member

\o/

@toin0u
Copy link
Member

toin0u commented Feb 13, 2015

@giosh94mhz Awesome!!! 🎆

@willdurand
Copy link
Member

@Baachi
Copy link
Member

Baachi commented Feb 13, 2015

Really cool! 🎆

@giosh94mhz
Copy link
Contributor Author

Thank you everyone! You gave me great support and feedback!

@lupoair lupoair mentioned this pull request Jun 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants