Skip to content

Update CachedHttpClient.php #265

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 5, 2015
Merged

Update CachedHttpClient.php #265

merged 2 commits into from
May 5, 2015

Conversation

jrean
Copy link
Contributor

@jrean jrean commented Apr 30, 2015

I used your implementation of the filesystem cache.

This issue comes from the use of $path:

protected function createRequest($httpMethod, $path, $body = null, array $headers = array(), array $options = array())

Here:

if ($modifiedAt = $this->getCache()->getModifiedSince($path)) {

And here:

if ($etag = $this->getCache()->getETag($path)) {

And

public function request($path, $body = null, $httpMethod = 'GET', array $headers = array(), array $options = array())

Here:

if (304 == $response->getStatusCode()) {
    $cacheResponse = $this->getCache()->get($path);
    $this->lastCachedResponse = $cacheResponse;
    return $cacheResponse;
}
$this->getCache()->set($path, $response);

$path doesn't take care of the $options array which contains the sub array for the query parameters!
So each time getModifiedSince($path) and getETag($path) will work with an incomplete path because it doesn't know query parameters (if they exists). $path value will always be repos/:user/:repo/method and it should be repos/:user/:repo/method[?:encoded_query_parameters]

See my code modifications to fix the problem.

By applying this patch we can now use the cache as it should be :)

I started this issue a few days ago.
#262

I used your implementation of the filesystem cache.

- Delete all files under the cache directory `github-api-cache`.
- Set properly your authentication method (I used token).
- Run the following code once after a proper Client instantiation with cache:

```
$commits_article_1 = ResponseMediator::getContent($client->getHttpClient()->get("repos/jrean/askjong.com/commits?sha=master&path=articles/hello-world.md"));
$rateLimit = ResponseMediator::getContent($client->getHttpClient()->get("rate_limit"));
var_dump($rateLimit);

$commits_article_2 = ResponseMediator::getContent($client->getHttpClient()->get("repos/jrean/askjong.com/commits?sha=master&path=articles/this-is-a-first-article.md"));
$rateLimit = ResponseMediator::getContent($client->getHttpClient()->get("rate_limit"));
var_dump($rateLimit);
```
4 requests against Github Api:
- 1 query to fetch the commits for the `articles/hello-world.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 1 query to fetch the commits for the `articles/this-is-a-first-article.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 2 queries to fetch the rate_limit (doesn't count again the rate limit). (`Guzzle\Http\Message\Response` status code is 200)

Once this code has run the cache directory `github-api-cache` will have 6 files (3 for each queries including the `rate_limit` query + 3 `.etag` files)

**The `rate_limit` should / must decrease by 2** (a call to `rate_limit` doesn't count against Github).

--

- Run the same code again.

4 requests against Github Api:
- 1 query to fetch the commits for the `articles/hello-world.md` file. (`Guzzle\Http\Message\Response` status code is 304, cache is used)
- 1 query to fetch the commits for the `articles/this-is-a-first-article.md` file. (`Guzzle\Http\Message\Response` status code is 304, cache is user)
- 2 queries to fetch the rate_limit (doesn't count again the rate limit). (`Guzzle\Http\Message\Response` status code is 200)

**The `rate_limit` don't move** (a call to `rate_limit` doesn't count against Github).

--

- Run the following code which use your wrapping methods against Github Api (Notice we are requesting the same informations):
**It should use the cache but it will not.** (I explain why just after).

```
$commits_article_1 = $client->api('repo')->commits()->all('jrean', 'askjong.com', array('sha' => 'master', 'path' => 'articles/hello-world.md'));
$rateLimit = ResponseMediator::getContent($client->getHttpClient()->get("rate_limit"));
var_dump($rateLimit);

$commits_article_2 = $client->api('repo')->commits()->all('jrean', 'askjong.com', array('sha' => 'master', 'path' => 'articles/this-is-a-first-article.md'));
$rateLimit = ResponseMediator::getContent($client->getHttpClient()->get("rate_limit"));
var_dump($rateLimit);
```

4 requests against Github Api:
- 1 query to fetch the commits for the `articles/hello-world.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 1 query to fetch the commits for the `articles/this-is-a-first-article.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 2 queries to fetch the rate_limit (doesn't count again the rate limit). (`Guzzle\Http\Message\Response` status code is 200)

Once this code has run the cache directory `github-api-cache` will have new files...

**The `rate_limit` decreases by 2** (a call to `rate_limit` doesn't count against Github). It should not decrease because we should use the cache!

--

- Run the same code again.

4 requests against Github Api:
- 1 query to fetch the commits for the `articles/hello-world.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 1 query to fetch the commits for the `articles/this-is-a-first-article.md` file. (`Guzzle\Http\Message\Response` status code is 200)
- 2 queries to fetch the rate_limit (doesn't count again the rate limit). (`Guzzle\Http\Message\Response` status code is 200)

**The `rate_limit` decreases by 2 again** (a call to `rate_limit` doesn't count against Github). It should not decrease because we should use the cache!

- Run the same code again and it will still decrease your `rate_limit`...

--

# Solution

This issue comes from the use of `$path`:

`protected function createRequest($httpMethod, $path, $body = null, array $headers = array(), array $options = array())`

Then here:

`if ($modifiedAt = $this->getCache()->getModifiedSince($path)) {`

And:

`if ($etag = $this->getCache()->getETag($path)) {`

`$path` doesn't take care of the `$options` array which contains the sub array for the `query` parameters!
So each time `getModifiedSince($path)` and `getETag($path)` will work with an incomplete path because it doesn't know query parameters (if they exists). In that exemple `$path` value will always be `repos/jrean/askjong.com/commits` and it should be `repos/jrean/askjong.com/commits?sha=master&path=articles/hello-world.md`.

See my code modifications to fix the problem.
(I didn't use `$request->getQuery()` because it returns query parameters but encoded and it is not acceptable because when we want to construct our own queries like `repos/jrean/askjong.com/commits?sha=master&path=articles/hello-world.md` we won't encode parameters)

By applying this patch we can now use the cache as it should be :)

I started this issue a few days ago.
#262
*
* @var string
*/
private $path;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't store this in a property. The more state is added in the class, the more complex reasonning about it is (I already think that having getLastResponse in the HttpClientInterface is a big mistake).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and btw, the naming is wrong: the value stored in this property is not the path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stof

please don't store this in a property.

I added it in a property because it is called in two methods ! request() and createRequest(). What do you suggest so? You prefer a private method that returns something called into request() and createRequest() methods? Properties exists to be used and in that class only 2 already exist (plus eventually mine) then 4 for the parent class. This is acceptable don't you think?

"and btw, the naming is wrong: the value stored in this property is not the path"

$path is implemented everywhere and in other places it is called $id...
When I choose $this->path it was to respect what as already been implemented. The original variable was $path as you can see.
How do you want to call $path + query (if query parameters exist)?
By suggesting this pull request I'm not fighting for naming but simply trying to fix something that doesn't work as it should be. I'm totally open for constructive suggestions. What do you suggest so?

Some cleaning according the chat with @stof + format the identifier (`$id`) with encoded query parameters.
@jrean
Copy link
Contributor Author

jrean commented May 1, 2015

@stof Can I have your feedback?

@jrean
Copy link
Contributor Author

jrean commented May 5, 2015

@pilot can we merge it please? (I need the fix)

pilot added a commit that referenced this pull request May 5, 2015
Update CachedHttpClient.php
@pilot pilot merged commit b53ff0c into KnpLabs:master May 5, 2015
@jrean
Copy link
Contributor Author

jrean commented May 5, 2015

Thank you.

@pilot
Copy link
Contributor

pilot commented May 5, 2015

@jrean good catch 👍

@jrean jrean deleted the patch-2 branch May 5, 2015 08:45
@jrean
Copy link
Contributor Author

jrean commented May 5, 2015

@pilot can you release the 1.4.8 with that patch?
https://github.com/GrahamCampbell/Laravel-GitHub/blob/master/composer.json
GrahamCampbell/Laravel-GitHub is using "knplabs/github-api": "~1.3" so he won't get the patch until the next version.

Thank you :)

@pilot
Copy link
Contributor

pilot commented May 5, 2015

@jrean
Copy link
Contributor Author

jrean commented May 5, 2015

You rock!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants