Skip to content

handle case where user closes browser window before file is sent #28

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jjdunn opened this issue Oct 11, 2024 · 6 comments
Closed

handle case where user closes browser window before file is sent #28

jjdunn opened this issue Oct 11, 2024 · 6 comments

Comments

@jjdunn
Copy link

jjdunn commented Oct 11, 2024

problem:

  1. user requests a file to be downloaded
  2. file-download-link (or button) opens a new tab for the file-download target
  3. user closes the new tab before the file is downloaded
  4. tmp/File::send() does readfile() - but this fails since the browser connection has been closed
  5. readfile() throws an uncatchable Exception at this point - the only place it can be handled is the shutdown function. I tried various try/catch blocks around readfile(), none of which got invoked.
  6. after the shutdown function executes, tmp::__destruct() runs, and this fails because the file is locked by the web server, left over from readfile()
  7. unlink() in __destruct() throws a "permission denied" error

At first I wrote a (working) solution in my shutdown function, which checks for leftover open file resources of type "stream", and unlocks them; but this solution runs on every request so it's very high overhead to solve an edge-case error

solution I like better, in tmp/File::send():

        //readfile($this->_fileName);
        $fpointer = fopen($this->_fileName, 'r');
        $content = fread($fpointer, filesize($this->_fileName));
        fclose($fpointer);
        echo $content;

this works exactly the same as readfile() under normal conditions, but if the user closes the browser window, the temp files can be deleted since the lock has been released.

p.s. all of this on Windows/Apache httpd 2.4.62 with PHP 8.3.11. Maybe it would work differently on Linux or a different web server.

@mikehaertl
Copy link
Owner

As you may have seen this library is basically in low maintenance mode for a while mainly because I don't do much PHP development anymore these days. I'm a bit hesitant to do any changes unless I'm very sure it won't break anything.

So just few comments:

  • Your fread() based solution has the drawback that the full file is now loaded into memory. This could be a problem if used with large files. readfile() does not have this issue which is why I chose it.
  • You say that readfile() throws an uncatchable Exception in your case. How did you notice that? Can you maybe paste such an exception?
  • How did you actually find out about the file being locked? Do you know when PHP usually would unlock the file and why this doesn't happen in this case?

Reading the manual page for readfile this also looks interesting:

Note:
readfile() will not present any memory issues, even when sending large files, on its own. If you encounter an out of memory error ensure that output buffering is off with ob_get_level().

Did you play around with turning output buffering off before sending the file? I can imagine that this may affect how the file is locked. Maybe the output buffer is never flushed properly when the connection is closed and this is why readfile() does not immediately fail when the connection is closed.

@jjdunn
Copy link
Author

jjdunn commented Oct 12, 2024

Hi Mike,

Thanks for your very thoughtful and detailed response. I really appreciate you taking time.

Your fread() based solution has the drawback that the full file is now loaded into memory. This could be a problem if used with large files. readfile() does not have this issue which is why I chose it.

ah - I didn't think of that, and it's a very good point: some of the PDF files we generate are very large - could be a few hundred pages.

You say that readfile() throws an uncatchable Exception in your case. How did you notice that?

the original artifact I was troubleshooting, was a pair of errors (notice the timestamps are identical):

  1. in the application log
Sep 28 13:48:17 [190.208.56.170] [php                           ] : 
<E> unlink(C:\Windows\Temp\tmpD680.tmp.pdf): 
Permission denied (C:\web\prod\app\extensions\mikehaertl\tmp\File.php:73)
Stack trace:
#0 C:\web\prod\app\extensions\mikehaertl\tmp\File.php(73): unlink()
#1 unknown(0): mikehaertl\tmp\File->__destruct()
REQUEST_URI=/report/atroster
  1. in the apache error.log
[28-Sep-2024 13:48:17 America/Los_Angeles] PHP Warning:  
Cannot modify header information - headers already sent by 
(output started at C:\web\prod\app\extensions\mikehaertl\tmp\File.php:129) 
in C:\web\prod\framework\yiilite.php on line 3125

How did you actually find out about the file being locked?

digging into this problem, I was able to reproduce it in my development environment by:

  • setting a breakpoint just before the readfile() in tmp/File::send()
  • closing the browser tab which is expecting the PDF output
  • stepping into readfile() which immediately goes into the function named in register_shutdown_function()
  • the shutdown function does error_get_last() which is empty
  • after the shutdown function completes (it's just an error-logger), the code continues with tmp/File::__destruct() which tries unlink() which fails as per the above error
  • our application error-handling code then tries to ouput the error and stack-trace from __destruct(), which results in the "headers already sent" problem - headers were sent in File::send().

Just before the unlink(), I used windows Resource Monitor to check which processes had a lock on the temp PDF file, and found it was Apache httpd. So my working theory is that:

  • readfile() locks the file (why?)
  • readfile() fails because the browser is no longer connected.
  • it throws some kind of error or exception which I have been unable to catch or identify - wrapping readfile() in a try/catch block never catches the error; and as above, error_get_last() shows nothing. So I can't say what is happening except that execution stops and the shutdown-function gets invoked.

Can you maybe paste such an exception?

no, because I've never been able to catch it.

Do you know when PHP usually would unlock the file and why this doesn't happen in this case?

I assume readfile() would unlock the file after sending the content to the browser.

Did you play around with turning output buffering off before sending the file?

not previously, but I just tried this, in tmp/File::send():

        $this->sendHeaders($headers);
        ob_end_clean();
        readfile($this->_fileName);
        ob_start();

again, if I set a breakpoint on readfile(), then close the browser window, it immediately fails and execution moves to the shutdown function. error_get_last() returns nothing; and when the code gets to tmp/File::__destruct(), the file is still locked (by apache httpd) and it gives "permission denied".


at this point, I have a working solution (as per the ticket description), but I think your comment about memory usage is very important. Given that the error-condition is not very harmful; and the only negative side-effect is a few leftover PDF and HTML files in the temp directory, I'm inclined to keep your original code and just live with the problem.

still very puzzled by two things about readfile() however:

  • what kind of error or exception is thrown, which cannot be caught, and cannot be reported by error_get_last() ?
  • why does it lock the file before reading ?

I might open a core PHP ticket and see what the PHP gurus have to say about these questions...

@jjdunn jjdunn closed this as completed Oct 12, 2024
@jjdunn
Copy link
Author

jjdunn commented Oct 13, 2024

I might open a core PHP ticket and see what the PHP gurus have to say about these questions...

php/php-src#16422

any thoughts?
can the problem be replicated on *nix, or with a different web server like nginx ?

@jjdunn
Copy link
Author

jjdunn commented Oct 13, 2024

Christoph Becker pointed out that calling ignore_user_abort(true); before readfile() prevents the observed behavior. The temp file is deleted; no error is logged.

I'm going to make this change and suggest making a similar change in tmp/File::send().

@mikehaertl
Copy link
Owner

@jjdunn Nice, thanks for the research. I've opened a PR. Could you maybe help testing it?

mikehaertl added a commit that referenced this issue Oct 14, 2024
Issue #28 Add ignoreUserAbort option to remove files if connection closed
@mikehaertl
Copy link
Owner

Released 1.3.0 containing this fix. Thanks for bringing this up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants