Skip to content

PHPLIB-1323 Implement unlink for GridFS stream wrapper #1206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 12, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,7 @@ Supported stream functions:
- :php:`file() <file>`
- :php:`fopen() <fopen>` with "r", "rb", "w", and "wb" modes
- :php:`rename() <rename>` rename all revisions of a file in the same bucket
- :php:`unlink() <unlink>` remove all revisions of a file in the same bucket

In read mode, the stream context can contain the option ``gridfs['revision']``
to specify the revision number of the file to read. If omitted, the most recent
Expand Down Expand Up @@ -86,6 +87,8 @@ Each call to these functions makes a request to the server.

echo file_get_contents('gridfs://mybucket/hello.txt');

unlink('gridfs://mybucket/hello.txt');

The output would then resemble:

.. code-block:: none
Expand Down
23 changes: 21 additions & 2 deletions psalm-baseline.xml
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,11 @@
<code>$context</code>
</MixedArgumentTypeCoercion>
</file>
<file src="src/GridFS/CollectionWrapper.php">
<MixedAssignment>
<code>$ids[]</code>
</MixedAssignment>
</file>
<file src="src/GridFS/ReadableStream.php">
<MixedArgument>
<code><![CDATA[$currentChunk->n]]></code>
Expand All @@ -93,12 +98,26 @@
</file>
<file src="src/GridFS/StreamWrapper.php">
<InvalidArgument>
<code>$context</code>
<code>$context</code>
<code><![CDATA[$this->getContext($path, $mode)]]></code>
<code><![CDATA[$this->getContext($path, $mode)]]></code>
</InvalidArgument>
<InvalidReturnStatement>
<code>$context</code>
</InvalidReturnStatement>
<InvalidReturnType>
<code>array{collectionWrapper: CollectionWrapper, file: object}|array{collectionWrapper: CollectionWrapper, filename: string, options: array</code>
</InvalidReturnType>
<MixedArgument>
<code><![CDATA[$context['file']->filename]]></code>
<code><![CDATA[$context['file']->filename]]></code>
</MixedArgument>
<MixedAssignment>
<code>$context</code>
</MixedAssignment>
<PossiblyUndefinedArrayOffset>
<code><![CDATA[$context['file']]]></code>
<code><![CDATA[$context['file']]]></code>
</PossiblyUndefinedArrayOffset>
</file>
<file src="src/Model/BSONArray.php">
<MixedAssignment>
Expand Down
23 changes: 23 additions & 0 deletions src/GridFS/CollectionWrapper.php
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,29 @@ public function deleteChunksByFilesId($id): void
$this->chunksCollection->deleteMany(['files_id' => $id]);
}

/**
* Delete all GridFS files and chunks for a given filename.
*/
public function deleteFileAndChunksByFilename(string $filename): int
{
/** @var iterable<array{_id: mixed}> $files */
$files = $this->findFiles(['filename' => $filename], [
'typeMap' => ['root' => 'array'],
'projection' => ['_id' => 1],
]);

/** @var list<mixed> $ids */
$ids = [];
foreach ($files as $file) {
$ids[] = $file['_id'];
}

$count = $this->filesCollection->deleteMany(['_id' => ['$in' => $ids]])->getDeletedCount();
$this->chunksCollection->deleteMany(['files_id' => ['$in' => $ids]]);

return $count ?? 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is zero in case of WriteConcern=0 or not file to update.

}

/**
* Deletes a GridFS file and related chunks by ID.
*
Expand Down
16 changes: 0 additions & 16 deletions src/GridFS/ReadableStream.php
Original file line number Diff line number Diff line change
Expand Up @@ -22,14 +22,12 @@
use MongoDB\Driver\CursorInterface;
use MongoDB\Exception\InvalidArgumentException;
use MongoDB\GridFS\Exception\CorruptFileException;
use MongoDB\GridFS\Exception\LogicException;

use function assert;
use function ceil;
use function floor;
use function is_integer;
use function is_object;
use function is_string;
use function property_exists;
use function sprintf;
use function strlen;
Expand Down Expand Up @@ -178,20 +176,6 @@ public function readBytes(int $length): string
return $data;
}

/**
* Rename all revisions of the file.
*/
public function rename(string $newFilename): bool
{
if (! isset($this->file->filename) || ! is_string($this->file->filename)) {
throw new LogicException('Cannot rename file without a filename');
}

$this->collectionWrapper->updateFilenameForFilename($this->file->filename, $newFilename);

return true;
}

/**
* Seeks the chunk and buffer offsets for the next read operation.
*
Expand Down
84 changes: 49 additions & 35 deletions src/GridFS/StreamWrapper.php
Original file line number Diff line number Diff line change
Expand Up @@ -106,15 +106,15 @@ public function rename(string $fromPath, string $toPath): bool
}

try {
$this->stream_open($fromPath, 'r', 0, $openedPath);
$context = $this->getContext($fromPath, 'r');
} catch (FileNotFoundException $e) {
return false;
}

$newName = explode('/', $toPath, 4)[3] ?? '';
assert($this->stream instanceof ReadableStream);
$newFilename = explode('/', $toPath, 4)[3] ?? '';
$context['collectionWrapper']->updateFilenameForFilename($context['file']->filename, $newFilename);

return $this->stream->rename($newName);
return true;
}

/**
Expand Down Expand Up @@ -170,41 +170,12 @@ public function stream_eof(): bool
*/
public function stream_open(string $path, string $mode, int $options, ?string &$openedPath): bool
{
$context = [];

/**
* The Bucket methods { @see Bucket::openUploadStream() } and { @see Bucket::openDownloadStreamByFile() }
* always set an internal context. But the context can also be set by the user.
*/
if (is_resource($this->context)) {
$context = stream_context_get_options($this->context)['gridfs'] ?? [];

if (! is_array($context)) {
throw LogicException::invalidContext($context);
}
}

// When the stream is opened using fopen(), the context is not required, it can contain only options.
if (! isset($context['collectionWrapper'])) {
$bucketAlias = explode('/', $path, 4)[2] ?? '';

if (! isset(self::$contextResolvers[$bucketAlias])) {
throw LogicException::bucketAliasNotRegistered($bucketAlias);
}

$context = self::$contextResolvers[$bucketAlias]($path, $mode, $context);
}

if (! $context['collectionWrapper'] instanceof CollectionWrapper) {
throw LogicException::invalidContextCollectionWrapper($context['collectionWrapper']);
}

if ($mode === 'r' || $mode === 'rb') {
return $this->initReadableStream($context);
return $this->initReadableStream($this->getContext($path, $mode));
}

if ($mode === 'w' || $mode === 'wb') {
return $this->initWritableStream($context);
return $this->initWritableStream($this->getContext($path, $mode));
}

throw LogicException::openModeNotSupported($mode);
Expand Down Expand Up @@ -324,6 +295,14 @@ public function stream_write(string $data): int
return $this->stream->writeBytes($data);
}

public function unlink(string $path): bool
{
$context = $this->getContext($path, 'r');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bucket::resolveStreamContext() executes findFileByFilenameAndRevision() for read modes in order to fetch the file document. Would it make sense to use 'w' here and avoid the unnecessary query?

I understand rename() needs to use 'r' since it does read the filename from the file document; however, there may be a refactoring opportunity there if you'd just like to trust the $fromPath string. I suppose it's a matter of intentionally NOP-ing on a FileNotFoundException or executing an update query that may not match anything -- so I'd be find leaving rename() as-is.

Copy link
Member Author

@GromNaN GromNaN Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of using r mode was to check that the file exists before doing operations, to throw a FileNotFoundException. This could be optimized by checking the number of updated documents after the update, but that's not feasible if Write Concern = 0 (edge case).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted to throw a FileNotFoundException in rename and unlink, instead of silently returning false. This behavior is consistent with S3 stream wrapper.

$count = $context['collectionWrapper']->deleteFileAndChunksByFilename($context['file']->filename);

return $count > 0;
}

/** @return false|array */
public function url_stat(string $path, int $flags)
{
Expand All @@ -338,6 +317,41 @@ public function url_stat(string $path, int $flags)
return $this->stream_stat();
}

/** @return array{collectionWrapper: CollectionWrapper, file: object}|array{collectionWrapper: CollectionWrapper, filename: string, options: array */
private function getContext(string $path, string $mode): array
{
$context = [];

/**
* The Bucket methods { @see Bucket::openUploadStream() } and { @see Bucket::openDownloadStreamByFile() }
* always set an internal context. But the context can also be set by the user.
*/
if (is_resource($this->context)) {
$context = stream_context_get_options($this->context)['gridfs'] ?? [];

if (! is_array($context)) {
throw LogicException::invalidContext($context);
}
}

// When the stream is opened using fopen(), the context is not required, it can contain only options.
if (! isset($context['collectionWrapper'])) {
$bucketAlias = explode('/', $path, 4)[2] ?? '';

if (! isset(self::$contextResolvers[$bucketAlias])) {
throw LogicException::bucketAliasNotRegistered($bucketAlias);
}

$context = self::$contextResolvers[$bucketAlias]($path, $mode, $context);
}

if (! $context['collectionWrapper'] instanceof CollectionWrapper) {
throw LogicException::invalidContextCollectionWrapper($context['collectionWrapper']);
}

return $context;
}

/**
* Returns a stat template with default values.
*/
Expand Down
15 changes: 15 additions & 0 deletions tests/GridFS/StreamWrapperFunctionalTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
use function stream_context_create;
use function stream_get_contents;
use function time;
use function unlink;
use function usleep;

use const SEEK_CUR;
Expand Down Expand Up @@ -387,4 +388,18 @@ public function testRenamePathMismatch(): void

rename('gridfs://bucket/filename', 'gridfs://other/newname');
}

public function testUnlinkAllRevisions(): void
{
$this->bucket->registerGlobalStreamWrapperAlias('bucket');
$path = 'gridfs://bucket/path/to/filename';

file_put_contents($path, 'version 0');
file_put_contents($path, 'version 1');

$result = unlink($path);

$this->assertTrue($result);
$this->assertFalse(file_exists($path));
}
}