Skip to content

Refactor CachingAssetReader to FilesystemCache. #3869

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 21, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions build/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
- Refactor `MultiPackageAssetReader` to internal `AssetFinder`.
- Add internal `Filesystem` that backs `AssetReader` and `AssetWriter`
implementations.
- Refactor `CachingAssetReader` to `FilesystemCache`.

## 2.4.2

Expand Down
14 changes: 14 additions & 0 deletions build/lib/src/builder/build_step_impl.dart
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ import '../resource/resource.dart';
import '../state/asset_finder.dart';
import '../state/asset_path_provider.dart';
import '../state/filesystem.dart';
import '../state/filesystem_cache.dart';
import '../state/input_tracker.dart';
import '../state/reader_state.dart';
import 'build_step.dart';
Expand Down Expand Up @@ -83,6 +84,19 @@ class BuildStepImpl implements BuildStep, AssetReaderState {
_stageTracker = stageTracker ?? NoOpStageTracker.instance,
_reportUnusedAssets = reportUnusedAssets;

@override
BuildStepImpl copyWith({FilesystemCache? cache}) => BuildStepImpl(
inputId,
allowedOutputs,
_reader.copyWith(cache: cache),
_writer,
_resolvers,
_resourceManager,
_resolvePackageConfig,
stageTracker: _stageTracker,
reportUnusedAssets: _reportUnusedAssets,
);

@override
Filesystem get filesystem => _reader.filesystem;

Expand Down
1 change: 1 addition & 0 deletions build/lib/src/internal.dart
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,6 @@ library;
export 'state/asset_finder.dart';
export 'state/asset_path_provider.dart';
export 'state/filesystem.dart';
export 'state/filesystem_cache.dart';
export 'state/input_tracker.dart';
export 'state/reader_state.dart';
98 changes: 86 additions & 12 deletions build/lib/src/state/filesystem.dart
Original file line number Diff line number Diff line change
Expand Up @@ -10,15 +10,39 @@ import 'package:pool/pool.dart';

import '../asset/id.dart';
import 'asset_path_provider.dart';
import 'filesystem_cache.dart';

/// The filesystem the build is running on.
///
/// Methods behave as the `dart:io` methods with the same names, with some
/// exceptions noted.
/// exceptions noted in the docs.
///
/// Some methods cache, all uses of the cache are noted in the docs.
///
/// The cache might be a [PassthroughFilesystemCache] in which case it has no
/// effect.
///
/// TODO(davidmorgan): extend caching to sync methods, deletes, writes.
abstract interface class Filesystem {
FilesystemCache get cache;

/// Returns a new instance with optionally updated [cache].
Filesystem copyWith({FilesystemCache? cache});

/// Whether the file exists.
///
/// Uses [cache].

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here - and below - it talks about a specific implementation. Isn't that weird to have as documentation for the interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit awkward :) while writing the next PR I realized it makes more sense to leave Filesystem always uncached and do the caching one level up in the reader. So that's the plan.

As the code is I do think it makes sense on the interface, because exactly how the cache is used has visible side effects. Which is also why it makes sense to move it out, it makes this type do too much :)

Future<bool> exists(AssetId id);

/// Reads a file as a string.
///
/// Uses [cache]. For `utf8`, the `String` is cached; for any other encoding
/// the bytes are cached but the conversion runs on every read.
Future<String> readAsString(AssetId id, {Encoding encoding = utf8});

/// Reads a file as bytes.
///
/// Uses [cache].
Future<Uint8List> readAsBytes(AssetId id);

/// Deletes a file.
Expand Down Expand Up @@ -62,23 +86,48 @@ abstract interface class Filesystem {

/// A filesystem using [assetPathProvider] to map to the `dart:io` filesystem.
class IoFilesystem implements Filesystem {
@override
final FilesystemCache cache;

final AssetPathProvider assetPathProvider;

/// Pool for async file operations.
final _pool = Pool(32);

IoFilesystem({required this.assetPathProvider});
IoFilesystem({
required this.assetPathProvider,
this.cache = const PassthroughFilesystemCache(),
});

@override
IoFilesystem copyWith({FilesystemCache? cache}) => IoFilesystem(
assetPathProvider: assetPathProvider,
cache: cache ?? this.cache,
);

@override
Future<bool> exists(AssetId id) => _pool.withResource(_fileFor(id).exists);
Future<bool> exists(AssetId id) =>
cache.exists(id, ifAbsent: () => _pool.withResource(_fileFor(id).exists));

@override
Future<Uint8List> readAsBytes(AssetId id) =>
_pool.withResource(_fileFor(id).readAsBytes);
Future<Uint8List> readAsBytes(AssetId id) => cache.readAsBytes(
id,
ifAbsent: () => _pool.withResource(_fileFor(id).readAsBytes),
);

@override
Future<String> readAsString(AssetId id, {Encoding encoding = utf8}) =>
_pool.withResource(_fileFor(id).readAsString);
Future<String> readAsString(AssetId id, {Encoding encoding = utf8}) async {
// The cache only directly supports utf8, for other encodings get the
// bytes via the cache then convert.
if (encoding == utf8) {
return cache.readAsString(
id,
ifAbsent: () => _pool.withResource(_fileFor(id).readAsString),
);
} else {
return encoding.decode(await readAsBytes(id));
}
}

@override
void deleteSync(AssetId id) {
Expand Down Expand Up @@ -146,17 +195,42 @@ class IoFilesystem implements Filesystem {

/// An in-memory [Filesystem].
class InMemoryFilesystem implements Filesystem {
final Map<AssetId, List<int>> assets = {};
@override
FilesystemCache cache;

final Map<AssetId, List<int>> assets;

InMemoryFilesystem({FilesystemCache? cache})
: cache = cache ?? const PassthroughFilesystemCache(),
assets = {};

InMemoryFilesystem._({required this.cache, required this.assets});

@override
Future<bool> exists(AssetId id) async => assets.containsKey(id);
InMemoryFilesystem copyWith({FilesystemCache? cache}) =>
InMemoryFilesystem._(assets: assets, cache: cache ?? this.cache);

@override
Future<Uint8List> readAsBytes(AssetId id) async => assets[id] as Uint8List;
Future<bool> exists(AssetId id) async =>
cache.exists(id, ifAbsent: () async => assets.containsKey(id));

@override
Future<String> readAsString(AssetId id, {Encoding encoding = utf8}) async =>
encoding.decode(assets[id] as Uint8List);
Future<Uint8List> readAsBytes(AssetId id) async =>
cache.readAsBytes(id, ifAbsent: () async => assets[id] as Uint8List);

@override
Future<String> readAsString(AssetId id, {Encoding encoding = utf8}) async {
// The cache only directly supports utf8, for other encodings get the
// bytes via the cache then convert.
if (encoding == utf8) {
return cache.readAsString(
id,
ifAbsent: () async => encoding.decode(assets[id]!),
);
} else {
return encoding.decode(await readAsBytes(id));
}
}

@override
Future<void> delete(AssetId id) async {
Expand Down
154 changes: 154 additions & 0 deletions build/lib/src/state/filesystem_cache.dart
Original file line number Diff line number Diff line change
@@ -0,0 +1,154 @@
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

import 'dart:async';
import 'dart:convert';
import 'dart:typed_data';

import '../asset/id.dart';
import 'lru_cache.dart';

/// Cache for file existence and contents.
///
/// TODO(davidmorgan): benchmark, optimize the caching strategy.
abstract interface class FilesystemCache {
/// Clears all [ids] from all caches.
///
/// Waits for any pending reads to complete first.
Future<void> invalidate(Iterable<AssetId> ids);

/// Whether [id] exists.
///
/// Returns a cached result if available, or caches and returns `ifAbsent()`.
Future<bool> exists(AssetId id, {required Future<bool> Function() ifAbsent});

/// Reads [id] as bytes.
///
/// Returns a cached result if available, or caches and returns `ifAbsent()`.
Future<Uint8List> readAsBytes(
AssetId id, {
required Future<Uint8List> Function() ifAbsent,
});

/// Reads [id] as a `String`.
///
/// Returns a cached result if available, or caches and returns `ifAbsent()`.
///
/// The encoding used is always `utf8`. For other encodings, use
/// [readAsBytes].
Future<String> readAsString(
AssetId id, {
required Future<String> Function() ifAbsent,
});
}

/// [FilesystemCache] that always reads from the underlying source.
class PassthroughFilesystemCache implements FilesystemCache {
const PassthroughFilesystemCache();

@override
Future<void> invalidate(Iterable<AssetId> ids) async {}

@override
Future<bool> exists(
AssetId id, {
required Future<bool> Function() ifAbsent,
}) => ifAbsent();

@override
Future<Uint8List> readAsBytes(
AssetId id, {
required Future<Uint8List> Function() ifAbsent,
}) => ifAbsent();

@override
Future<String> readAsString(
AssetId id, {
required Future<String> Function() ifAbsent,
}) => ifAbsent();
}

/// [FilesystemCache] that stores data in memory.
class InMemoryFilesystemCache implements FilesystemCache {
/// Cached results of [readAsBytes].
final _bytesContentCache = LruCache<AssetId, Uint8List>(
1024 * 1024,
1024 * 1024 * 512,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these numbers copied from somewhere?
Do we have any numbers on "normal" usage of these, because 2 x 0.5 GB seems quite high to me...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's copied.

Added a TODO to look into the caching strategy; I suspect we can do something simpler. It would be surprising if we can't fit one build's worth of input in memory ... particularly given that when running with the analyzer, there is exactly another copy of all the inputs in the in-memory filesystem we give to the analyzer :)

Thanks.

(value) => value.lengthInBytes,
);

/// Pending [readAsBytes] operations.
final _pendingBytesContentCache = <AssetId, Future<Uint8List>>{};

/// Cached results of [exists].
///
/// Don't bother using an LRU cache for this since it's just booleans.
final _canReadCache = <AssetId, Future<bool>>{};

/// Cached results of [readAsString].
///
/// These are computed and stored lazily using [readAsBytes].
///
/// Only files read with [utf8] encoding (the default) will ever be cached.
final _stringContentCache = LruCache<AssetId, String>(
1024 * 1024,
1024 * 1024 * 512,
(value) => value.length,
);

/// Pending `readAsString` operations.
final _pendingStringContentCache = <AssetId, Future<String>>{};

@override
Future<void> invalidate(Iterable<AssetId> ids) async {
// First finish all pending operations, as they will write to the cache.
for (var id in ids) {
await _canReadCache.remove(id);
await _pendingBytesContentCache.remove(id);
await _pendingStringContentCache.remove(id);
}
for (var id in ids) {
_bytesContentCache.remove(id);
_stringContentCache.remove(id);
}
}

@override
Future<bool> exists(
AssetId id, {
required Future<bool> Function() ifAbsent,
}) => _canReadCache.putIfAbsent(id, ifAbsent);

@override
Future<Uint8List> readAsBytes(
AssetId id, {
required Future<Uint8List> Function() ifAbsent,
}) {
var cached = _bytesContentCache[id];
if (cached != null) return Future.value(cached);

return _pendingBytesContentCache.putIfAbsent(id, () async {
final result = await ifAbsent();
_bytesContentCache[id] = result;
unawaited(_pendingBytesContentCache.remove(id));
return result;
});
}

@override
Future<String> readAsString(
AssetId id, {
required Future<String> Function() ifAbsent,
}) async {
var cached = _stringContentCache[id];
if (cached != null) return cached;

return _pendingStringContentCache.putIfAbsent(id, () async {
final result = await ifAbsent();
_stringContentCache[id] = result;
unawaited(_pendingStringContentCache.remove(id));
return result;
});
}
}
10 changes: 10 additions & 0 deletions build/lib/src/state/reader_state.dart
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,17 @@ import '../asset/reader.dart';
import 'asset_finder.dart';
import 'asset_path_provider.dart';
import 'filesystem.dart';
import 'filesystem_cache.dart';
import 'input_tracker.dart';

/// Provides access to the state backing an [AssetReader].
extension AssetReaderStateExtension on AssetReader {
/// Returns a new instance with optionally updated [cache].
AssetReader copyWith({FilesystemCache? cache}) {
_requireIsAssetReaderState();
return (this as AssetReaderState).copyWith(cache: cache);
}

Filesystem get filesystem {
_requireIsAssetReaderState();
return (this as AssetReaderState).filesystem;
Expand Down Expand Up @@ -52,6 +59,9 @@ extension AssetReaderStateExtension on AssetReader {

/// The state backing an [AssetReader].
abstract interface class AssetReaderState {
/// Returns a new instance with optionally updated [cache].
AssetReader copyWith({FilesystemCache? cache});

/// The [Filesystem] that this reader reads from.
///
/// Warning: this access to the filesystem bypasses reader functionality
Expand Down
Loading
Loading