Spark score for retrievals using any available provider #254

@bajtos

In https://space-meridian.slack.com/archives/C06RPCL6QGL/p1742802306479569, we discussed how different storage DePINs address the availability of retrievals.

  • In Walrus, each sliver is stored on N nodes. The content is considered retrievable when at least F nodes serve the slivers, where F<N.
  • In Spark, we consider content retrievable only if the SP under test serves retrieval. We ignore copies stored with other SPs or even on IPFS nodes outside of Filecoin.

This difference makes it difficult to compare retrievability scores for different storage networks.

Let's add a new Spark score that measures how many deals (CIDs) can be retrieved from the network using any available retrieval provider, including non-Filecoin nodes running IPFS.

  • If a piece is stored with multiple SPs but only one of them serves retrievals, the new score should flag this content as retrievable. That matches the experience of retrieval clients: they wanted to retrieve a CID and they got their content back, so all was good.
  • The new score will demonstrate the real-world benefits of content addressing and IPFS-based retrievals.
  • The new score can potentially double the observed RSR (retrieval success rate) of data stored on Filecoin.

Notes:

  • This new score is useful as a network-wide metric only. It must not affect the current per-miner/per-client/per-allocator RSR metrics. We shouldn't even collect it with per-miner/per-client/per-allocator granularity.
  • This new check should be added to the existing Spark infrastructure, similarly to how we added HTTP HEAD retrieval checking (see spark-checker#104, "Test HEAD requests before GET").
  • Proposed algorithm:
    • If the current retrieval check passes, the outcome of the new check is "OK".
    • If the current retrieval check fails because of an IPNI error (e.g. 404), the new check reports the same outcome.
    • Only when the IPNI lookup fails with NO_VALID_ADVERTISEMENT do we try to retrieve the payload CID from all providers found in the IPNI lookup response (potentially de-duplicating entries from the same provider that differ only in protocol).
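
A minimal TypeScript sketch of the proposed algorithm, to make the three branches concrete. The `Provider` and `CheckResult` shapes, the function names, and the injected `tryRetrieve` callback are all hypothetical and do not reflect Spark's actual code. It also assumes two things the proposal leaves open: that one successful retrieval from any provider is enough for "OK", and that the original outcome is kept when every provider fails.

```typescript
// Hypothetical shapes for illustration only; not Spark's real types.
type Provider = { peerId: string; protocol: string; address: string }
type CheckResult = {
  outcome: string        // outcome of the current check, e.g. 'OK'
  providers: Provider[]  // every provider listed in the IPNI lookup response
}

const NO_VALID_ADVERTISEMENT = 'NO_VALID_ADVERTISEMENT'

// Providers the new check should try. The list is empty unless the current
// check failed with NO_VALID_ADVERTISEMENT. Entries are de-duplicated by
// peer ID, keeping the first protocol listed for each peer.
function fallbackCandidates (check: CheckResult): Provider[] {
  if (check.outcome !== NO_VALID_ADVERTISEMENT) return []
  const byPeer = new Map<string, Provider>()
  for (const p of check.providers) {
    if (!byPeer.has(p.peerId)) byPeer.set(p.peerId, p)
  }
  return Array.from(byPeer.values())
}

// Outcome of the new network-wide check. `tryRetrieve` is an injected,
// hypothetical function that resolves to true when the provider serves
// the payload CID.
async function networkWideOutcome (
  check: CheckResult,
  tryRetrieve: (provider: Provider) => Promise<boolean>
): Promise<string> {
  for (const provider of fallbackCandidates(check)) {
    if (await tryRetrieve(provider)) return 'OK'
  }
  // Covers a passing check, other IPNI errors, and the assumed case where
  // no provider could serve the content.
  return check.outcome
}
```

De-duplicating by peer ID keeps the number of retrieval attempts small, but it can skip a protocol that would have worked. Trying every (peer, protocol) pair is the more thorough alternative.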

Status

🧊 icebox

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions