Proofs

Overview

The Filecoin protocol uses cryptographic proofs to ensure the two following guarantees:

Storage Based Consensus: Miners’ power in the consensus is proportional to their amount of storage. Miner increase their power by proving that they are dedicating unique storage to the network.
Verifiable Storage Market: Miners must be proving that they are dedicating unique physical space for each copy of the clients data through a period of time.

We briefly describe the three main Filecoin proofs:

Proof-of-Replication (PoRep) proves that a Storage Miner is dedicating unique dedicated storage for each sector. Filecoin Storage Miners collect new clients’ data in a sector, run a slow encoding process (called Seal) and generate a proof (SealProof) that the encoding was generated correctly.

In Filecoin, PoRep provides two guarantees: (1) space-hardness: Storage Miners cannot lie about the amount of space they are dedicating to Filecoin in order to gain more power in the consensus; (2) replication: Storage Miners are dedicating unique storage for each copy of their clients data.

Proof of Spacetime proves that an arbitrary number of sealed sectors existed over a specified period of time in their own dedicated storage — as opposed to being generated on-the-fly at proof generation time.
Piece-Inclusion-Proof proves that a given piece is contained within a specified sealed sector.

Glossary

Throughout this document, the following definitions are used:

sector: a fixed-size block of data of SECTOR_SIZE bytes which generally contains clients’ data.
piece: a block of data of at most SECTOR_SIZE bytes which is generally is a client’s file or part of.
original data: the concatenation of a sector’s constitutent pieces, all piece padding, and any terminal padding.
unsealed sector: a concrete representation (on disk or in memory) of a sector’s original data.
sealed sector: a concrete representation (on disk or in memory) of the unique replica generated by Seal from an unsealed sector.
piece padding: a block of zero or more ‘zero bytes’ inserted between pieces to ensure they are positioned within the containing sector in a way compatible with the Piece Inclusion Proof.
terminal padding: a block of zero or more ‘zero bytes’ inserted after a sector’s final piece, ensuring that the length of the original data is SECTOR_BYTES.
preprocessing: a transformation applied to an unsealed sector as the first stage of sealing and which may increase the size of the data.
preprocessed data: the result of preprocessing the original data.
preprocessed sector: a concrete representation (on disk or in memory) of a sector’s preprocessed data.
SNARK proof: a block of bytes which proves that the creator is in possession of a satisfying assignment to a quadratic arithmetic program (circuit).
Multi-SNARK proof: a block of one or more SNARK proofs, each proving partition of the total set of challenges to be proved.
Merkle inclusion proof: a proof (whose format is unspecified here) that a given leaf block is contained within a merkle tree whose root is the commitment associated with the proof.
commitment: an opaque block of data to which a prover ‘commits’, enabling subsequent proofs which cannot be validly constructed unless the commitment itself was validly constructed. For example: the output of a suitable pseudorandom collision-resistant hash function may serve as a commitment to the data which is the preimage of that hash. Publication of the commitment proves that the creator was in possession of the preimage at the time the commitment was generated.
prover: the party who generates a proof, in Filecoin it’s always the Storage Miner.
verifier: the party who verifies a proof generated by a prover, in Filecoin it’s a full node.

The SectorID defined by the chain is au64, which gets encoded as a 31-byte array for the purpose of proofs. This transform consists of encoding the number to its little-endian byte representation and then zero-pad to 31-bytes.

Example:

uint64(1025) -> [1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

Proof-of-Replication Algorithms

Filecoin Proof of Replication generates a unique copy (sealed sector) of a sector’s original data, a Multi-SNARK proof, and a set of commitments identifying the sealed sector and linking it to the corresponding unsealed sector.

Seal

Seal has the side effect of generating a sealed sector from an unsealed sector, and returns identifying commitments and a Multi-SNARK proof. The proof returned is a Multi-SNARK proof.

The commitments are used to verify that the correct original data was sealed, and that the correct sealed data is the subject of later Proofs of Spacetime, proving that this data is being stored continuously.

Seal operates by performing a slow encoding of the unsealed sector — such that it is infeasible for a dishonest prover to computationally regenerate the sealed sector quickly enough to satisfy subsequent required Proofs of Spacetime — thus ensuring that the sealed sector remains manifest as a unique, concrete representation of the original data.

Seal
 (
  // request represents a request to seal a sector.
  partitions     uint64,      // influences the size of the output proof; using less partitions requires more hardware but produces shorter proofs
  sectorSize     uint64,      // the number of bytes in the sealed sector
  unsealedPath   string,      // path of unsealed sector (regular file, ramdisk, etc.) from which a unique replica will be created
  sealedPath     string,      // path to which sealed sector will be written
  proverID       [31]byte,    // uniquely identifies miner
  ticket         [32]byte,    // ticket to which miner commits when sealing begins
  sectorID       [31]byte,    // uniquely identifies sector
 ) err Error | (
  // response contains the commitments resulting from a successful Seal().
  commD          [32]byte,    // data commitment: merkle root of original data
  commR          [32]byte,    // replica commitment: merkle root of replicated data [will be removed in future iteration]
  commRStar      [32]byte,    // a hash of intermediate layers
  proof          []byte,
 )

VerifySeal

VerifySeal is the functional counterpart to Seal’s proof component. It takes all of Seal's outputs, along with those of Seal’s inputs which are required to uniquely identify the created sealed sector. This allows a verifier to determine whether a given proof is valid. All inputs are required because verification requires sufficient context to determine not only that a proof is valid but also that the proof indeed corresponds to what it purports to prove.

VerifySeal
 (
  // request represents a request to verify the output of a Seal() operation.
  commD       [32]byte, // returned from Seal
  commR       [32]byte, // returned from Seal [will be removed in future iteration]
  commRStar   [32]byte, // returned from Seal
  proof       []byte,   // returned from Seal
  proverID    [31]byte, // uniquely identifies miner
  ticket      [32]byte, // ticket to which miner committed when sealing began
  sectorID    [31]byte, // uniquely identifies sector
) err Error |
  IsValid bool          // true iff the provided proof-of-replication os valid

Unseal

Unseal is the counterpart to Seal’s encoding side-effect. It reverses the ‘slow encoding’ and creates an unsealed sector from a sealed sector as a special case of its more general function. In general, it allows extraction of a range of bytes (specified in terms of the layout of the original data).

Unseal
 (
  // request represents a request to unseal a sector.
  sectorSize    uint64,   // the number of bytes in the sealed sector
  sealedPath    string,   // path from which sealed bytes will be read
  outputPath    string,   // path to which unsealed bytes will be written (regular file, ramdisk, etc.)
  proverID      [31]byte, // uniquely identifies miner
  sectorID      [31]byte, // uniquely identifies sector
  ticket        [32]byte, // ticket to which miner committed when sealing began
  startOffset   uint64,   // zero-based byte offset in original, unsealed sector-file
  numBytes      uint64,   // number of bytes to unseal (corresponds to contents of unsealed sector-file)
 ) err Error |
  NumBytesWritten uint64  // the number of bytes unsealed (and written) by Unseal()

Security Notes

Guaranteeing sector uniqueness

Every sealed sector is unique, even if the unsealed data is identical. This prevents a malicious miner from storing the same sector twice without dedicating twice the amount of storage, or two malicious miners pretending to store the same sector, but only storing one copy. Sector uniqueness is guaranteed by having unique proverId and sectorId. Each miner has a unique proverID, and each sector has a unique sectorIDwithin that miner’s sectors. Taken together, proverID and sectorID are globally unique . Both the proverId and the sectorId are used to encode the sealed data.

The Filecoin node verifies that the correct proverId and sectorId is used when verifying the proof.

Piece Inclusion Proof

A PieceInclusionProof contains a potentially complex Merkle inclusion proof that all leaves included in commP (the piece commitment) are also included in commD (the sector data commitment).

struct PieceInclusionProof {
    Position uint,
    ProofElements [32]byte
}

GeneratePieceInclusionProofs

GeneratePieceInclusionProofs takes a merkle tree and a slice of piece start positions and lengths (in nodes), and returns a vector of PieceInclusionProofs corresponding to the pieces. For this method to work, the piece data used to validate pieces will need to be padded as necessary, and pieces will need to be aligned (to 128-byte chunks, after padding, due to details of preprocessing) when written. This assumes that pieces have been packed and padded according to the assumptions of the algorithm. For this reason, practical implementations should also provide a function to assist in correct packing of pieces. All pieces will be zero-padded such that the total length of the piece is a multiple of 127 bytes before preprocessing and a multiple of 128 bytes after pre-processing.

GeneratePieceInclusionProofs
 (
  Tree MerkleTree,
  PieceStarts []uint
  PieceLengths []uint,
 ) []PieceInclusionProof

GeneratePieceInclusionProofs takes a merkle tree, an array of the index positions of the first nodes of the pieces, and an array of the corresponding piece lengths. It returns an array of PieceInclusionProofs corresponding to the supplied start/length pairs — each of which specifies a piece as a sequence of leaves ([start..start+length]) of Tree.

GeneratePieceInclusionProof
 (
  tree          MerkleTree,
  firstNode     uint,
  pieceLength   uint,
 ) err Error |  proof PieceInclusionProof

The structure of a PieceInclusionProof is determined by the start position and length of the piece to be proved. (Note that if start + length is greater than the number of leaves in Tree, and no PieceInclusionProof can be generated since these parameters do not specify a valid piece.)

The form of a PieceInclusionProof is as follows: - The piece’s position within the tree must be specified. This, combined with the length provided during verification completely determines the shape of the proof. - The remainder of the proof is a series of ProofElements, whose order is interpreted by the proof algorithm and is not (yet: TODO) specified here. The significance of the provided ProofElements is described by their role in the verification algorithm below.

VerifyPieceInclusionProof takes a sector data commitment (commD), piece commitment (commP), sector size, and piece size. Iff it returns true, then PieceInclusionProof indeed proves that all of piece’s bytes were included in the merkle tree corresponding to commD of a sector of sectorSize. The size inputs are necessary to prevent malicious provers from claiming to store the entire piece but actually storing only the piece commitment.

VerifyPieceInclusionProof
 (
  proof PieceInclusionProof,
  commD  [32]byte,
  commP [32]byte,
  sectorSize uint,
  pieceSize uint,
 ) err Error | IsValid bool // true iff the provided PieceInclusionProof is valid.

The abstract verification algorithm is described below. Only the algorithm for an aligned PieceInclusionProof is fully specified here. [TODO: provide details fully specifying the ordering of ProofElements in the general case.]

A PieceInclusionProof includes a start position and a sequence of ProofElements. Verification of a PieceInclusionProof is with respect to a given commP, commD, sector size, and piece size.

Piece size and start position are used, as in proof generation, to determine the shape of the proof. This shape fully determines the inputs to and order of applications of RepCompress which constitute the proof.

Proof verification is as follows: - A ProofElement is a 32-byte value which will be supplied as either the left or right input to RepCompress (along with an appropriate height) to combine with a complementary (right or left) ProofElement already known to the verifier. - Each time RepCompress is called on a pair of known ProofElements, a new ProofElement is considered to be known to the verifier. - Initially, the verifier knows only of two ProofElements, commP (the Piece Commitment) and commD (the Data Commitment). - The proof proceeds in two stages: 1. Zero or more candidate ProofElements are proved to hash to commP through subsequent applications of RepCompress. Only after commP has been constructed from a set of candidate ProofElements do those ProofElements become eligible for use in the second phase of the proof. (Because RepCompress takes height as an input, only ProofElements from the same height in either the piece or data tree’s can be combined. The output of RepCompress is always a ProofElement with height one greater than that of its two inputs.) - commP itself is by definition always eligible for use in the second proof phase. - An aligned PieceInclusionProof is one whose start index is a power of 2, and for which either its piece’s length is a power of 2 or the piece was zero-padded with Piece Padding when packed in a sector. In these cases, commP exists as a node in the data tree, and a minimal-size PieceInclusionProof can be generated. - In the case of an aligned PieceInclusionProof, zero candidate ProofElements are required. (This means that commP is the only eligible ProofElement.) 2. Provided ProofElements are added to the set of eligible ProofElements by successive application of RepCompress, as in the first phase. - When commD is produced by application of an eligible ProofElement and a ProofElement provided by the proof, the proof is considered complete. - If all eligible ProofElements have been used as inputs to RepCompress and are dependencies of the final construction of commD, then the proof is considered to be valid.

NOTE: in the case of an aligned PieceInclusionProof the ProofElements take the form of a standard Merkle inclusion proof proving that commP is contained in a sub-tree of the data tree whose root is commD. Because commP’s position within the data tree is fully specified by the tree’s height and the piece’s start position and length, the verifier can deterministically combine each successive ProofElement with the result of the previous RepCompress operation as either a right or left input to the next RepCompress operation. In this sense, an aligned PieceInclusionProof is simply a Merkle inclusion proof that commP is a constituent of commD’s merkle tree and that it is located at the appropriate height for it to also be the root of a (piece) tree with length leaves.

In the non-aligned case, the principle is similar. However, in this case, commP does not occur as a node in commD’s Merkle tree. This is because the original piece has been packed out-of-order to minimize alignment padding in the sector (at the cost of a larger PieceInclusionProof). Because commP does not exist as the root of a data sub-tree, it is necessary first to prove that the root of every sub-tree into which the original piece has been decomposed (when reordering) is indeed present in the data tree. Once each candidate ProofElement has been proved to be an actual constituent of commP, it must also be shown that this eligible ProofElement is also a constituent of commD.