
Commit be57507

Author: Andrew Law (committed)
Resolve Chester comments
1 parent 52bdeed commit be57507

File tree: 1 file changed (+15, -7 lines)


docs/src/integrity/integrity.rst

Lines changed: 15 additions & 7 deletions
@@ -6,13 +6,14 @@ The integrity module of Opaque ensures that the untrusted job driver hosted on t
 Opaque runs on Spark, which utilizes data partitioning to speed up computation.
 Specifically, Catalyst will compute a physical query plan for a given dataframe query and delegate Spark workers (run on enclaves) to compute Spark SQL operations on data partitions.
 Each of these individual units is trusted, but the intermediary steps in which the units communicate is controlled by the job driver, running as untrusted code in the cloud.
-The integrity module will detect if the job driver has deviated from the query plan computed by Catalyst.
+The integrity module will detect foul play by the job driver, including deviating from the query plan computed by Catalyst,
+shuffling data across data partitions in an unexpected manner, spoofing extra data between ecalls, or dropping output between ecalls.
 
 Overview
 --------
-The main idea behind integrity support is to tag each step of computation with a MAC, attached by the enclave worker when it has completed its computation.
-All MACs received by all previous enclave workers are logged. In the end, these MACs are compared and reconstructed into a graph.
-This graph is compared to that computed by Catalyst.
+The main idea behind integrity support is to tag each step of computation with a MAC over the individual enclave worker's encrypted output, attached by the enclave worker when it has completed its computation.
+All MACs received from previous enclave workers are logged. At the end, during post verification, these MACs, each of which represents an ecall at a data partition, are reconstructed into a graph.
+This graph is compared to the DAG of the query plan computed by Catalyst.
 If the graphs are isomorphic, then no tampering has occurred.
 Else, the result of the query returned by the cloud is rejected.
 
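The post-verification idea added above can be illustrated with a short sketch (not Opaque's actual implementation; every name and type below is an assumption): the logged MACs become nodes of the executed graph, which is then compared to the DAG implied by Catalyst's physical plan.

.. code-block:: scala

    // Conceptual sketch only; names and types are assumptions, not Opaque's API.
    object PostVerificationSketch {
      // Each logged MAC identifies one ecall's output on one data partition.
      // A child ecall also logs the MACs it received, giving parent -> child edges.
      type MacGraph = Map[String, Set[String]]

      // Rebuild the executed graph from per-ecall logs of
      // (ownMac, macsReceivedFromParents) pairs.
      def reconstructExecutedGraph(logs: Seq[(String, Seq[String])]): MacGraph =
        logs.map { case (ownMac, received) => ownMac -> received.toSet }.toMap

      // Compare against the DAG implied by Catalyst's physical plan. A full
      // isomorphism check is more involved; here a relabeling of MACs to
      // expected node ids is assumed, and only adjacency structure is compared.
      def matchesExpected(executed: MacGraph,
                          expected: Map[Int, Set[Int]],
                          label: String => Int): Boolean =
        executed.map { case (mac, parents) => label(mac) -> parents.map(label) } == expected
    }

If ``matchesExpected`` returned ``false``, the query result would be rejected, mirroring the behavior described above.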

@@ -22,8 +23,9 @@ Two main extensions were made to support integrity - one in enclave code, and on
 
 Enclave Code
 ^^^^^^^^^^^^
-In the enclave code (C++), modifications were made to the ``FlatbuffersWriters.cpp`` file.
-Attached to every output of an ``EncryptedBlocks``` object is a MAC over the output.
+In the enclave code (C++), modifications were made to the ``FlatbuffersWriters.cpp`` and ``FlatbuffersReaders.cpp`` files.
+The "write" change attaches a MAC over the ``EncryptedBlocks`` object to the output.
+The "read" change checks that every block output by the previous ecall was received by the subsequent ecall.
 No further modifications need to be made to the application logic since this functionality hooks into how Opaque workers output their data.
 
 Scala/Application Code
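The enclave-side changes themselves are C++ in ``FlatbuffersWriters.cpp`` and ``FlatbuffersReaders.cpp``; the sketch below only illustrates the intent of the two checks in Scala, and every name as well as the choice of HMAC-SHA256 is an assumption rather than the enclave implementation.

.. code-block:: scala

    import javax.crypto.Mac
    import javax.crypto.spec.SecretKeySpec

    // Conceptual sketch of the write/read integrity checks; all names and the
    // choice of HMAC-SHA256 are assumptions, not the enclave code.
    object EnclaveIntegritySketch {
      // "Write" side: tag the serialized EncryptedBlocks output with a MAC.
      def macOverOutput(key: Array[Byte], serializedBlocks: Array[Byte]): Seq[Byte] = {
        val hmac = Mac.getInstance("HmacSHA256")
        hmac.init(new SecretKeySpec(key, "HmacSHA256"))
        hmac.doFinal(serializedBlocks).toSeq
      }

      // "Read" side: every block the previous ecall output (identified by its
      // MAC) must be received by this ecall -- nothing dropped, nothing spoofed.
      def allBlocksAccountedFor(sentMacs: Set[Seq[Byte]], receivedMacs: Set[Seq[Byte]]): Boolean =
        sentMacs == receivedMacs
    }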
@@ -57,4 +59,10 @@ This amounts to adding a case in the switch statement of this function.
 
 Furthermore, add the logic to connect the ecalls together in ``linkEcalls``.
 As above, this amounts to adding a case in the switch statement of this function, but requires knowledge of how each ecall communicates the transfer of data partitions to its successor ecall
-(broadcast, all to one, one to all, etc.).
+(broadcast, all to one, one to all, etc.).
+
+Usage
+^^^^^
+To use the Job Verification Engine as a black box, make sure that its state is flushed by calling its ``resetForNextJob`` function.
+Then, you can call ``Utils.verifyJob`` on the query dataframe, which returns a boolean indicating whether the job passed post verification.
+It returns ``True`` if the job passed and ``False`` otherwise.
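The case added to ``linkEcalls`` (described earlier in this hunk) encodes how an ecall hands its data partitions to its successor. A minimal sketch of such a case, with hypothetical ecall names and node type (only ``linkEcalls`` itself comes from the documentation above):

.. code-block:: scala

    // Sketch of a linkEcalls-style case; ecall names and types are hypothetical.
    // Each node stands for one ecall executed on one data partition; linking
    // produces the parent -> child edges expected for that pair of ecalls.
    object LinkEcallsSketch {
      case class EcallNode(ecall: String, partition: Int)

      def linkEcalls(parents: Seq[EcallNode],
                     children: Seq[EcallNode]): Seq[(EcallNode, EcallNode)] =
        (parents.headOption.map(_.ecall), children.headOption.map(_.ecall)) match {
          // Hypothetical all-to-one step: every partition feeds a single collector.
          case (Some("sample"), Some("findRangeBounds")) =>
            for (p <- parents; c <- children.take(1)) yield (p, c)
          // Hypothetical one-to-all (broadcast) step: one producer feeds every partition.
          case (Some("findRangeBounds"), Some("partitionForSort")) =>
            for (p <- parents.take(1); c <- children) yield (p, c)
          // Default: partition-wise, parent i feeds child i.
          case _ =>
            parents.zip(children)
        }
    }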

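Putting the Usage section together, a call site might look like the sketch below. Only ``resetForNextJob`` and ``Utils.verifyJob`` come from the documentation above; the import paths and the wrapper function are assumptions made for illustration.

.. code-block:: scala

    import org.apache.spark.sql.DataFrame
    // Assumed import paths; only resetForNextJob and Utils.verifyJob are
    // documented above.
    import edu.berkeley.cs.rise.opaque.{JobVerificationEngine, Utils}

    // Hypothetical wrapper around the documented two-step usage pattern.
    def runPostVerification(df: DataFrame): Boolean = {
      // Flush any Job Verification Engine state left over from a previous job.
      JobVerificationEngine.resetForNextJob()

      // Verify the query dataframe: returns true if the job passed
      // post verification, false if the result should be rejected.
      Utils.verifyJob(df)
    }

The caller would reject the query result whenever the returned boolean is ``False``.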