WO2008147400A1 - Authentication for operations over an outsourced file system stored by an untrusted unit - Google Patents

Info

Publication number
WO2008147400A1
WO2008147400A1 PCT/US2007/024642 US2007024642W WO2008147400A1 WO 2008147400 A1 WO2008147400 A1 WO 2008147400A1 US 2007024642 W US2007024642 W US 2007024642W WO 2008147400 A1 WO2008147400 A1 WO 2008147400A1
Authority
WO
WIPO (PCT)
Prior art keywords
digest
file system
proof
tree structure
hash value
Prior art date
Application number
PCT/US2007/024642
Other languages
French (fr)
Inventor
Roberto Tamassia
Michael T. Goodrich
Nikolaos Triandopoulos
Charalampos Papamanthou
Original Assignee
Brown University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Brown University
Publication of WO2008147400A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H04L63/123 Applying verification of the received information received data contents, e.g. message integrity

Definitions

  • the exemplary embodiments of this invention relate generally to data authentication and, more specifically, relate to authentication for operations over an outsourced file system stored by an untrusted unit.
  • One consideration is that of authenticating an outsourced file system in a setting where data resides at remote storage units of untrusted host machines, outside of any administrative control. It is generally desirable to efficiently (e.g., with logarithmic complexity) verify the integrity of a dynamic file system, namely to verify that its status is consistent with the history of file-system operations ordered by a client, and correctly detect any malicious access or data-retrieval patterns by the server.
  • one goal is to verify the directory hierarchy of the file system, an important task, since, in many cases, the integrity of a file depends not only on its content, but also on its location in the file system. For example, the context of an .htaccess file depends on its location - its
  • One conventional technique is to have the client (which can abstract to an operating system (OS) kernel supporting many users) sign each file system update it makes in the outsourced file system (e.g., using a hashed message authentication code (HMAC) based on a key that it keeps secret from the server).
  • This technique has some drawbacks, however. First, it allows for replay attacks since determining file freshness is difficult with such a system. Second, this technique requires the client to sign every possible path in the directory hierarchy in order to be able to authenticate locations. This last requirement is especially inefficient, for example, when the client performs the directory operation mv that moves a large directory to a new location.
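  • The two drawbacks above can be illustrated with a minimal sketch, assuming the client tags each file version with an HMAC over its path and content. The key, the paths and the function names `sign_update` and `verify_file` are hypothetical and chosen only for illustration; the source does not specify a message format.

```python
import hashlib
import hmac

SECRET_KEY = b"client-secret-key"  # hypothetical key, kept secret from the server


def sign_update(path: str, content: bytes) -> bytes:
    """Tag one file version with an HMAC over its path and content."""
    message = path.encode() + b"\x00" + content
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def verify_file(path: str, content: bytes, tag: bytes) -> bool:
    """Check that (path, content) was signed by the client at some point."""
    return hmac.compare_digest(sign_update(path, content), tag)


# The client writes version 1, then overwrites it with version 2.
old_content, new_content = b"version 1", b"version 2"
old_tag = sign_update("/docs/a.txt", old_content)
new_tag = sign_update("/docs/a.txt", new_content)

# Drawback 1: a server that replays the stale version still passes the check,
# because the tag says nothing about freshness.
replay_accepted = verify_file("/docs/a.txt", old_content, old_tag)

# Drawback 2: the tag is bound to the path, so after a move every affected
# path would have to be signed again by the client.
moved_accepted = verify_file("/archive/a.txt", new_content, new_tag)
```

  In this sketch the replayed version is accepted and the moved file is rejected until it is re-signed, which is the inefficiency noted for a large directory move.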
  • Another technique is to assume that the outsourced file system is partially trustworthy or has some tamper-resistant trusted hardware as a part of its architecture (e.g., using trusted computing platforms). Such an assumption involves postulating that the networked file system is itself at least partially trusted, which is not much different from simply trusting the hosting server in the first place.
  • a method for a client unit to interact with a file system stored by an untrusted server unit includes: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure; issuing to the untrusted server unit an operation to be performed on the file system; and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit.
  • In another exemplary embodiment of the invention, an apparatus includes: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor.
  • a method includes: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system; receiving from a client unit an instruction to perform an operation on the file system; and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
  • In another exemplary aspect of the invention, an apparatus includes: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
  • FIG. 1 shows a schematic illustration of an exemplary authenticated storage model within which exemplary embodiments of the invention may be utilized
  • FIG. 2(a) depicts an exemplary skip-list hashing scheme for verifying operations on a map data structure
  • FIG. 2(b) illustrates an exemplary consistency proof P for the exemplary skip-list hashing scheme of FIG. 2(a);
  • FIG. 3(a) shows an exemplary tree structure T corresponding to an exemplary file system;
  • FIG. 3(b) depicts an exemplary tree of paths corresponding to the exemplary tree structure T shown in FIG. 3(a);
  • FIG. 4 illustrates a simplified block diagram of various exemplary electronic devices that are suitable for use in practicing the exemplary embodiments of this invention
  • FIG. 5 depicts a flowchart illustrating one non-limiting example of a method for practicing the exemplary embodiments of this invention.
  • FIG. 6 depicts a flowchart illustrating another non-limiting example of a method for practicing the exemplary embodiments of this invention.
  • Authenticated Data Storage Systems. Previous and related work on authenticated data storage considers integrity at the file or data block level. Most of the systems provide file integrity using authentication information at the client that is proportional to the size of the file system. The most efficient constructions generally involve the use of Merkle trees over the data blocks of a single file. The SUNDR system provides integrity protection by employing hash-tree schemes and digital signatures. However, it makes heavy use of signatures (e.g., every operation is signed). SUNDR does not assume the existence of a secure module (i.e., client) and operates in a fully-distributed setting; thus it suffers from consistency limitations inherent in this multi-client model. Similarly, the SiRiUS system stores a digital signature for each file.
  • an efficient authentication scheme of large, dynamic data sets using Galois/Counter Mode is described where a constant amount of memory is used.
  • efficient protocols for proving availability of static data in remote untrusted storage units are presented.
  • a recent technique for authenticated network storage proposes to use a Merkle tree as the underlying data structure; however, PKI is used and the hash root is outsourced to an external medium, raising communication as well as security issues.
  • the trusted digest is remotely maintained by the client at the server side, assuming a trusted storage component.
  • authenticated storage is memory checking, where a trusted checker checks the correctness of an untrusted memory, using cryptographic primitives and hashing primitives.
  • the model of authenticated data structures studies cryptographic techniques for authenticating that data structures that reside at untrusted hosts operate reliably with respect to updates performed by the owner of the data and queries issued by the users of the data structure. Techniques are known for authenticating the results of different types of queries, including set-membership queries, SQL queries on databases, geometric queries, and XML queries.
  • authenticated data structures are closely related to the authenticated storage problem, there exist important differences in the models used. Authenticated data structures use a three-party data verification model and the data source stores the entire data set. In contrast, outsourced data authentication (i.e., the authenticated storage problem) uses a two-party model and the data source stores only minimal information about the data, and consequently, data consistency and replay-attack safety are more challenging to meet.
  • the problem of authenticated storage is considered, where one desires to outsource a file-system to an untrusted server and yet ensure the file-system's integrity.
  • New exemplary architectures for authenticated outsourced storage are introduced. Using light-weight cryptographic primitives and efficient data structuring techniques, the exemplary authentication schemes allow a client to verify that the file-system is fully consistent with the history of updates or queries requested by the client. File-system operations are verified in time that is logarithmic in the size of the file-system using optimal storage complexity, constant storage overhead at the client and asymptotically no extra overhead at the server.
  • exemplary schemes described herein additionally verify the file-system directory structure, thus supporting the authentication of complex file-system operations (e.g., directory moves and navigation) and file-system meta-data (e.g., general directory attributes).
  • the exemplary architecture achieves generality by being platform-independent, as well as usability by operating transparently for end-users.
  • Various exemplary embodiments of the invention may be referred to herein as "Athos" (AuTHenticated Outsourced Storage), supporting an authenticated networked file system that allows for efficient verification of, for example, the integrity of contents and locations of files and directories.
  • provably secure protocols are constructed for authenticating file-system operations.
  • the client maintains some minimal cryptographic state (a "digest") that consistently represents the file-system, against which any file-system operation performed by the server can be verified to determine whether it has been executed correctly.
  • an efficient data structuring technique is employed for representing an entire file system in a way that facilitates meeting verification and efficiency goals.
  • Novel, exemplary techniques are presented for achieving consistency and file-system integrity verification: any update or query is validated at the client by having the server provide a succinct corresponding proof (e.g., through an authentication service module that runs in the untrusted memory and is, thus, also untrusted), which comprises, for example, partial data and hashes stored in the file-system data structure.
  • various conventional techniques provide: (a) O(1) client storage, O(n) update complexity and no hierarchy authentication; (b) O(log n) client storage, O(log n) update complexity and no hierarchy authentication; (c) O(n) client storage, O(log n) update complexity and no hierarchy authentication; and/or (d)
  • Athos provides improvements over these techniques by enabling O(1) client storage, O(log n) update complexity and hierarchy authentication for verification purposes.
  • FIG. 1 shows a schematic illustration of an exemplary authenticated data storage model 10 within which exemplary embodiments of the invention may be utilized. Owned by a client C 12 but hosted at a remote untrusted server S 14, a file system FS 16 evolves over time through a series of update and query operations, issued by C 12 and executed by S 14. At all times, C 12 stores a succinct state 20 (e.g., a digest of few bytes) of FS 16 that is consistent with the entire history of operations. Authentication of operations is performed through verification or consistency proofs that are provided to C 12 (along with any answers to the operations provided by FS 16) by an authentication service module 18 that runs in memory independently of FS 16 and is controlled by S 14. A proof is used by C 12 to verify the current operation and consistently update the state (digest). That is, the authentication service module 18 stores additional authentication information 22 about FS 16. The file system may be generated and queried through the series of update and query operations.
  • the client C 12 sends a query x 24 on the FS 16 to the untrusted server S 14.
  • the server S 14 performs the query operation and obtains an answer a 26.
  • the authentication service 18 generates a proof y 28.
  • the answer a 26 and the proof y 28 are sent to the client C 12.
  • the answer a 26 and the proof y 28 are sent together (e.g., in a single message or transmission) from the server S 14 to the client C 12.
  • the query x 24 may be received once by the server S 14, with the server S 14 internally sending the query x 24 both to be processed (i.e., to obtain the answer a 26) and to generate the proof y 28.
  • the specific functionality of the query x 24, the answer a 26 and the proof y 28 is explained in further detail below.
  • an exemplary communication protocol is as follows:
  • Client C keeps state information s and issues a query or update operation o ∈ O to the server S.
  • Server S performs the query or update operation o by accordingly answering the query (i.e., obtaining an answer) or updating the FS to a new version FS'. By running an authentication service (AS), S generates a verification proof or, respectively, a consistency proof n (generally referred to herein as a "proof"), which is returned to client C along with the result p of the operation; p is the corresponding answer if operation o is a query or the
  • This set of operations can be represented as a verify
  • the state s is not updated to state s'.
  • an error message or other indication of the failed verification may be output.
  • the above protocol and pair of algorithms may be considered an authenticated storage scheme.
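  • The message flow of the protocol above can be sketched as follows. This is a deliberately inefficient toy, not the scheme of the source: the file system is assumed to be a flat dictionary of string contents, and the "proof" is simply a snapshot of the whole state before the operation, so verification is linear rather than logarithmic. The names `digest`, `operate`, `server_certify` and `verify` are hypothetical.

```python
import hashlib
import json


def digest(fs: dict) -> str:
    """State s: a hash over a canonical encoding of the whole file system."""
    encoded = json.dumps(sorted(fs.items())).encode()
    return hashlib.sha256(encoded).hexdigest()


def operate(fs: dict, op: tuple):
    """Apply op to a copy of fs and return (result, new_fs)."""
    new_fs = dict(fs)
    kind, name = op[0], op[1]
    if kind == "read":
        return new_fs.get(name), new_fs
    if kind == "write":
        new_fs[name] = op[2]
        return "ok", new_fs
    raise ValueError(f"unknown operation {kind!r}")


def server_certify(fs: dict, op: tuple):
    """Server side: run the operation and return (result, proof, new_fs)."""
    result, new_fs = operate(fs, op)
    proof = dict(fs)  # toy proof: a snapshot of the state before the operation
    return result, proof, new_fs


def verify(state: str, op: tuple, result, proof: dict):
    """Client side: return (accepted, new_state); state is kept on rejection."""
    if digest(proof) != state:  # proof is inconsistent with the stored state
        return False, state
    expected, new_fs = operate(proof, op)
    if expected != result:  # server reported a wrong result
        return False, state
    return True, digest(new_fs)


# One round of the protocol: the client holds only `state`.
fs = {"a.txt": "hello"}
state = digest(fs)
op = ("write", "b.txt", "world")
result, proof, fs = server_certify(fs, op)
accepted, state = verify(state, op, result, proof)
```

  The point of the sketch is the interface: the client stores a constant-size state, and a rejected verification leaves that state unchanged.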
  • the security requirement such a scheme should satisfy expresses the intuitive property that the verification performed at C is a reliable test for the file system's integrity. Let operate(·, ·) be the algorithm that, given the current file
  • an authenticated storage scheme is time-efficient if the verification time is sub-linear in the file-system size |FS|.
  • An authenticated storage scheme is space-efficient if the state stored by the client C is sub-linear in |FS|, or space-optimal if the state is of constant size.
  • In Section 4, an exemplary time-efficient, space-optimal and secure authenticated storage scheme is presented for a rich set of operations on an outsourced file system.
  • a hashing scheme for a certain query type Q describes a systematic method for computing a digest from an underlying data set (e.g., a file system) by hierarchically applying a cryptographic hash function (e.g., that is collision-resistant) over data objects and previously produced hash values.
  • Hashing schemes produce digests satisfying an important property: answers to queries in Q on the data set define sequences of hash values produced by the hashing scheme that can serve as proofs of the answers' correctness, subject to the (correct and authentic) data digest.
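  • As a minimal sketch of a hashing scheme for membership queries, the following uses a plain Merkle tree over data blocks (a swapped-in, simpler structure than the skip lists and dynamic trees discussed in the source). The helper names, the padding constant and the domain-separation labels are assumptions made for illustration; a non-empty block list is assumed.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash length-prefixed parts so that part boundaries are unambiguous."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big") + part)
    return hasher.digest()


PAD = h(b"pad")  # filler node used when a level has an odd number of nodes


def build_levels(blocks: list) -> list:
    """Return all tree levels, leaves first; the last level holds the digest."""
    level = [h(b"leaf", block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [PAD]
            levels[-1] = level
        level = [h(b"node", level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: list, index: int) -> list:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def root_from_proof(block: bytes, index: int, proof: list) -> bytes:
    """Recompute the digest from one block and its sibling hashes."""
    current = h(b"leaf", block)
    for sibling in proof:
        if index % 2:
            current = h(b"node", sibling, current)
        else:
            current = h(b"node", current, sibling)
        index //= 2
    return current
```

  A verifier that holds only the digest accepts an answer when `root_from_proof` reproduces it; the proof contains one hash per tree level, which is where the logarithmic proof size comes from.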
  • the inspiration stems from authenticated data structures, where a data structure produced by a trusted source is replicated to an untrusted server for answering users' queries.
  • query verification is possible at/by the users.
  • the exemplary authenticated storage models presented herein can be seen as the model of authenticated data structures, where the source and the user coincide. But the relation stops here, because exemplary models for the invention require that the client stores only a single digest, that protocols satisfy a stronger notion of security, and because there are no known efficient hashing schemes for general file-system operations.
  • Exemplary hashing schemes are discussed that satisfy three important properties: (i) they are specialized for verifying a rich set of file-system operations; (ii) they define the file-system digest so that it not only encodes information that provides proofs for verifying the results of queries, but also encodes information that provides proofs for verifying updates performed in the file system; and (iii) the proof of any operation has size that is logarithmic in the size n of the entire file system, and after any update the new digest can be computed at a cost logarithmic in n.
  • An exemplary hashing scheme is maintained at the authentication service module (AS) of the server. The following invariant is maintained: the client maintains the correct digest of the current file system (as if it was computed by the client).
  • any query operation can be verified by having the (untrusted) AS provide the client with the corresponding (verification) proof.
  • Any update operation can be verified by designing the hashing scheme to include information that can be used to check the file system's integrity after an update, and by having the AS provide the client with this information as a (consistency) proof for the verification algorithm verify.
  • the verification can be divided, conceptually, into two steps: (1) first, the provided information is itself verified to be authentic subject to the existing state (i.e., consistent digest before the update), and (2) second, the (authenticated) information is used by the client to verify the updated file system (i.e., after the update is performed) and compute the new state (digest), which is now consistent with this update since it is computed using authentic data.
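  • The two verification steps can be sketched on a plain Merkle tree, assuming the update replaces one data block and the consistency proof is the list of sibling hashes on that block's path. This is an illustrative stand-in, not the source's hashing scheme; `verify_update` and the labels `b"leaf"` and `b"node"` are hypothetical.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash length-prefixed parts so that part boundaries are unambiguous."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big") + part)
    return hasher.digest()


def root_from_path(block: bytes, index: int, siblings: list) -> bytes:
    """Recompute the root from a leaf block and its sibling hashes."""
    current = h(b"leaf", block)
    for sibling in siblings:
        if index % 2:
            current = h(b"node", sibling, current)
        else:
            current = h(b"node", current, sibling)
        index //= 2
    return current


def verify_update(state, index, old_block, new_block, siblings):
    """Return (accepted, new_state) for replacing old_block with new_block."""
    # Step 1: authenticate the proof against the digest held before the update.
    if root_from_path(old_block, index, siblings) != state:
        return False, state
    # Step 2: rerun the update locally on the now-authenticated path.
    return True, root_from_path(new_block, index, siblings)


# A four-block tree built by hand on the server side.
blocks = [b"b0", b"b1", b"b2", b"b3"]
leaves = [h(b"leaf", block) for block in blocks]
n01 = h(b"node", leaves[0], leaves[1])
n23 = h(b"node", leaves[2], leaves[3])
state = h(b"node", n01, n23)

# The server replaces block 2 and returns the siblings of its path as proof.
accepted, new_state = verify_update(state, 2, b"b2", b"new", [leaves[3], n01])
```

  Because step 2 reuses only hashes that were authenticated in step 1, the new digest is consistent with the update without the client ever seeing the other blocks.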
  • the consistency proof is such that the client is able to locally rerun the same update over the hashing scheme and the data structure and thus validate the new, updated state of the file system.
  • Proof Sketch of Security.
  • the exemplary authenticated storage schemes are based on the following general approach.
  • H corresponds to an underlying authenticated data structure ADS
  • One augments H to a new, augmented hashing scheme that additionally encodes in its (defining) digest the entire structural and balancing information that exists in ADS.
  • $h_v = h(h_{u_1}, \ldots, h_{u_l}, h(b_v, s_v))$
  • $b_v$ and $s_v$ describe all the balancing and, respectively, structural information about node v in the data structure.
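  • A minimal sketch of the augmented node hash follows, under stated assumptions: the hash values h_{u_1}, …, h_{u_l} are read as the hashes of the children of v, the balancing information b_v is modeled as a single integer rank, and the structural information s_v as the node name plus the ordered child names. None of these encodings are given in the source; `Node`, `structural_info` and `node_hash` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field


def h(*parts: bytes) -> bytes:
    """Hash length-prefixed parts so that part boundaries are unambiguous."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big") + part)
    return hasher.digest()


@dataclass
class Node:
    name: str
    rank: int = 0  # stands in for the balancing information b_v (assumption)
    children: list = field(default_factory=list)


def structural_info(node: Node) -> bytes:
    """s_v: here, the node name followed by the ordered child names."""
    names = [node.name] + [child.name for child in node.children]
    return "\x00".join(names).encode()


def node_hash(node: Node) -> bytes:
    """h_v = h(h_{u_1}, ..., h_{u_l}, h(b_v, s_v))."""
    child_hashes = [node_hash(child) for child in node.children]
    b_v = node.rank.to_bytes(4, "big")
    s_v = structural_info(node)
    return h(*child_hashes, h(b_v, s_v))


root = Node("/", rank=1, children=[Node("a"), Node("b")])
root_digest = node_hash(root)
```

  With this encoding, a change to the data, to the shape of the tree, or to the balancing fields all change the digest, which is what lets a proof carry authenticated structural information.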
  • the query types are set-membership or path property queries using the skip-list and the dynamic-tree data structure, respectively, and the corresponding hashing schemes of Goodrich1 and Goodrich2 (see below), respectively.
  • the consistency proof by the definition of the corresponding augmented hashing scheme H contains all the balancing and structural information that completely characterizes the changes in FS due to the update. Assuming that the state is consistent, the consistency proof coming from an honest server S will be verified, thus also providing verification of the balancing and structural information related to the update.
  • C is able to locally perform the correct update as if C had direct access to the entire, (correct) current file system FS.
  • C is able to correctly and consistently update the state s to s', which is simply the new digest according to H.
  • any query can be securely verified since the underlying hashing scheme is secure.
  • any malicious behavior by S will be rejected by the verification algorithm, assuming that finding collisions is computationally difficult.
  • Each entry of the map is a tuple (k, v), where k is a key and v is the value that corresponds to k; v can be a collection of objects as well.
  • the entries of the map are sorted according to their keys (e.g., by using a comparator).
  • the authenticated map data structure resides in the server. Using a hashing scheme designed over skip-lists, one can define the digest for the authenticated map, computed according to the tree structure of the skip-list (see Figure 2(a) as discussed below).
  • FIG. 2(a) depicts an exemplary skip-list hashing scheme 30 for verifying operations on a map data structure and FIG. 2(b) illustrates an exemplary consistency proof P 50 for the exemplary skip-list hashing scheme 30 of FIG. 2(a).
  • the exemplary skip-list hashing scheme 30 shown in FIG. 2(a) has a number of entries, each one having a tuple (k, v).
  • FIG. 2(a) illustrates the insertion of key 14.
  • the exemplary consistency proof P 50 shown in FIG. 2(b) is returned by S in response to the update operation (i.e., to insert key 14).
  • the proof P 50 contains all the hashing and structural information needed to verify the consistency of P 50 subject to the current digest (i.e., before the update) and to locally perform the update and generate a new digest corresponding to the updated skip list.
  • P contains the two keys, for example, succ(x) and pred(x), that are the successor and predecessor of x in the ordering of the keys.
  • P also contains all the necessary hashing information (e.g., hash values) that allows C to recompute the digest d_0 starting from succ(x) and pred(x) and hashing according to the hashing scheme that is used. Due to the collision-resistance property of the hash function, C can tell if the received path is the correct one. If P is verified, C verifies that key x is not in the directory. Also, C knows the position at which this file should be added.
  • P contains all the necessary structural information that enables C to locally perform the update in the hashing scheme that corresponds to the file insertion, by placing x between pred(x) and succ(x) and computing the new hash values for only those nodes of the skip list that need a new hash. Knowing the new hash values, C can compute the new digest d'_0, which is consistent with the insertion operation.
  • the key insertion (performed by S) can be verified in two steps: (1) first, path P is verified and then (2) it is used to locally perform the update and compute the new digest.
  • Lemma 3.1 There exists an authenticated storage scheme for operations on key-value pairs in a map that is based on an authenticated skip list, with the following expected complexity bounds:
  • the expected update (insertion and removal), query and verification time is O(log n) with high probability.
  • Update time is the time required by S to do the actual update
  • query time is the time S needs to compute the (consistency or verification) proof
  • verification time is the time that C needs in order to process the proof and validate or reject the query or the update. Note that for set-membership queries and updates (e.g., through which one can implement all file system operations) the size of a proof is asymptotically equal to the verification time; therefore, the verification time bounds will indirectly imply the size of the proof.
  • Let T be the tree that corresponds to the file system.
  • Each entry of the map corresponds to a node v of T and has the following format:
  • key(v) is the key of the specific entry, a unique id for each node of the file system (e.g., the i-node of the file system node, obtainable in UNIX by using the stat command);
  • key(parent) is the key that corresponds to the parent node of v in the file system
  • key(sibling) is the key of the node that corresponds to the sibling of v according to the order of their creation (e.g., the first child of a node is considered to be that node that was created most recently). Note that if a node v is the last node of the children list, then this field, for example, may be null;
  • key(backsibling) is the key of the node that corresponds to the sibling just before v according to the order of their creation. Note that if a node v is the first node of the children list, then this field, for example, may be null;
  • key(child) is the key of the node that corresponds to the first child of v in the above described order.
  • each query/update to the file system is mapped to a standard query/update operation in the authenticated map.
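  • The entry format above can be sketched as a small record type, assuming integer ids stand in for i-node numbers and that each directory's children are given in the order described (first child first). The `Entry` class and the helpers `build_entries` and `list_children` are hypothetical; a plain dictionary stands in for the authenticated map.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    key: int
    parent: Optional[int]
    sibling: Optional[int]      # next node in the children list, or None
    backsibling: Optional[int]  # previous node in the children list, or None
    child: Optional[int]        # first child, or None for a leaf


def build_entries(children: dict, root: int) -> dict:
    """Build one map entry per node from an ordered child-list description."""
    entries = {root: Entry(root, None, None, None, None)}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        entries[node].child = kids[0] if kids else None
        for i, kid in enumerate(kids):
            entries[kid] = Entry(
                key=kid,
                parent=node,
                sibling=kids[i + 1] if i + 1 < len(kids) else None,
                backsibling=kids[i - 1] if i > 0 else None,
                child=None,
            )
            stack.append(kid)
    return entries


def list_children(entries: dict, key: int) -> list:
    """Enumerate a directory by following child and then sibling pointers."""
    result = []
    current = entries[key].child
    while current is not None:
        result.append(current)
        current = entries[current].sibling
    return result


# Node 1 is the root with children 2, 3, 4; node 3 has a single child 5.
entries = build_entries({1: [2, 3, 4], 3: [5]}, root=1)
```

  Each pointer lookup in `list_children` corresponds to one query on the map, so listing a directory with t children costs a number of map queries proportional to t.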
  • Theorem 3.2 (Representation with Skip Lists) Assuming the existence of collision-resistant hash functions, there exists a secure and space-optimal authenticated storage scheme that is implemented with skip lists and achieves the following performance, where n is the size of the file-system:
  • Let Π be the path one wants to authenticate.
  • the id key(π_i) of the node π_i (e.g., this can be done by using the stat command in UNIX).
  • One issues the query contains(key(π_i)).
  • the authenticated query takes time O(k log n), where one issues k queries to the skip list.
  • For the operation cd(Π), only the path Π has to be authenticated, hence the bound O(k log n) follows.
  • the path Π is authenticated and it is checked to see if the field file of π_k equals the respective cryptographic hash of what is being read.
  • the operation ls(Π) is basically an authentication of the path Π, after which one has to follow sibling relations to check that what is being retrieved from the file system (in some order) by executing ls is equal to the authenticated information that one gets from the skip list. Hence one needs time O((k+t) log n).
  • For the operations mkdir(Π) and touch(Π), first the path π_1 π_2 … π_{k-1} is authenticated.
  • a new id x is created for the new node π_k (e.g., this can be done by actually creating the path in the file system and then calling stat to get the i-node) and then the pointers are updated accordingly. Then the record with key x is inserted, as created above (with the updated pointers). Since only a constant number of pointers are updated, the complexity bound follows. Similarly, for the operations write(Π) and rm(Π), the path Π is authenticated. Let x be the i-node of node π_k.
  • Another exemplary method of representing the file system using a skip list is the following. Instead of storing the i-node number for the key of a node v, one can use as key the name of the path from the root to node v (for example, the key for the file lying in /users/user/pub.txt will be the string "/users/user/pub.txt").
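  • A minimal sketch of the path-as-key representation, assuming a sorted Python list stands in for the ordered key set of the skip list. Because keys sharing a prefix are contiguous in lexicographic order, the entries below a directory form one range; the function name `list_directory` and the extra sample paths are hypothetical.

```python
import bisect


def list_directory(sorted_keys: list, directory: str) -> list:
    """Return the keys of the direct children of `directory`."""
    prefix = directory.rstrip("/") + "/"
    start = bisect.bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the contiguous range of keys under this directory
        rest = key[len(prefix):]
        if rest and "/" not in rest:  # skip deeper descendants
            result.append(key)
    return result


keys = sorted([
    "/users",
    "/users/user",
    "/users/user/pub.txt",
    "/users/user/sec.txt",
    "/usersx",
])
children = list_directory(keys, "/users/user")
```

  One design consequence worth noting in this sketch: the key of every descendant contains the directory's path, so the key of each entry below a directory depends on where that directory sits in the hierarchy.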
  • Let T be the (generally unbalanced) tree that represents the file system according to the directory hierarchy, where the topological (left-to-right) ordering of sibling nodes is also the lexicographical ordering of the corresponding files and directories.
  • the leaves of T are either files or empty directories.
  • the data structure is based on dynamic trees.
  • FIG. 3(a) shows an exemplary tree structure T 70 corresponding to the file system.
  • FIG. 3(b) depicts an exemplary tree of paths 90 corresponding to the exemplary tree structure T 70 shown in FIG. 3(a). Nodes that belong to the dashed paths are duplicated.
  • Dynamic Trees and the Tree of Paths. The construction of the dynamic tree data structure from a tree T is briefly described. Tree T is transformed to a tree of paths as follows. Paths in the tree of paths are defined by a path partition in the original file-system tree. T is a rooted tree and its edges are classified as being either solid or dashed according to their weight in T (e.g., size of the subtree in T rooted at the lowest node of the edge), such that any internal node has at most one child connected by a solid edge.
  • This edge classification partitions the nodes of the tree into solid paths connected with each other by dashed edges (see FIG. 3(a)). Every internal node v in T has at most one child u connected through a solid edge. If v has other children (through dashed edges), say nodes u_1, …, u_k, then the dashed path d(v) of v is a path of length k such that there is a one-to-one correspondence between edges (u_i, v) in T and nodes of d(v) and the ordering is preserved.
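  • The edge classification can be sketched as follows, assuming the weight of an edge is the size of the subtree below it and that the edge to the heaviest child of each node is the solid one. The exact rule is not spelled out in the source, so this "heaviest child" choice is an assumption; `subtree_sizes` and `classify_edges` are hypothetical names, and the input is assumed to be a tree in which every key of `children` is reachable from the root.

```python
def subtree_sizes(children: dict, root) -> dict:
    """Compute the number of nodes in the subtree rooted at each node."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))
    sizes = {}
    for node in reversed(order):  # descendants are processed before ancestors
        sizes[node] = 1 + sum(sizes[c] for c in children.get(node, []))
    return sizes


def classify_edges(children: dict, root):
    """Split edges (parent, child) into solid and dashed sets."""
    sizes = subtree_sizes(children, root)
    solid, dashed = set(), set()
    for parent, kids in children.items():
        if not kids:
            continue
        heavy = max(kids, key=lambda c: sizes[c])  # ties go to the first child
        for kid in kids:
            (solid if kid == heavy else dashed).add((parent, kid))
    return solid, dashed


tree = {"/": ["a", "b"], "a": ["a1", "a2", "a3"], "b": []}
solid, dashed = classify_edges(tree, "/")
```

  In this example the solid edges form the single path from "/" through "a" to "a1", and every other node hangs off that path by a dashed edge.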
  • the tree TOf paths is constructed by considering all solid and dashed paths defined for tree T and defining the parent-child relation according to their connectivity in the original tree T.
  • solid paths are the parent paths of the dashed paths they define in T and dashed paths d(v) are the parent paths of the solid paths whose nodes are descendants in T of node v.
  • each path in 𝒯 is represented as a biased binary tree (weight-balanced tree) using appropriate weights and if these individual trees are appropriately interconnected, then it is possible to obtain a final tree T that is balanced.
  • any two nodes in the original tree T are connected in the final tree T through a path of length logarithmic in |T|.
  • there are efficient algorithms in T for performing structural updates in the original tree; for instance, any subtree in T can change parent in time logarithmic in |T|.
  • the tree T is used as the representation of the file system and also as the structure that will be used as a hashing scheme for defining the digest of the entire file system.
  • This hashing scheme should be appropriately constructed so that it can be used to verify a broad class of query and update operations on the file system.
  • the hashing scheme is used for authentication of path properties in trees. This hashing scheme is defined over trees of the form of the final tree T and has the following important property: given two nodes in the original tree T, the hashing scheme can be used to efficiently authenticate any "property" of the path connecting the nodes in T.
  • This exemplary hashing scheme is extended to authenticate path properties not only for paths in the original tree T (i.e., properties of paths related to the parent-child relation), but also for dashed paths in the intermediate tree T (i.e., properties of paths related to siblings).
  • This extension is performed by including in the hashing scheme information that is associated with the nodes of dashed paths, i.e., information associated with the files and subdirectories of any directory.
  • the exemplary hashing scheme can be augmented to include structural and balancing information related to T: in such a case the hash value of any node in T includes, for example, its sibling rank and weight.
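One way to picture a hash value that also covers structural and balancing information is the sketch below. It is an assumption-laden illustration rather than the scheme of the embodiments: SHA-256 stands in for the collision-resistant hash function, fields are length-prefixed so that their boundaries are unambiguous, and the function name `node_hash` and its parameters are hypothetical.

```python
import hashlib
from typing import List


def node_hash(name: str, rank: int, weight: int, child_hashes: List[bytes]) -> bytes:
    """Hash a node together with its sibling rank, weight and children's hashes."""
    h = hashlib.sha256()
    for field in (name.encode("utf-8"), str(rank).encode(), str(weight).encode()):
        h.update(len(field).to_bytes(4, "big"))  # length prefix avoids ambiguous concatenation
        h.update(field)
    for child in child_hashes:  # children contribute in sibling order
        h.update(child)
    return h.digest()
```

Because rank and weight are hashed along with the name, a server that reorders siblings or misreports a subtree weight would produce a different value at the root, which the client can detect against its stored digest.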
  • each node v of the tree is related with a constant-size set of node attributes {N_1(v), . . . , N_k(v)}.
  • these can be the weight of v or other variables that one desires to relate with the node.
  • the set of these node attributes may be referred to as the node property 𝒩(v) of this node.
  • the node property 𝒩(v) of a node v is defined to contain at least two attributes: S(v) and C(v).
  • every path p is related with a set of path attributes {P_1(p), . . . , P_k(p)}. As non-limiting examples, these can be the length of a path or other variables that one would like to relate with this path.
  • the set of these path attributes may be referred to as the path property 𝒫(p) of this path.
  • Theorem 4.1 (Representation with Dynamic Trees) Assuming the existence of collision-resistant hash functions, there exists a secure, time-efficient and space-optimal authenticated storage scheme that is implemented with dynamic trees and achieves the following performance, where n is the size of the file system:
  • the authentication of any path Π is a query of the name of the path, namely a query of the property of the path, which according to Goodrich2 takes time O(log n+k).
  • For ls(Π), one queries for the name of the dashed path d(π_k) that corresponds to node π_k (the names of the children of π_k).
  • the query and verification time is O(log n+t+k).
  • For cd(Π), all one has to do is to query for the name of the path from π_1 to π_k (e.g., query for the attribute S(·)). This has query and verification time O(log n+k).
  • Operation mkdir(Π) corresponds first to a cd operation (to authenticate the path Π) and then to the series of update operations (e.g., newTree(), link()) in the tree. These operations take time O(log n+k), O(1) and O(log n) respectively. Hence, the total time is O(log n+k).
  • operation rmdir(Π) corresponds first to a cd operation and then to the series of update operations (e.g., cut(), destroyTree()) in the tree.
  • the exemplary authentication scheme can provide a verification proof for negative answers by proving the existence of the two neighboring sibling nodes in a dashed path where the error occurs. In essence, this is, again, a path-property of a special type.
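The negative-answer idea can be illustrated with a small check. The sketch assumes that the two sibling entries have already been authenticated against the digest, that each carries its sibling rank, and that names are compared lexicographically as in the ordering of T; the function name `proves_absence` is hypothetical, and boundary cases (a name that would sort before the first or after the last sibling) are omitted.

```python
from typing import Tuple

Sibling = Tuple[str, int]  # (name, sibling rank), assumed already authenticated


def proves_absence(left: Sibling, right: Sibling, target: str) -> bool:
    """Return True if left and right are adjacent siblings that bracket target."""
    (left_name, left_rank), (right_name, right_rank) = left, right
    adjacent = right_rank == left_rank + 1  # no sibling can sit between them
    bracketed = left_name < target < right_name  # target would have to sit between them
    return adjacent and bracketed
```

If both conditions hold, no entry named `target` can exist in that directory, since the sibling order is lexicographic and the two authenticated neighbors leave no room for it.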
  • In Table 2, a comparison of the three presented implementations (local skip lists, global skip lists and dynamic trees) for various file system operations is shown.
  • n is the size of the file system
  • C is the size of the children list
  • T is the subtree rooted on π_k. Note that the dynamic trees clearly outperform skip lists in comparatively "expensive" operations such as mkdir and mv.
  • the client should be able to verify the correctness of the operation and to update the digest of the whole data structure by using the consistency proof sent by the server whenever an update takes place. But how does the client update the digest after an update in the dynamic trees case?
  • the server sends a path P with hash values and other information related to the update.
  • the consistency proof is generally more complex.
  • One exemplary consistency proof contains all the structural and balancing information and all the node and path attributes of the nodes that should be accessed by the update algorithm in order to perform the operation. Note that this information is included in the hashing scheme.
  • the client has all the information that, once authenticated using the current state (digest) is required for locally performing the update and computing the new digest.
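A simplified picture of how a client could authenticate a proof against its current digest and then derive the new digest locally is sketched below. It uses a plain binary hash-tree path rather than the dynamic-tree hashing scheme of the embodiments, so it only illustrates the verify-then-recompute pattern; SHA-256 and the names `H`, `root_from_path` and `verify_and_update` are assumptions.

```python
import hashlib
from typing import List, Tuple

# Each proof step is (sibling hash, side), where side is "L" if the sibling is
# on the left of the running hash and "R" if it is on the right.
Proof = List[Tuple[bytes, str]]


def H(*parts: bytes) -> bytes:
    """Hash length-prefixed parts with SHA-256 (a stand-in hash function)."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big"))
        h.update(p)
    return h.digest()


def root_from_path(leaf: bytes, proof: Proof) -> bytes:
    """Recompute the root hash from a leaf value and the hashes along its path."""
    acc = H(leaf)
    for sibling, side in proof:
        acc = H(sibling, acc) if side == "L" else H(acc, sibling)
    return acc


def verify_and_update(digest: bytes, old_leaf: bytes, new_leaf: bytes, proof: Proof) -> bytes:
    """Check the proof against the stored digest, then return the new digest."""
    if root_from_path(old_leaf, proof) != digest:
        raise ValueError("proof does not match the stored digest")
    return root_from_path(new_leaf, proof)  # same path, updated leaf
```

The client stores only `digest`; the returned value replaces it after a verified update, so client storage stays constant while the proof grows with the length of the path.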
  • this consistency proof has logarithmic size: since all the update operations described above take logarithmic time, they cannot visit more than O(log n+
  • the server can send structural and hashing information of size O(log n+
  • communication assumptions could be made. Briefly consider how the exemplary protocols can be applied in two such restricted multi-user settings, by making a general — and easy to meet in practice — communication assumption. First, if one assumes that different users belong in the same organizational unit and access a remote file system through the same network infrastructure, then the exemplary protocols are applicable to a single designated client, trusted by the users, which serializes all users' operations, verifying each operation locally. For instance, this may be the setting in a networked file system where many users share and operate on files that can be physically stored in remote and untrusted storage units, yet all users' requests are serialized in the system's filer.
  • an exemplary verification client can constitute an add-on module of the hosting operating system kernel that runs in parallel with the system's filer. Also, if one assumes that different users are online from geographically remote - and even mobile - locations but can share a trusted storage of constant size, then the exemplary protocols are applicable by simply having the authentication digest of the file system be stored in this shared storage unit (and by possibly enforcing certain locking mechanisms for achieving concurrency), where verification of operations is performed locally at/by the users.
  • users may share a secure web page or a file that is stored at (e.g., a single, trusted node of) a p2p storage network (e.g., accessed even by an untrusted node of it and using secure p2p searching techniques).
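The shared-digest setting with a locking mechanism might look like the following in-process sketch. It is only an analogy for a trusted shared storage unit: a real deployment would use a remote trusted store, and the class name `SharedDigest` and its compare-and-swap style interface are assumptions, not part of the described protocols.

```python
import threading


class SharedDigest:
    """Constant-size trusted storage holding the current file system digest."""

    def __init__(self, digest: bytes) -> None:
        self._digest = digest
        self._lock = threading.Lock()

    def read(self) -> bytes:
        """Return the digest that a user should verify proofs against."""
        with self._lock:
            return self._digest

    def compare_and_swap(self, expected: bytes, new: bytes) -> bool:
        """Install new only if no other user changed the digest in the meantime."""
        with self._lock:
            if self._digest != expected:
                return False  # another user's update won; re-read and retry
            self._digest = new
            return True
```

A user would read the digest, verify the server's proof locally, compute the new digest, and call `compare_and_swap`; a `False` result signals that the operation must be re-verified against the newer digest.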
  • Athos can provide to the higher (or hosting) application complete information about the problematic operation and the current state of the file system in terms of its integrity.
  • Athos functionality can characterize the exact location in the file system where integrity was not verified and thus pinpoint which file or directory was maliciously (or accidentally) modified by the untrusted server or by the remote storage devices.
  • Athos can offer persistent authentication capabilities, where file-system operations or queries about past views of the file system can be issued and authenticated. This property of Athos may be significant, since it can be useful for supporting a secure audit of the entire outsourced file system.
  • the exemplary embodiments of the invention generally relate to interactions between a user and an untrusted server.
  • the exemplary embodiments of this invention may be implemented by one or more of the parties involved.
  • the user may comprise an electronic device or a portable electronic device.
  • Such an electronic device may itself comprise at least one data processor, at least one memory, a communication component (e.g., a transceiver), and a user interface comprising a user input (e.g., mouse, keyboard, keypad, joystick, touchscreen, touchpad) and a display device (e.g., display, monitor, screen, touchscreen, liquid crystal display).
  • the user may comprise a software program or a plug-in application attached to another program.
  • the server may comprise a web service running on a distributed collection of computers accessible via the internet.
  • a cryptographic component may be employed.
  • the cryptographic component may be a separate entity (e.g., an integrated circuit, an
  • a system implementing the exemplary embodiments of this invention may comprise a private network (e.g., local area network - LAN), a public network (e.g., a publicly available wireless local area network - WLAN), or the internet.
  • the exemplary embodiments of this invention may be carried out by computer software implemented by a data processor or by hardware, or by a combination of hardware and software.
  • the exemplary embodiments of this invention may be implemented by one or more integrated circuits.
  • FIG. 4 illustrates a simplified block diagram of various exemplary electronic devices that are suitable for use in practicing the exemplary embodiments of this invention.
  • FIG. 4 shows a system 400 having a client 402 and a server 412.
  • the client 402 has a data processor (DP) 404, a memory (MEM) 406 coupled to the DP 404 and a transceiver (TRANS) 408 coupled to the DP 404.
  • the TRANS 408 enables bidirectional communication with the server 412.
  • the MEM 406 stores a digest 410 in accordance with exemplary embodiments of the invention, as further described herein.
  • the client 402 may comprise any suitable electronic device.
  • the server 412 has a data processor (DP) 414, a memory (MEM) 416 coupled to the DP 414 and a transceiver (TRANS) 418 coupled to the DP 414.
  • the TRANS 418 enables bidirectional communication with the client 402.
  • the MEM 416 stores a file system (FS)
  • FS 420 and the AS 422 may be stored in or provided by separate components (e.g., two memories, two circuits, two integrated circuits, two processors).
  • the server 412 may comprise any suitable electronic device.
  • the MEMs 406, 416 may be of any type appropriate to the technical environment and may be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory and removable memory, as non-limiting examples.
  • the DPs 404, 414 may be of any type appropriate to the technical environment, and may encompass one or more of microprocessors, general purpose computers, special purpose computers and processors based on a multi-core architecture, as non-limiting examples.
  • Exemplary embodiments of the invention or various aspects thereof, such as the authentication service, as a non-limiting example, may be implemented as a computer program stored by the respective MEM 406, 416 and executable by the respective DP 404, 414.
  • a method for a client unit to interact with a file system stored by an untrusted server unit comprising: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure (501); issuing to the untrusted server unit an operation to be performed on the file system (502); and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit (503).
  • a method as above further comprising: verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
  • the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued only if the proof for the first operation is verified to be authentic.
  • the digest comprises a first digest and the operation comprises an update operation, the method further comprising: in response to determining that the proof is authentic, using the proof to compute a second digest and storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
  • the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • the memory comprises a shared storage unit that is also accessible by at least one other client unit.
  • a storage requirement of the client unit for storing the digest remains substantially the same over time.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the method is implemented by a computer program.
  • the method is implemented by a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising the steps of performing the method.
  • a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions to interact with a file system stored by an untrusted server unit, the actions comprising: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure; issuing to the untrusted server unit an operation to be performed on the file system; and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit.
  • a program storage device as above the actions further comprising: verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
  • a program storage device as in the previous wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued only if the proof for the first operation is verified to be authentic.
  • the actions further comprising: in response to an unsuccessful verification, obtaining information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system.
  • a program storage device as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, the actions further comprising: in response to determining that the proof is authentic, determining that the answer is authentic.
  • the digest comprises a first digest and the operation comprises an update operation, the actions further comprising: in response to determining that the proof is authentic, using the proof to compute a second digest and storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
  • a program storage device as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • a program storage device as in any above wherein the client unit arbitrates communication with the untrusted server unit on behalf of a plurality of users.
  • the memory comprises a shared storage unit that is also accessible by at least one other client unit.
  • a storage requirement of the client unit for storing the digest remains substantially the same over time.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • an apparatus comprising: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor.
  • the data processor is further configured to verify the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
  • the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed by the data processor for a second operation, wherein the second operation is issued by the data processor only if the proof for the first operation is verified to be authentic.
  • the data processor is further configured, in response to an unsuccessful verification, to obtain information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system.
  • an apparatus as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, wherein the data processor is further configured, in response to determining that the proof is authentic, to determine that the answer is authentic.
  • the digest comprises a first digest and the operation comprises an update operation
  • the data processor is further configured, in response to determining that the proof is authentic, to use the proof to compute a second digest and to store said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
  • the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • the digest comprises only the cryptographic hash value.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • An apparatus as in any above wherein the apparatus arbitrates communication with the untrusted server unit on behalf of a plurality of users.
  • the memory comprises a shared storage unit that is also accessible by at least one other apparatus.
  • a storage requirement of the memory for storing the digest remains substantially the same over time.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the apparatus comprises a client electronic device.
  • an apparatus comprising: means for storing a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; means for communicating; and means for issuing to the untrusted server unit via the means for communicating an operation to be performed on the file system, wherein the means for communicating is further for receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the apparatus.
  • the means for storing comprises a memory
  • the means for communicating comprises a transceiver
  • the means for issuing comprises a data processor.
  • the means for re-computing comprises the data processor.
  • An apparatus as in any above further comprising means for verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
  • the means for verifying comprises the data processor.
  • An apparatus as in any above, wherein the means for obtaining comprises the data processor.
  • an apparatus as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, wherein the means for verifying is further for, in response to determining that the proof is authentic, determining that the answer is authentic.
  • the digest comprises a first digest and the operation comprises an update operation
  • the apparatus further comprising: means for, in response to determining that the proof is authentic, using the proof to compute a second digest and means for storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
  • the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • the digest comprises only the cryptographic hash value.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • An apparatus as in any above wherein the apparatus arbitrates communication with the untrusted server unit on behalf of a plurality of users.
  • the memory comprises a shared storage unit that is also accessible by at least one other apparatus.
  • a storage requirement of the means for storing the digest remains substantially the same over time.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the apparatus comprises a client electronic device.
  • a method comprising: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system (601); receiving from a client unit an instruction to perform an operation on the file system (602); and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure (603).
  • the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
  • transmission of the proof to the client unit is performed by an authentication service stored in a second memory accessible by the untrusted server unit.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • a method as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the method is implemented by a computer program.
  • the method is implemented by a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising the steps of performing the method.
  • a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system; receiving from a client unit an instruction to perform an operation on the file system; and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
  • the actions further comprising: performing the operation on the stored file system.
  • a program storage device as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • a program storage device as in any above wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
  • transmission of the proof to the client unit is performed by an authentication service stored in a second memory accessible by the untrusted server unit.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • a program storage device as in any above wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • an apparatus comprising: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
  • the data processor is further configured to perform the operation on the file system.
  • An apparatus as in any above wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • transmission of the proof to the client unit is performed via an authentication service stored in a second memory accessible by the data processor.
  • the digest comprises only the cryptographic hash value.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the apparatus comprises an untrusted server.
  • an apparatus comprising: means for receiving from a client unit an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system; means for transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
  • the means for receiving comprises a receiver and the means for transmitting comprises a transmitter.
  • the means for receiving and the means for transmitting comprise a transceiver.
  • An apparatus as in any above further comprising: means for performing the operation on the file system.
  • An apparatus as in the previous, wherein the means for performing comprises a data processor.
  • An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
  • An apparatus as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
  • transmission of the proof to the client unit is performed via an authentication service stored in a means for storage accessible by the apparatus.
  • An apparatus as in any above further comprising: means for storing the file system.
  • An apparatus as in the previous, wherein the means for storing comprises a memory.
  • the digest comprises only the cryptographic hash value.
  • the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
  • the tree structure comprises a skip list or a dynamic tree.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
  • the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
  • the cryptographic hash value comprises a collision-resistant cryptographic hash value.
  • the apparatus comprises an untrusted server.
  • a system comprising: a client unit comprising: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor; and an untrusted server unit comprising: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, where
  • Some exemplary embodiments of the invention provide an efficient way to authenticate an outsourced untrusted file system.
  • skip lists and dynamic trees are applied and an efficient hashing scheme may be used to represent the entire file system with a small, constant size digest. That is, the client maintains a constant size state of the entire file system. This state is an efficient representation both of the contents of the files and the hierarchy of the file system. This allows for applications where low-computing power and/or low-storage devices (e.g., sensors, smartcards, portable storage devices such as flash cards) are used to access an outsourced file system in a secure way.
  • there is no authentication information on the client's side that is proportional to the size of the file system, something that is the case for most previous techniques.
  • the set of query and update operations on the file system can efficiently be authenticated.
  • the client authenticates these operations by receiving the verification or consistency proof from the server.
  • Common and important file system operations, such as cd and ls, can be authenticated in logarithmic time.
  • the exemplary embodiments of the invention are not limited solely to a UNIX system nor the identified UNIX commands, and may be utilized in conjunction with other suitable systems, commands or architectures.
  • any use of the terms "connected," "coupled" or variants thereof should be interpreted to indicate any such connection or coupling, direct or indirect, between the identified elements.
  • one or more intermediate elements may be present between the “coupled” elements.
  • the connection or coupling between the identified elements may be, as non-limiting examples, physical, electrical, magnetic, logical or any suitable combination thereof in accordance with the described exemplary embodiments.
  • the connection or coupling may comprise one or more printed electrical connections, wires, cables or any suitable combination thereof.
  • various exemplary embodiments of the invention can be implemented in different mediums, such as software, hardware, logic, special purpose circuits or any combination thereof.
  • some aspects may be implemented in software which may be run on a computing device, while other aspects may be implemented in hardware.

Abstract

In an exemplary embodiment, a client maintains a minimal cryptographic state ('digest') that consistently represents an outsourced file-system, against which file-system operations performed by the untrusted server can be verified to determine if they have been executed correctly. In one non-limiting example, a method for a client unit to interact with a file system stored by an untrusted server unit includes: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest includes a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure (501); issuing to the untrusted server unit an operation to be performed on the file system (502); and receiving a result and a proof in response to the operation, wherein the proof includes information that enables re-computation of the digest by the client unit (503).

Description

AUTHENTICATION FOR OPERATIONS OVER AN OUTSOURCED FILE SYSTEM STORED BY AN UNTRUSTED UNIT
TECHNICAL FIELD:
The exemplary embodiments of this invention relate generally to data authentication and, more specifically, relate to authentication for operations over an outsourced file system stored by an untrusted unit.
BACKGROUND:
Current trends in data-storage systems design are towards decentralized and networked architectures with minimal trust assumptions. In these settings, data security and correctness verification are necessary features for ensuring system trustworthiness. With the advancement of networking technologies and the development of storage networking protocols, networked file systems (e.g., NAS or SAN), where file storage is outsourced to remote untrusted storage devices, have emerged as a common practice, increasing the need for security.
One consideration is that of authenticating an outsourced file system in a setting where data resides at remote storage units of untrusted host machines, outside of any administrative control. It is generally desirable to efficiently (e.g., with logarithmic complexity) verify the integrity of a dynamic file system, namely to verify that its status is consistent with the history of file-system operations ordered by a client, and correctly detect any malicious access or data-retrieval patterns by the server. In particular, one goal is to verify the directory hierarchy of the file system, an important task, since, in many cases, the integrity of a file depends not only on its content, but also on its location in the file system. For example, the context of an .htaccess file depends on its location - its contents identify access policies, but its location is critical to identify the directories it protects.
One conventional technique is to have the client (which can abstract to an operating system (OS) kernel supporting many users) sign each file system update it makes in the outsourced file system (e.g., using a hashed message authentication code (HMAC) based on a key that it keeps secret from the server). This technique has some drawbacks, however. First, it allows for replay attacks, since determining file freshness is difficult with such a system. Second, this technique requires the client to sign every possible path in the directory hierarchy in order to be able to authenticate locations. This last requirement is especially inefficient, for example, when the client performs the directory operation mv that moves a large directory to a new location. Another technique is to assume that the outsourced file system is partially trustworthy or has some tamper-resistant trusted hardware as a part of its architecture (e.g., using trusted computing platforms). Such an assumption involves postulating that the networked file system is itself at least partially trusted, which is not much different from simply trusting the hosting server in the first place.
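The freshness problem described above can be illustrated with a minimal Python sketch. It is not taken from any particular system: the key, the paths, and the helper names `sign` and `check` are hypothetical, and the tag layout (path, a zero byte, then content) is an assumption made only for this illustration.

```python
import hashlib
import hmac

KEY = b"client-secret-key"  # hypothetical key, kept secret from the server


def sign(path: str, content: bytes) -> bytes:
    """Client-side tag that binds a file's content to its location."""
    message = path.encode() + b"\x00" + content
    return hmac.new(KEY, message, hashlib.sha256).digest()


def check(path: str, content: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(path, content), tag)


path = "/www/.htaccess"
old_record = (b"allow all", sign(path, b"allow all"))  # first write
new_record = (b"deny all", sign(path, b"deny all"))    # later overwrite

# The server answers a read with the stale record; the tag is still valid,
# so the client cannot tell that the file was overwritten since.
replay_accepted = check(path, *old_record)

# Binding the path into the tag does detect a relocated file, but it also
# means every tag under a directory must be recomputed when it is moved.
moved_accepted = check("/private/.htaccess", *new_record)
```

In this sketch the replayed record verifies while the relocated one does not, which mirrors the two drawbacks named above: no freshness guarantee, and a re-signing cost tied to the number of paths affected by a move.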
SUMMARY:
In one exemplary embodiment of the invention, a method for a client unit to interact with a file system stored by an untrusted server unit includes: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure; issuing to the untrusted server unit an operation to be performed on the file system; and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit.
In another exemplary embodiment of the invention, an apparatus includes: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor.
In a further exemplary aspect of the invention, a method includes: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system; receiving from a client unit an instruction to perform an operation on the file system; and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
In another exemplary aspect of the invention, an apparatus includes: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
BRIEF DESCRIPTION OF THE DRAWINGS:
The foregoing and other aspects of embodiments of this invention are made more evident in the following Detailed Description, when read in conjunction with the attached Drawing Figures, wherein:
FIG. 1 shows a schematic illustration of an exemplary authenticated storage model within which exemplary embodiments of the invention may be utilized;
FIG. 2(a) depicts an exemplary skip-list hashing scheme for verifying operations on a map data structure;
FIG. 2(b) illustrates an exemplary consistency proof P for the exemplary skip-list hashing scheme of FIG. 2(a);
FIG. 3(a) shows an exemplary tree structure T corresponding to an exemplary file system;
FIG. 3(b) depicts an exemplary tree of paths T" corresponding to the exemplary tree structure T shown in FIG. 3(a);
FIG. 4 illustrates a simplified block diagram of various exemplary electronic devices that are suitable for use in practicing the exemplary embodiments of this invention;
FIG. 5 depicts a flowchart illustrating one non-limiting example of a method for practicing the exemplary embodiments of this invention; and
FIG. 6 depicts a flowchart illustrating another non-limiting example of a method for practicing the exemplary embodiments of this invention.
DETAILED DESCRIPTION:
1.0 OTHER PREVIOUS AND RELATED WORK
Authenticated Data Storage Systems. Previous and related work on authenticated data storage considers integrity at the file or data block level. Most of the systems provide file integrity using authentication information at the client that is proportional to the size of the file system. The most efficient constructions generally involve the use of Merkle trees over the data blocks of a single file. The SUNDR system provides integrity protection by employing hash-tree schemes and digital signatures. However, it makes heavy use of signatures (e.g., every operation is signed). SUNDR does not assume the existence of a secure module (i.e., client) and operates in a fully-distributed setting; thus it suffers from consistency limitations inherent in this multi-client model. Similarly, the SiRiUS system stores a digital signature for each file. In another approach, an efficient authentication scheme for large, dynamic data sets using Galois/Counter Mode is described where a constant amount of memory is used. In a different approach, efficient protocols for proving availability of static data in remote untrusted storage units are presented. Finally, a recent technique for authenticated network storage proposes to use a Merkle tree as the underlying data structure; however, PKI is used and the hash root is outsourced to an external medium, raising communication as well as security issues. In another technique for authenticated storage, the trusted digest is remotely maintained by the client at the server side, assuming a trusted storage component.
Checking and Verification. Related to authenticated storage is memory checking, where a trusted checker checks the correctness of an untrusted memory, using cryptographic primitives and hashing primitives. The model of authenticated data structures studies cryptographic techniques for authenticating that data structures that reside at untrusted hosts operate reliably with respect to updates performed by the owner of the data and queries issued by the users of the data structure. Techniques are known for authenticating the results of different types of queries, including set-membership queries, SQL queries on databases, geometric queries, and XML queries. Although authenticated data structures are closely related to the authenticated storage problem, there exist important differences in the models used. Authenticated data structures use a three-party data verification model and the data source stores the entire data set. In contrast, outsourced data authentication (i.e., the authenticated storage problem) uses a two-party model and the data source stores only minimal information about the data, and consequently, data consistency and replay-attack safety are more challenging to meet.
1.1 INTRODUCTION
Previous work on authenticated data storage considers integrity only at the file (or data block) level and does not provide efficient verification of the complete view of a file system (for instance, of its directory structure).
In other research by Jammalamadaka et al., updates are very inefficient and difficult to handle - in particular, the update cost is proportional to the size of the entire file system.
See R. C. Jammalamadaka, R. Gamboni, S. Mehrotra, K. E. Seamons, and N. Venkatasubramanian. gVault: A gmail based cryptographic network file system. In Proc. Working Conference on Data and Applications Security (DBSEC), pages 161-176, 2007.
Herein, the problem of authenticated storage is considered, where one desires to outsource a file-system to an untrusted server and yet ensure the file-system's integrity. New exemplary architectures for authenticated outsourced storage are introduced. Using light-weight cryptographic primitives and efficient data structuring techniques, the exemplary authentication schemes allow a client to verify that the file-system is fully consistent with the history of updates or queries requested by the client. File-system operations are verified in time that is logarithmic in the size of the file-system using optimal storage complexity, constant storage overhead at the client and asymptotically no extra overhead at the server. In contrast to conventional schemes that provide integrity guarantees only at the file or data-block level, exemplary schemes described herein additionally verify the file-system directory structure, thus supporting the authentication of complex file-system operations (e.g., directory moves and navigation) and file-system meta-data (e.g., general directory attributes). The exemplary architecture achieves generality by being platform-independent, as well as usability by operating transparently for end-users. Various exemplary embodiments of the invention may be referred to herein as "Athos" (AuTHenticated Outsourced Storage), supporting an authenticated networked file system that allows for efficient verification of, for example, the integrity of contents and locations of files and directories.
Using lightweight cryptography (hashing), provably secure protocols (subject to the collision-resistance assumption) are constructed for authenticating file-system operations. At all times during its interaction with the untrusted server, the client maintains some minimal cryptographic state (a "digest") that consistently represents the file-system, against which any file-system operation performed by the server can be verified to determine whether it has been executed correctly. Computing this digest such that it can provide efficient verification algorithms for a variety of operations and yet guarantee file-system consistency is a challenging task. Using ideas from the domain of data authentication, an efficient data structuring technique is employed for representing an entire file system in a way that facilitates meeting verification and efficiency goals. Novel, exemplary techniques are presented for achieving consistency and file-system integrity verification: any update or query is validated at the client by having the server provide a succinct corresponding proof (e.g., through an authentication service module that runs in the untrusted memory and is, thus, also untrusted), which comprises, for example, partial data and hashes stored in the file-system data structure. Note that successful verification of a series of operations - both queries (e.g., ls operation) and updates (e.g., mv operation) - should guarantee not only that queries were answered correctly but also that updates were handled correctly. In such a manner, and unlike conventional techniques, one can achieve a proper notion of security that includes safety against replay attacks, for example.
1.2 CONTRIBUTIONS
With regard to verifying a file system of size n, various conventional techniques provide: (a) O(1) client storage, O(n) update complexity and no hierarchy authentication; (b) O(log n) client storage, O(log n) update complexity and no hierarchy authentication; (c) O(n) client storage, O(log n) update complexity and no hierarchy authentication; and/or (d) O(1) client storage, O(n) update complexity and hierarchy authentication. In contrast, Athos provides improvements over these techniques by enabling O(1) client storage, O(log n) update complexity and hierarchy authentication for verification purposes.
The exemplary embodiments of the invention further provide:
• Exemplary authentication schemes for efficiently providing not only integrity of the stored data, but also integrity of the file system structure. Accordingly, file system consistency is achieved, where the system's state can be verified to be in accordance with the series of operations performed on it.
• Athos achieves optimal storage usage and efficient verification checking. In particular, the client has only constant size storage overhead and the server has asymptotically no extra storage overhead. Any update and query operation in the file system is performed and verified in time that is generally sublinear in the file-system size and logarithmic when the operand of the operation is of constant size.
• Athos achieves generality by being agnostic of the specific implementation of the networked file system and by being platform-independent. It requires no modification in the file system modules. It is also compliant with file-systems that are shared by multiple users or that support data confidentiality.
2. MODEL AND DEFINITIONS
As mentioned before, it is generally desirable to authenticate both the contents and the tree structure of a file system that resides in an untrusted server and to validate the results of relevant queries. For example, suppose one is currently in the directory y/. It would be preferable to be able to verify the results of queries issued to the dynamically evolving file system, for instance, to derive authenticated answers for the following questions: Does file x exist in directory y/ (ls x)? What is the current working directory (pwd)? Or, list the files of directory y/ (ls). Also, one may wish to verify that updates are handled correctly by the remote server, including the following operations: create a new directory/file (mkdir/touch); remove an existing directory/file (rmdir/rm); and/or move an existing file or directory to a new location (mv).
FIG. 1 shows a schematic illustration of an exemplary authenticated data storage model 10 within which exemplary embodiments of the invention may be utilized. Owned by a client C 12 but hosted at a remote untrusted server S 14, a file system FS 16 evolves over time through a series of update and query operations, issued by C 12 and executed by S 14. At all times, C 12 stores a succinct state 20 (e.g., a digest of few bytes) of FS 16 that is consistent with the entire history of operations. Authentication of operations is performed through verification or consistency proofs that are provided to C 12 (along with any answers to the operations provided by FS 16) by an authentication service module 18 that runs in memory independently of FS 16 and is controlled by S 14. A proof is used by C 12 to verify the current operation and consistently update the state (digest). That is, the authentication service module 18 stores additional authentication information 22 about FS 16. The file system may be generated and queried through the series of update and query operations.
Consider the following as a non-limiting example of the potential operations involved. The client C 12 sends a query x 24 on the FS 16 to the untrusted server S 14. The server S 14 performs the query operation and obtains an answer a 26. The authentication service 18 generates a proof y 28. The answer a 26 and the proof y 28 are sent to the client C 12. Although shown in FIG. 1 as being sent separately, in other exemplary embodiments the answer a 26 and the proof y 28 are sent together (e.g., in a single message or transmission) from the server S 14 to the client C 12. Similarly, although shown in FIG. 1 as being received by two separate components, the query x 24 may be received once by the server S 14, with the server S 14 internally sending the query x 24 both to be processed (i.e., to obtain the answer a 26) and to generate the proof y 28. The specific functionality of the query x 24, the answer a 26 and the proof y 28 is explained in further detail below.
If O is the set of operations supported over the file system FS, an exemplary communication protocol is as follows:
1. Client C keeps state information s and issues a query or update operation o ∈ O to the server S.
2. Server S performs the query or update operation o by accordingly answering the query (i.e., obtaining an answer) or updating the FS to a new version FS', and by running an authentication service (AS), S generates a verification proof or, respectively, a consistency proof π (generally referred to herein as a "proof"), which is returned to client C along with the result p of the operation; p is the corresponding answer if operation o is a query, or the empty string ⊥ otherwise (e.g., if operation o is an update operation). This set of operations can be represented as a certify algorithm: π ← certify(o, FS, FS', p).
3. Client C runs a verification algorithm which takes as input the current state s, the operation o along with its result p, and the corresponding (consistency or verification) proof π and either accepts or rejects the input. If the input is accepted (i.e., verified), the state s is appropriately updated to state s', where s' = s if o is a query operation (i.e., no change in the state s) or s' ≠ s otherwise (i.e., a change in the state s, for example, if the operation is an update operation). This set of operations can be represented as a verify algorithm: {(yes, s'), (no, ⊥)} ← verify(s, p, π).
If the input is rejected (i.e., the verification fails), the state s is not updated to state s'. In further exemplary embodiments, an error message or other indication of the failed verification may be output.
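The three protocol steps above can be sketched in Python as follows. This is only an illustration of the message flow, under loud assumptions: the file system is a flat dictionary from path to content, the digest is a naive hash over all entries, and the proof is a full snapshot of the old file system, so verification here costs time linear in the file-system size rather than the logarithmic cost the exemplary schemes target. The operation encoding (tuples such as `("write", path, data)`) is hypothetical, and `verify` receives the operation o explicitly, as the prose of step 3 describes.

```python
import hashlib
import json


def digest(fs: dict) -> str:
    """Naive state s: a hash over every (path, content) pair."""
    blob = json.dumps(sorted(fs.items())).encode()
    return hashlib.sha256(blob).hexdigest()


def operate(o: tuple, fs: dict):
    """Perform o on fs and return (fs', p); p is None (the empty result) for updates."""
    kind, path, *rest = o
    if kind == "read":
        return fs, fs.get(path)
    new_fs = dict(fs)
    if kind == "write":
        new_fs[path] = rest[0]
    elif kind == "rm":
        new_fs.pop(path, None)
    return new_fs, None


def certify(o: tuple, fs: dict, fs_new: dict, p):
    """Server side: the (deliberately naive) proof is a copy of the old file system."""
    return dict(fs)


def verify(s: str, o: tuple, p, proof: dict):
    """Client side: accept with the new state s', or reject with (no, None)."""
    if digest(proof) != s:              # proof must match the trusted state
        return ("no", None)
    fs_new, expected = operate(o, proof)  # rerun the operation locally
    if p != expected:                   # the claimed result must agree
        return ("no", None)
    return ("yes", digest(fs_new))


server_fs = {}
s = digest(server_fs)                   # state for the empty file system
transcript = []
for o in [("write", "/a.txt", "hello"), ("read", "/a.txt")]:
    new_fs, p = operate(o, server_fs)
    proof = certify(o, server_fs, new_fs, p)
    verdict, s_new = verify(s, o, p, proof)
    transcript.append(verdict)
    if verdict == "yes":
        s, server_fs = s_new, new_fs

# A server that returns a wrong answer to a query is rejected.
forged = verify(s, ("read", "/a.txt"), "evil", dict(server_fs))
```

Note that the state is unchanged after the query and changes after the update, matching the s' = s and s' ≠ s cases in step 3.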
The above protocol and pair of algorithms (certify, verify) may be considered an authenticated storage scheme. The security requirement such a scheme should satisfy expresses the intuitive property that the verification performed at C is a reliable test for the file system's integrity. Let operate(·,·) be the algorithm that, given the current file system FS and an operation o ∈ O, performs o and updates the file system to FS'. One can write: (FS', p) ← operate(o, FS) (p = ⊥ for updates and FS' = FS for queries). State s is consistent with FS for a series of operations τ on FS' if s and FS have been computed by running algorithms operate, certify and verify sequentially for all operations in series τ starting from FS'. Security is defined as follows.
Definition 2.1 (Security for authenticated storage). An authenticated storage scheme (certify, verify) (with security parameter K) is secure if, for any series of operations τ and a state s that is consistent with file system FS for τ on an initially empty file system, the following requirements are satisfied:
Correctness. It holds that (yes, s') ← verify(s, p, certify(o, FS, operate(o, FS))). That is, if the new operation o is performed correctly and the proof is generated using algorithm certify, then the verification algorithm accepts and computes the new state s' that is consistent with the new file system.
Consistency. For any polynomial-time adversary A, having oracle access to algorithms certify and verify, that on input file system FS and operation o produces proof π and result p, whenever (yes, s') ← verify(s, p, π), then the probability that either (FS', p) ← operate(o, FS) does not hold or s' is not consistent with FS' for operation o on FS is negligible in the security parameter K. That is, assuming a polynomially bounded adversary that observes a polynomial number of protocol invocations and then produces a pair of (verification or consistency) proof π and result p, if p and π for the new operation o are accepted by the verification algorithm, then for all but negligible probability the operation has been performed correctly and the new state is consistent with the new file system.
Starting from an initially empty set and using a secure authenticated storage scheme and the appropriate series of updates, client C is able to "export" any file system to server S, such that C has a consistent state with the current file system. Therefore, the file system is consistent with the history of updates and all future operations will be verified. With respect to efficiency, an authenticated storage scheme is time-efficient if the verification time is sub-linear in the file-system size |FS|. An authenticated storage scheme is space-efficient if the state stored by the client C is sub-linear in |FS| or space-optimal if the state is of constant size. In Section 4 below, an exemplary time-efficient, space-optimal and secure authenticated storage scheme is presented for a rich set of operations on an outsourced file system.
3. EFFICIENT AUTHENTICATED STORAGE
As a non-limiting example, to achieve the efficiency and security goals, one may design an authenticated storage scheme where the state kept by the client is simply a hash value (e.g., similar in principle to memory checking) that is produced using a hashing scheme over the underlying data set. A hashing scheme for a certain query type Q describes a systematic method for computing a digest from an underlying data set (e.g., a file system) by hierarchically applying a cryptographic hash function (e.g., that is collision-resistant) over data objects and previously produced hash values. Hashing schemes produce digests satisfying an important property: answers to queries in Q on the data set define sequences of hash values produced by the hashing scheme that can serve as proofs of the answers' correctness, subject to the (correct and authentic) data digest. The inspiration stems from authenticated data structures, where a data structure produced by a trusted source is replicated to an untrusted server for answering users' queries. Using authenticated (e.g., signed) digests, query verification is possible at/by the users. Conceptually, the exemplary authenticated storage models presented herein can be seen as the model of authenticated data structures, where the source and the user coincide. But the relation stops here, because exemplary models for the invention require that the client stores only a single digest, that protocols satisfy a stronger notion of security, and because there are no known efficient hashing schemes for general file-system operations.
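The notion of a hashing scheme described above, in which a digest is computed hierarchically and an answer is proven by a sequence of hash values, can be sketched with a plain binary hash tree. This is a generic Merkle-tree illustration and not the skip-list or dynamic-tree scheme of the exemplary embodiments. The function names, the "leaf" and "node" domain-separation labels, and the padding rule that duplicates the last entry of an odd level are all assumptions chosen for brevity.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Collision-resistant hash over length-prefixed parts (SHA-256 assumed)."""
    m = hashlib.sha256()
    for part in parts:
        m.update(len(part).to_bytes(4, "big"))
        m.update(part)
    return m.digest()


def build_levels(leaves: list) -> list:
    """Return every level of the tree; levels[-1][0] is the digest.

    Expects a non-empty list. Odd levels are padded by repeating the last
    hash, a simplification that a production scheme would avoid.
    """
    level = [h(b"leaf", x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(b"node", level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: list, index: int) -> list:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root: bytes, leaf: bytes, proof: list) -> bool:
    """Recompute the digest from the leaf and the proof, then compare."""
    acc = h(b"leaf", leaf)
    for sibling, sibling_is_left in proof:
        acc = h(b"node", sibling, acc) if sibling_is_left else h(b"node", acc, sibling)
    return acc == root


files = [b"/a:alpha", b"/b:beta", b"/c:gamma", b"/d:delta", b"/e:epsilon"]
levels = build_levels(files)
root = levels[-1][0]            # the only value the client needs to keep
proof = prove(levels, 4)
ok = verify(root, files[4], proof)
bad = verify(root, b"/e:tampered", proof)
```

The proof contains one hash per tree level, which is where the logarithmic proof size comes from, while the client-side state is the single root hash.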
Exemplary hashing schemes are discussed that satisfy three important properties: (i) they are specialized for verifying a rich set of file-system operations; (ii) they define the file-system digest so that it not only encodes information that provides proofs for verifying the results of queries, but also encodes information that provides proofs for verifying updates performed in the file system; and (iii) the proof of any operation has size that is logarithmic in the size n of the entire file system and after any update the new digest can be computed with logarithmic in n cost. An exemplary hashing scheme is maintained at the authentication service module (AS) of the server. The following invariant is maintained: the client maintains the correct digest of the current file system (as if it was computed by the client). This way, any query operation can be verified by having the (untrusted) AS provide the client with the corresponding (verification) proof. Any update operation can be verified by designing the hashing scheme to include information that can be used to check the file system's integrity after an update, and by having the AS provide the client with this information as a (consistency) proof for the verification algorithm verify. In some exemplary embodiments, the verification can be divided, conceptually, into two steps: (1) first, the provided information is itself verified to be authentic subject to the existing state (i.e., consistent digest before the update), and (2) second, the (authenticated) information is used by the client to verify the updated file system (i.e., after the update is performed) and compute the new state (digest), which is now consistent with this update since it is computed using authentic data. In essence, the consistency proof is such that the client is able to locally rerun the same update over the hashing scheme and the data structure and thus validate the new, updated state of the file system.
Proof Sketch of Security.
The exemplary authenticated storage schemes are based on the following general approach. Given a secure hashing scheme H for a specific query type Q, such that H corresponds to an underlying authenticated data structure ADS, one augments H to a new hashing scheme H', such that H' additionally encodes in its (defining) digest the entire structural and balancing information that exists in ADS. In particular, if the hash value h_v corresponding to node v of the data structure is computed as h_v = h(h_{u_1}, ..., h_{u_k}) in H, then in H' one defines h_v = h(h_{u_1}, ..., h_{u_k}, h(b_v, s_v)), where b_v and s_v describe all the balancing and structural information, respectively, about node v in the data structure. In the exemplary constructions, the query types are set-membership and path-property queries, using the skip-list and the dynamic-tree data structures and the corresponding hashing schemes of Goodrich1 and Goodrich2 (see below), respectively.
Goodrich1: M. T. Goodrich, R. Tamassia, and A. Schwerin. Implementation of an authenticated dictionary with skip lists and commutative hashing. In Proc. 2001 DARPA Information Survivability Conference and Exposition, volume 2, pages 68-82, 2001.
Goodrich2: M. T. Goodrich, R. Tamassia, N. Triandopoulos, and R. Cohen. Authenticated data structures for graph and geometric searching. In Proc. RSA Conference, Cryptographers' Track, volume 2612 of LNCS, pages 295-313. Springer, 2003.
Given the augmented hashing trees, security is proven as follows. Starting from the (known) state corresponding to the empty file system, one inductively shows that after any update on the file system the client C updates its state s consistently for the new update on the currently existing file system FS. For both data structures used in the exemplary schemes, the consistency proof by the definition of the corresponding augmented hashing scheme H contains all the balancing and structural information that completely characterizes the changes in FS due to the update. Assuming that the state is consistent, the consistency proof coming from an honest server S will be verified, thus also providing verification of the balancing and structural information related to the update. (This also verifies that the previously issued update has been executed correctly by S.) Thus, C is able to locally perform the correct update as if C had direct access to the entire, (correct) current file system FS. In such a manner, C is able to correctly and consistently update the state s to s', which is simply the new digest according to H. Given this invariant, any query can be securely verified since the underlying hashing scheme is secure. Additionally, any malicious behavior by S will be rejected by the verification algorithm, assuming that finding collisions is computationally difficult.
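As a rough illustration of the augmentation described above, the following Python sketch contrasts a node hash in H with its counterpart in H', which additionally commits to h(b_v, s_v). The function names, the choice of SHA-256, the length-prefixed encoding, and the byte-string encodings of b_v and s_v are all assumptions made for illustration; they are not taken from the exemplary embodiments.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash a sequence of byte strings (SHA-256 is an illustrative choice)."""
    m = hashlib.sha256()
    for p in parts:
        # A length prefix keeps ("ab", "c") distinct from ("a", "bc").
        m.update(len(p).to_bytes(4, "big"))
        m.update(p)
    return m.digest()


def plain_hash(child_hashes):
    """h_v in scheme H: a hash over the children's hash values only."""
    return h(*child_hashes)


def augmented_hash(child_hashes, balancing: bytes, structural: bytes):
    """h_v in scheme H': the children's hashes plus h(b_v, s_v)."""
    return h(*child_hashes, h(balancing, structural))
```

Under these assumptions, two nodes with identical children but different balancing or structural information receive different hash values in H', so a change in structure is reflected in the digest.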
3.1 OUTSOURCED SKIP LIST
To give some intuition of the above discussion, consider an exemplary special case where one wishes to implement an authenticated map. This authentication functionality on the map data structure will also be a core authentication tool for verifying file system operations. Each entry of the map is a tuple (k, v), where k is a key and v is the value that corresponds to k; v can be a collection of objects as well. The entries of the map are sorted according to their keys (e.g., by using a comparator). The authenticated map data structure resides in the server. Using a hashing scheme designed over skip-lists, one can define the digest for the authenticated map, computed according to the tree structure of the skip-list (see Figure 2(a) as discussed below).
FIG. 2(a) depicts an exemplary skip-list hashing scheme 30 for verifying operations on a map data structure and FIG. 2(b) illustrates an exemplary consistency proof P 50 for the exemplary skip-list hashing scheme 30 of FIG. 2(a).
The exemplary skip-list hashing scheme 30 shown in FIG. 2(a) has a number of entries, each one having a tuple (k, v). In particular, FIG. 2(a) illustrates the insertion of key 14.
The exemplary consistency proof P 50 shown in FIG. 2(b) is returned by S in response to the update operation (i.e., to insert key 14). The proof P 50 contains all the hashing and structural information needed to verify the consistency of P 50 subject to the current digest (i.e., before the update) and to locally perform the update and generate a new digest corresponding to the updated skip list.
Let d_0 be the initial digest stored at the client C, which is consistent with the current state of the skip list. Suppose now that C wants to insert a new key x (e.g., key 14 as shown in FIG. 2(a)). The server S returns to C a consistency proof (e.g., proof P in FIG. 2(b)) that consists of the search path P in the unsuccessful search for x before the update. Path P is related to the key insertion, satisfying the following two exemplary properties, which in turn imply the security of the scheme:
• P contains the two keys, for example, succ(x) and pred(x), that are the successor and predecessor of x in the ordering of the keys. P also contains all the necessary hashing information (e.g., hash values) that allows C to recompute the digest d_0 starting from succ(x) and pred(x) and hashing according to the hashing scheme that is used. Due to the collision-resistance property of the hash function, C can tell if the received path is the correct one. If P is verified, C verifies that key x is not in the directory. Also, C knows the position at which this file should be added.
• P contains all the necessary structural information that enables C to locally perform the update in the hashing scheme that corresponds to the file insertion, by placing x between pred(x) and succ(x) and computing the new hash values for only those nodes of the skip list that need a new hash. Knowing the new hash values, C can compute the new digest d'_0, which is consistent with the insertion operation.
In such a manner, the key insertion (performed by S) can be verified in two steps: (1) first, path P is verified and then (2) it is used to locally perform the update and compute the new digest. Using results on the complexity of authenticated skip-lists and the above exemplary protocol, one has the following:
Lemma 3.1 There exists an authenticated storage scheme for operations on key-value pairs in a map that is based on an authenticated skip list, with the following expected complexity bounds:
1. The expected update (insertion and removal), query and verification time is O(log n) with high probability.
2. The expected size of the consistency and verification proof (communication cost) is O(log n) with high probability.
Update time is the time required by S to do the actual update, query time is the time S needs to compute the (consistency or verification) proof, and verification time is the time that C needs in order to process the proof and validate or reject the query or the update. Note that for set-membership queries and updates (e.g., through which one can implement all file system operations) the size of a proof is asymptotically equal to the verification time; therefore, the verification time bounds will indirectly imply the size of the proof.
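The two-step check described above (verify the proof against the stored digest, then reuse it to compute the new digest) can be sketched in Python. This sketch deliberately swaps in a simpler hashing scheme, a binary Merkle tree over a power-of-two number of leaves, in place of the skip-list hashing scheme, and it updates an existing leaf rather than inserting a key. All function names and the use of SHA-256 are assumptions for illustration.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big"))  # length-prefix each part
        m.update(p)
    return m.digest()


def build_levels(leaves):
    """Server side: all tree levels, leaf hashes first, root last."""
    levels = [[h(x) for x in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def prove(levels, index):
    """Server side: sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def root_from(leaf, index, proof):
    """Recompute the root from one leaf value and its sibling hashes."""
    node = h(leaf)
    for sibling in proof:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


def verify_and_update(digest, index, old_leaf, new_leaf, proof):
    """Client side: step (1) check the proof, step (2) derive the new digest."""
    if root_from(old_leaf, index, proof) != digest:
        raise ValueError("consistency proof rejected")
    return root_from(new_leaf, index, proof)
```

The client stores only `digest`; the proof has one hash per tree level, so in this simplified setting its size is logarithmic in the number of leaves.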
3.2 AUTHENTICATED FILE SYSTEM USING AN OUTSOURCED SKIP LIST
Below is presented an exemplary authenticated file system implemented with a skip list. Let T be the tree that corresponds to the file system. Each entry of the map corresponds to a node v of T and has the following format:
(key(v), [name,file,key(parent),key(sibling),key(backsibling),key(child)])
where the above values have the following meanings:
• key(v) is the key of the specific entry, a unique id for each node of the file system (e.g., the i-node of the file system node, obtainable in UNIX by using the stat command);
• name is the actual name of the node of the file system;
• file is a hash of the file represented by v (e.g., if v is a directory node then file = null), which may be computed, for example, with SHA-1;
• key(parent) is the key that corresponds to the parent node of v in the file system;
• key(sibling) is the key of the node that corresponds to the sibling of v according to the order of their creation (e.g., the first child of a node is considered to be that node that was created most recently). Note that if a node v is the last node of the children list, then this field, for example, may be null;
• key(backsibling) is the key of the node that corresponds to the sibling just before v according to the order of their creation. Note that if a node v is the first node of the children list, then this field, for example, may be null;
• key(child) is the key of the node that corresponds to the first child of v in the above described order.
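The entry format listed above can be sketched as a small Python record, together with the constant number of pointer updates needed when a new node becomes the first child of a directory. The class name, the helper names, and the use of a plain dict in place of the authenticated map are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    name: str
    file: Optional[bytes]        # hash of the file contents; None for a directory
    parent: Optional[int]        # key(parent)
    sibling: Optional[int]       # key(sibling): next node in the children list
    backsibling: Optional[int]   # key(backsibling): previous node in the list
    child: Optional[int]         # key(child): first (most recently created) child


def add_child(fs: Dict[int, Entry], parent_key: int, key: int,
              name: str, file_hash: Optional[bytes] = None) -> None:
    """Insert a node as the new first child; only a few pointers change."""
    parent = fs[parent_key]
    old_first = parent.child
    fs[key] = Entry(name, file_hash, parent_key, old_first, None, None)
    if old_first is not None:
        fs[old_first].backsibling = key
    parent.child = key


def children(fs: Dict[int, Entry], key: int):
    """List a directory by following sibling pointers from key(child)."""
    names, cur = [], fs[key].child
    while cur is not None:
        names.append(fs[cur].name)
        cur = fs[cur].sibling
    return names
```

In an actual deployment each read or write of `fs[...]` would correspond to an authenticated query or update on the map; the dict here only shows the pointer structure.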
Below, and as a non-limiting example, each query/update to the file system is mapped to a standard query/update operation in the authenticated map. Let Π = π_1π_2...π_k (and, respectively, for the mv operation, Π' = π'_1π'_2...π'_{k'}) be a path in the file system.
Theorem 3.2 (Representation with Skip Lists) Assuming the existence of collision-resistant hash functions, there exists a secure and space-optimal authenticated storage scheme that is implemented with skip lists and achieves the following performance, where n is the size of the file system:
• The authentication of any path Π takes t(Π) = O(k log n) query/verification time.
• Query operations cd(Π), read(Π) and update operations write(Π), rm(Π), mkdir(Π), and touch(Π) take t(Π) query/verification and update/query/verification time, respectively.
• Query operation ls(Π) takes t(Π) + O(ℓ log n) = O((k + ℓ) log n) query/verification time, where ℓ is the size of the children list.
• Update operation rmdir(Π) takes t(Π) + O(|T| log n) = O((k + |T|) log n) update/query/verification time, where T is the subtree rooted at π_k.
• Update operation mv(Π, Π') takes t(Π) + t(Π') = O((k + k') log n) update/query/verification time.
Proof of Theorem 3.2.
Proof: Let Π be the path one wants to authenticate. Suppose one knows the id key(π_k) of the node π_k (e.g., this can be done by using the stat command in UNIX). One issues the query contains(key(π_k)) and gets back the values associated with key(π_k). By following parent relations and checking that the field name matches the name of the corresponding node of the path Π, the authenticated query takes time O(k log n), since one issues k queries to the skip list. For the operation cd(Π), only the path Π has to be authenticated, hence the bound follows. For the operation read(Π), the path Π is authenticated and it is checked whether the field file of π_k equals the cryptographic hash of what is being read. The operation ls(Π) is basically an authentication of the path Π, after which one follows sibling relations to check that what is retrieved from the file system (in some order) by executing ls is equal to the authenticated information obtained from the skip list. Hence one needs time O((k + ℓ) log n), where ℓ is the size of the children list. For the operations mkdir(Π) and touch(Π), first the path π_1π_2...π_{k-1} is authenticated. Then a new id x is created for the new node π_k (e.g., this can be done by actually creating the path in the file system and then calling stat to get the i-node) and the pointers are updated accordingly. Then the record with key x is inserted, as created above (with the updated pointers). Since only a constant number of pointers are updated, the complexity bound follows. Similarly, for the operations write(Π) and rm(Π), the path Π is authenticated. Let x be the i-node of node π_k. For the case of rm, one simply removes key x and updates the pointers: the sibling field of the predecessor of x is set to the successor of x (and not x), and an analogous update is performed for the field backsibling. For the case of write, key x is reinserted with a different file field. Operation rmdir(Π) can be viewed as |T| operations rm. Hence the complexity bound follows. For operation mv(Π, Π'), both paths Π and Π' are authenticated and then keys π_k and π'_{k'} are reinserted with different parent and child relations.
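The path-authentication step of the proof (k lookups, each following a parent relation and checking the name field) can be sketched as follows. The function `lookup` is a hypothetical stand-in for an authenticated contains(key) query against the skip list; its signature and return shape are assumptions for illustration.

```python
def authenticate_path(lookup, leaf_key, names):
    """Check that walking parent pointers from leaf_key spells out `names`.

    `names` lists the path components root first. `lookup(key)` stands in
    for an authenticated contains(key) query and returns (name, parent_key);
    in the skip-list scheme each such call would cost O(log n).
    """
    key = leaf_key
    for expected in reversed(names):
        if key is None:
            return False          # ran past the root too early
        name, parent = lookup(key)
        if name != expected:
            return False          # name field does not match the path
        key = parent
    return key is None            # the walk must end exactly at the root
```

A path with k components triggers exactly k lookups when it is valid, which matches the O(k log n) bound stated for the local representation.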
Another exemplary method of representing the file system using a skip list is the following. Instead of storing the i-node number for the key of a node v, one can use as key the name of the path from the root to node v (for example, the key for the file lying in /users/user/pub.txt will be the string "/users/user/pub.txt"). In such a manner, "global" information is stored, while in the previous representation "local" information was stored. This solution yields better complexity bounds for the path authentication (which is now t(Π) = O(log n + |Π|)). However, update operation mv(Π, Π') takes update/query/verification time [equation image not recovered from the source; the surviving text ends in "log n)"], where T is the subtree rooted at π_{|Π|}. This representation is suitable for cases where the majority of the operations are file system navigations and move operations are less frequent.
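To see why mv is comparatively costly when full path names are used as keys, consider the following Python sketch, in which moving a directory forces every entry in its subtree to be re-keyed. The function name and the plain dict standing in for the authenticated map are assumptions for illustration; in the skip-list scheme each re-keyed entry would be a separate authenticated removal and insertion.

```python
def move_subtree(entries: dict, old_prefix: str, new_prefix: str) -> int:
    """Re-key every entry at or below old_prefix; return how many were moved."""
    moved = [k for k in entries
             if k == old_prefix or k.startswith(old_prefix + "/")]
    for k in moved:
        # The new key keeps the part of the path below the moved directory.
        entries[new_prefix + k[len(old_prefix):]] = entries.pop(k)
    return len(moved)
```

The returned count equals the number of nodes in the moved subtree, which is the quantity |T| appearing in the discussion above.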
4. MORE EFFICIENT AUTHENTICATED STORAGE
The below descriptions provide exemplary embodiments of the invention that enable more efficient authenticated storage utilizing the model described above in Section 3.
Let T be the (generally unbalanced) tree that represents the file system according to the directory hierarchy, where the topological (left-to-right) ordering of sibling nodes is also the lexicographical ordering of the corresponding files and directories. The leaves of T are either files or empty directories. In one exemplary embodiment, T (and, essentially, the file system) is transformed into a new data structure, which is essentially a tree 𝒯 of paths (see FIG. 3(b), as described below). In this exemplary embodiment, the data structure is based on dynamic trees.
FIG. 3(a) shows an exemplary tree structure T 70 corresponding to the file system. FIG. 3(b) depicts an exemplary tree of paths 𝒯 90 corresponding to the exemplary tree structure T 70 shown in FIG. 3(a). Nodes that belong to the dashed paths are duplicated.
Dynamic Trees and 𝒯. The construction of the dynamic tree data structure from a tree T is briefly described. Tree T is transformed to a tree of paths 𝒯 as follows. Paths in 𝒯 are defined by a path partition in the original file-system tree. T is a rooted tree and its edges are classified as being either solid or dashed according to their weight in T (e.g., size of the subtree in T rooted at the lowest node of the edge), such that any internal node has at most one child connected by a solid edge. This edge classification partitions the nodes of the tree into solid paths connected with each other by dashed edges (see FIG. 3(a)). Every internal node v in T has at most one child u connected through a solid edge. If v has other children (through dashed edges), say nodes u_1, ..., u_k, then the dashed path d(v) of v is a path of length k such that there is a one-to-one correspondence between edges (u_i, v) in T and nodes of d(v) and the ordering is preserved. The tree 𝒯 of paths is constructed by considering all solid and dashed paths defined for tree T and defining the parent-child relation according to their connectivity in the original tree T. That is, solid paths are the parent paths of the dashed paths they define in T, and dashed paths d(v) are the parent paths of the solid paths whose nodes are descendants in T of node v. If each path in 𝒯 is represented as a biased binary tree (weight-balanced tree) using appropriate weights and if these individual trees are appropriately interconnected, then it is possible to obtain a final tree that is balanced. In particular, any two nodes in the original tree T are connected in the final tree through a path of length logarithmic in |T|. Moreover, there are efficient algorithms in the final tree for performing structural updates in the original tree; for instance, any subtree in T can change parent in time logarithmic in |T|.
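The solid/dashed edge classification can be sketched in Python as below. The rule used here, marking the edge to the child with the largest subtree as solid, is one simple way to guarantee at most one solid child per node and is an assumption for illustration; the exact weight rule of the dynamic-tree construction is not spelled out in the text above. The function names and the dict-of-lists tree encoding are likewise hypothetical.

```python
def solid_children(children: dict, root):
    """Map each internal node to its solid child (the one with the largest subtree)."""
    size = {}

    def subtree_size(v):
        size[v] = 1 + sum(subtree_size(c) for c in children.get(v, []))
        return size[v]

    subtree_size(root)
    return {v: max(cs, key=lambda c: size[c]) for v, cs in children.items() if cs}


def solid_paths(children: dict, root):
    """Partition the nodes into solid paths; dashed children start new paths."""
    solid = solid_children(children, root)
    paths, stack = [], [root]
    while stack:
        path = [stack.pop()]
        while path[-1] in solid:
            u = path[-1]
            # Children reached through dashed edges begin their own solid paths.
            stack.extend(c for c in children[u] if c != solid[u])
            path.append(solid[u])
        paths.append(path)
    return paths
```

Every node ends up on exactly one solid path, and the dashed edges are the ones that connect the top of a path to its parent in T.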
In some exemplary embodiments, the final (balanced) tree is used as the representation of the file system and also as the structure that will be used as a hashing scheme for defining the digest of the entire file system. This hashing scheme should be appropriately constructed so that it can be used to verify a broad class of query and update operations on the file system. The hashing scheme is used for authentication of path properties in trees. This hashing scheme is defined over trees of the form of the final tree and has the following important property: given two nodes in the original tree T, the hashing scheme can be used to efficiently authenticate any "property" of the path connecting the nodes in T. In particular, assuming that tree nodes are associated with values or attributes, one can authenticate any function of the values or attributes along a tree path if the function is computed by applying an associative operator over the individual values. More formally, let p = p' || p'' be a path in T that is the concatenation of paths p' and p''. A path property P satisfies the concatenation criterion if P(p) = F(P(p'), P(p'')), where F is a function that can be computed in O(1) time; e.g., a path property that satisfies this criterion is the length of the path, where F is "addition".
This exemplary hashing scheme is extended to authenticate path properties not only for paths in the original tree T (i.e., properties of paths related to the parent-child relation), but also for dashed paths in the intermediate tree 𝒯 (i.e., properties of paths related to siblings). This extension is performed by including in the hashing scheme information that is associated with the nodes of dashed paths, i.e., information associated with the files and subdirectories of any directory. Also, one can include in the dashed path d(v) related to v the node in T (file or subdirectory) that corresponds to the solid child of v in T (e.g., so that no file is missed). Finally, the exemplary hashing scheme can be augmented to include structural and balancing information related to the final tree: in such a case the hash value of any node in the final tree includes, for example, its sibling rank and weight.
Given the above, the operations of the file system should be related to certain path properties in T or dashed-path properties. In order to do this, define the appropriate path properties of interest: each node v of the tree is related with a constant-size set of node attributes {N_1(v), ..., N_k(v)}. As non-limiting examples, these can be the weight of v or other variables that one desires to relate with the node. The set of these node attributes may be referred to as the node property N(v) of this node. For the case of the file system, define the node property N(v) of a node v to contain at least two attributes: S(v) and C(v). S(v) is the name of the file or directory and C(v) is the hash of the file or directory. If node v represents a directory, define C(v) = ∅; otherwise C(v) is a hash of the node corresponding to file v. Similarly, every path p is related with a set of path attributes {P_1(p), ..., P_k(p)}. As non-limiting examples, these can be the length of a path or other variables that one would like to relate with this path. The set of these path attributes may be referred to as the path property P(p) of this path. The path attributes can be defined, for example, as a function of the corresponding node attributes. In this exemplary case, define the first path attribute S(p) of a path p = u_1, ..., u_ℓ as S(p) = ⊗_{i=1}^{ℓ} S(u_i). This is actually the name of the path (since ⊗ denotes "string concatenation"). The second path attribute may similarly be defined to be the content of the path C(p). Hence, C(p) = ⊕_{i=1}^{ℓ} C(u_i), where ⊕ is simply the union operator. Note that the content of a path that consists only of directories is empty. Note that the path property P(p) = (S(p), C(p)) for any path p = p' || p'' of the file system satisfies the concatenation criterion, since S(p) = S(p') ⊗ S(p'') and C(p) = C(p') ⊕ C(p'').
Further reference with regard to various aspects of the exemplary embodiments described above may be made to Goodrich2 (citation in Section 3 above).
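The concatenation criterion for the path property (S(p), C(p)) can be checked on a small example. In the Python sketch below, the names `node_property`, `combine` and `path_property`, the use of frozensets for contents, and the sample path components are all assumptions for illustration.

```python
from functools import reduce


def node_property(name: str, content_hash=None):
    """(S(v), C(v)): the name, plus a set holding the file hash (empty for a directory)."""
    contents = frozenset() if content_hash is None else frozenset([content_hash])
    return (name, contents)


def combine(left, right):
    """F: string concatenation on names, set union on contents."""
    return (left[0] + right[0], left[1] | right[1])


def path_property(nodes):
    """Fold the node properties of a non-empty path, in order, with F."""
    return reduce(combine, nodes)
```

Because both string concatenation and set union are associative, splitting a path p into p' and p'' at any point and combining the two partial results with `combine` gives the same value as folding over the whole path.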
4.1 AUTHENTICATED FILE SYSTEM USING DYNAMIC TREES
An exemplary application of the above formulation is now described within the framework of Goodrich2 for path-properties verification to support various authenticated file-system operations. Suppose the current directory is C. Let Π = π_1π_2...π_k be the argument (a directory path) of the operation in consideration. Let n denote the size of the file system. Below is shown how to authenticate the following operations, by reducing them to an appropriate path property query, using complexity analysis as described in Goodrich2. For the case of the dynamic trees, one has the following theorem:
Theorem 4.1 (Representation with Dynamic Trees) Assuming the existence of collision-resistant hash functions, there exists a secure, time-efficient and space-optimal authenticated storage scheme that is implemented with dynamic trees and achieves the following performance, where n is the size of the file system:
• The authentication of any path Π takes t(Π) = O(k + log n) query/verification time.
• Query operations cd(Π), read(Π) and update operations write(Π), rm(Π), mkdir(Π), and touch(Π) take t(Π) query/verification and update/query/verification time, respectively.
• Query operation ls(Π) takes t(Π) + O(ℓ + log n) = O(k + ℓ + log n) query/verification time, where ℓ is the size of the children list.
• Update operation rmdir(Π) takes t(Π) update/query/verification time.
• Update operation mv(Π, Π') takes t(Π) + t(Π') + O(log n) = O(k + k' + log n) update/query/verification time.
Proof of Theorem 4.1: The authentication of any path Π is a query for the name of the path, namely a query for a property of the path, which according to Goodrich2 takes time O(log n + k). For the operation ls(Π), one queries for the name of the dashed path d(π_k) that corresponds to node π_k (the names of the children of π_k). The query and verification time is O(log n + ℓ + k), where ℓ is the size of the children list. For the operation cd(Π), all one has to do is to query for the name of the path from π_1 to π_k (e.g., query for the attribute S(·)). This has query and verification time O(log n + k). Operation mkdir(Π) (or touch(Π) for a file) corresponds first to a cd operation (to authenticate the path Π) and then to a series of update operations (e.g., newTree(), link()) in the tree. These operations take time O(log n + k), O(1) and O(log n), respectively. Hence, the total time is O(log n + k). Likewise, operation rmdir(Π) (or rm(Π) for a file) corresponds first to a cd operation and then to a series of update operations (e.g., cut(), destroyTree()) in the tree. These operations take time O(log n + k), O(log n) and O(1), respectively. Hence, the total time is O(log n + k). Operation mv(Π, Π') (where Π' is the target directory) corresponds first to two cd operations (to authenticate the two paths Π and Π') and then to a series of update operations (e.g., cut(), link()) in the tree (which both take time O(log n)). Hence, the total time is O(log n + k + k'). For operation read(Π) (in this case last(Π) is a file), one needs to authenticate the contents of the file (i.e., that the file has not been tampered with) by querying for the contents of the path from node π_1 to π_k (e.g., query for the attribute C(·)). Both operations have query and verification time O(log n + k). For the operation write(Π) (in this case last(Π) is a file), suppose Π' is the new directory (i.e., the structure of the directory does not change, only the contents of the final node). This operation is a sequence of two previous operations: rm(Π) and touch(Π'). Hence, the complexity is O(log n + k) (note that k = k').
For all of the operations in Theorem 4.1 that query for a path property P(p), it may be the case that p does not exist (e.g., due to a wrong argument from the end-user). In this case, the exemplary authentication scheme can provide a verification proof for negative answers by proving the existence of the two neighboring sibling nodes in a dashed path where the error occurs. In essence, this is, again, a path property of a special type.
In Table 2, a comparison of the three presented implementations (local skip lists, global skip lists and dynamic trees) for various file system operations is shown. The indicated efficiencies are for query/verification time or update/query/verification time. In Table 2, n is the size of the file system, Π = π_1π_2...π_k is the directory argument, ℓ is the size of the children list and T is the subtree rooted at π_k. Note that the dynamic trees clearly outperform skip lists in comparatively "expensive" operations such as mkdir and mv.
operation | skip lists (local) | skip lists (global) | dynamic trees
cd(Π), read(Π), write(Π), rm(Π), mkdir(Π), touch(Π) | O(k log n) | O(log n + k) | O(log n + k)
ls(Π) | O((k + ℓ) log n) | [entry not recovered from image] | O(k + ℓ + log n)
rmdir(Π) | O((k + |T|) log n) | [entry not recovered from image] | O(k + log n)
mv(Π, Π') | O((k + k') log n) | [entry not recovered from image] | O(k + k' + log n)
Table 2
Note now that according to the exemplary authenticated storage model, the client should be able to verify the correctness of the operation and to update the digest of the whole data structure by using the consistency proof sent by the server whenever an update takes place. But how does the client update the digest after an update in the dynamic trees case? In the case shown above (see Section 3.1), where an authenticated skip list is used to implement the exemplary authenticated storage scheme, the server sends a path P with hash values and other information related to the update. For the dynamic tree case, the consistency proof is generally more complex. One exemplary consistency proof contains all the structural and balancing information and all the node and path attributes of the nodes that should be accessed by the update algorithm in order to perform the operation. Note that this information is included in the hashing scheme. Accordingly, the client has all the information that, once authenticated using the current state (digest), is required for locally performing the update and computing the new digest. One can easily show that this consistency proof has logarithmic size: since all the update operations described above take logarithmic time, they cannot visit more than O(log n + |Π|) nodes of the tree. Hence, the server can send structural and hashing information of size O(log n + |Π|) that allows the client to update the digest.
5. EXTENSIONS TO MULTIPLE USERS, RECOVERY AND PERSISTENCE
Multi-user Extension. The above-described exemplary authentication schemes are designed in the client-server model, thus achieving generality and widening the application areas in which they can provide solutions. However, one may expect that most real-life file-system integrity applications can involve a large number of users, for example, remotely and concurrently operating on a shared file system. Unfortunately, outsourced storage authentication inherently requires interaction (and thus incurs high communication overheads) in a completely general multi-client single-server model. To see why, assume any secure protocol for verifying the integrity of outsourced storage, and consider an update on the data performed and verified by user A. Consider the next operation on the data issued by user B. It is easy to see that without interaction, and even if users locally keep an unbounded state, replay attacks are generally difficult, perhaps even impossible, to defeat, since a malicious server can simply ignore A's updates on the data, "rewind" its authentication-related state to its previous version before A's update and successfully make A's updates invisible to user B. Indeed, without interaction and independently of the use of cryptography in the scheme, B is not able to distinguish between the event in which A did not operate on the data and the event in which A operated on the data but a replay attack occurred. Consequently, replay attacks may be difficult to avoid, perhaps even unavoidable, in this setting. This restriction can be removed if users interact; however, this may result in substantially impractical protocols since n users would need to exchange Ω(n) messages after any update.
Alternatively, communication assumptions could be made. Briefly consider how the exemplary protocols can be applied in two such restricted multi-user settings, by making a general (and easy to meet in practice) communication assumption. First, if one assumes that different users belong to the same organizational unit and access a remote file system through the same network infrastructure, then the exemplary protocols are applicable to a single designated client, trusted by the users, which serializes all users' operations, verifying each operation locally. For instance, this may be the setting in a networked file system where many users share and operate on files that can be physically stored in remote and untrusted storage units, yet all users' requests are serialized in the system's filer. In this case, an exemplary verification client can constitute an add-on module of the hosting operating system kernel that runs in parallel with the system's filer. Also, if one assumes that different users are online from geographically remote (and even mobile) locations but can share a trusted storage of constant size, then the exemplary protocols are applicable by simply having the authentication digest of the file system be stored in this shared storage unit (and by possibly enforcing certain locking mechanisms for achieving concurrency), where verification of operations is performed locally at/by the users. For instance, users may share a secure web page or a file that is stored at (e.g., a single, trusted node of) a p2p storage network (e.g., accessed even by an untrusted node of it and using secure p2p searching techniques).
Recovery and Persistence. Consider some additional issues related to failure recovery and persistent authentication in a real-life usage of the exemplary authentication protocols and implementation of Athos' architecture. In the case of an unsuccessful verification of a file system operation, in further exemplary embodiments Athos can provide to the higher (or hosting) application complete information about the problematic operation and the current state of the file system in terms of its integrity. In particular, and as a non-limiting example, Athos functionality can characterize the exact location in the file system where integrity was not verified and thus pinpoint which file or directory was maliciously (or accidentally) modified by the untrusted server or by the remote storage devices. By keeping appropriate additional information, for example, the higher application is thus able to infer useful information for failure recovery and a complete view of the problem. For instance, one can find which concrete user and with which concrete operation most recently, correctly accessed the (currently problematic) file or directory. Additionally, and as a further example, by using the exemplary skip-list based authentication approach in combination with existing techniques for authenticating membership queries in the past (i.e., queries that span through previous states of a data set), Athos can offer persistent authentication capabilities, where file-system operations or queries about past views of the file system can be issued and authenticated. This property of Athos may be significant, since it can be useful for supporting a secure audit of the entire outsourced file system.
6. FURTHER DESCRIPTION OF EXEMPLARY IMPLEMENTATIONS
The exemplary embodiments of the invention generally relate to interactions between a user and an untrusted server. The exemplary embodiments of this invention may be implemented by one or more of the parties involved. As non-limiting examples, the user may comprise an electronic device or a portable electronic device. Such an electronic device may itself comprise at least one data processor, at least one memory, a communication component (e.g., a transceiver), and a user interface comprising a user input (e.g., mouse, keyboard, keypad, joystick, touchscreen, touchpad) and a display device (e.g., display, monitor, screen, touchscreen, liquid crystal display). As additional non-limiting examples, the user may comprise a software program or a plug-in application attached to another program. As a non-limiting example, the server may comprise a web service running on a distributed collection of computers accessible via the internet.
As a non-limiting example for implementing the exemplary embodiments, a cryptographic component may be employed. As further non-limiting examples, the cryptographic component may be a separate entity (e.g., an integrated circuit, an Application Specific Integrated Circuit or ASIC) or may be integrated with other components (e.g., a program run by a data processor, functionality enabled by a data processor). Such a cryptographic component may perform various functions including encryption, cryptographic hashing, digital signature creation, digital signature verification and decryption, as non-limiting examples. As non-limiting examples, a system implementing the exemplary embodiments of this invention may comprise a private network (e.g., local area network - LAN), a public network (e.g., a publicly available wireless local area network - WLAN), or the internet.
The exemplary embodiments of this invention may be carried out by computer software implemented by a data processor or by hardware, or by a combination of hardware and software. As a non-limiting example, the exemplary embodiments of this invention may be implemented by one or more integrated circuits.
As an example, FIG. 4 illustrates a simplified block diagram of various exemplary electronic devices that are suitable for use in practicing the exemplary embodiments of this invention. FIG. 4 shows a system 400 having a client 402 and a server 412.
The client 402 has a data processor (DP) 404, a memory (MEM) 406 coupled to the DP 404 and a transceiver (TRANS) 408 coupled to the DP 404. The TRANS 408 enables bidirectional communication with the server 412. The MEM 406 stores a digest 410 in accordance with exemplary embodiments of the invention, as further described herein. The client 402 may comprise any suitable electronic device.
The server 412 has a data processor (DP) 414, a memory (MEM) 416 coupled to the DP 414 and a transceiver (TRANS) 418 coupled to the DP 414. The TRANS 418 enables bidirectional communication with the client 402. The MEM 416 stores a file system (FS) 420 and an authentication service (AS) 422 in accordance with exemplary embodiments of the invention, as further described herein. Note that in other exemplary embodiments, the functionality of FS 420 and the AS 422 may be stored in or provided by separate components (e.g., two memories, two circuits, two integrated circuits, two processors). The server 412 may comprise any suitable electronic device.
The MEMs 406, 416 may be of any type appropriate to the technical environment and may be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory and removable memory, as non-limiting examples. The DPs 404, 414 may be of any type appropriate to the technical environment, and may encompass one or more of microprocessors, general purpose computers, special purpose computers and processors based on a multi-core architecture, as non-limiting examples.
Exemplary embodiments of the invention or various aspects thereof, such as the authentication service, as a non-limiting example, may be implemented as a computer program stored by the respective MEM 406, 416 and executable by the respective DP 404, 414.
7. VARIOUS EXEMPLARY EMBODIMENTS
Below are further descriptions of various non-limiting, exemplary embodiments of the invention. The below-described exemplary embodiments are numbered separately for clarity purposes. This numbering should not be construed as entirely separating the various exemplary embodiments since aspects of one or more exemplary embodiments may be practiced in conjunction with one or more other aspects or exemplary embodiments.
(1) In one exemplary embodiment, and as shown in FIG. 5, a method for a client unit to interact with a file system stored by an untrusted server unit, comprising: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure (501); issuing to the untrusted server unit an operation to be performed on the file system (502); and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit (503).
A method as above, further comprising: verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence. A method as in the previous, wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued only if the proof for the first operation is verified to be authentic. A method as in any above, further comprising: in response to an unsuccessful verification, obtaining information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system. A method as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, the method further comprising: in response to determining that the proof is authentic, determining that the answer is authentic. A method as in any above, wherein the digest comprises a first digest and the operation comprises an update operation, the method further comprising: in response to determining that the proof is authentic, using the proof to compute a second digest and storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system. A method as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. 
A method as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
A method as in any above, wherein the digest comprises only the cryptographic hash value. A method as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. A method as in any above, wherein the tree structure comprises a skip list or a dynamic tree. A method as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. A method as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
A method as in any above, wherein the client unit arbitrates communication with the untrusted server unit on behalf of a plurality of users. A method as in any above, wherein the memory comprises a shared storage unit that is also accessible by at least one other client unit. A method as in any above, wherein a storage requirement of the client unit for storing the digest remains substantially the same over time. A method as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. A method as in any above, wherein the method is implemented by a computer program. A method as in any above, wherein the method is implemented by a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising the steps of performing the method.
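As a non-limiting illustration of the client-side steps of embodiment (1), the following is a minimal sketch under stated assumptions. It substitutes a plain binary hash tree for the skip list or dynamic tree described herein, uses SHA-256 as the collision-resistant hash, and represents a proof as a list of sibling hashes. The names `Client`, `recompute_digest`, `leaf_hash` and `node_hash` are hypothetical and do not appear in the exemplary embodiments.

```python
import hashlib
from typing import List, Tuple

# Assumption: a proof is a leaf-to-root list of (side, sibling_hash) pairs,
# where side is "L" if the sibling lies to the left of the current node.
Proof = List[Tuple[str, bytes]]


def h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts, so concatenations are unambiguous."""
    m = hashlib.sha256()
    for p in parts:
        m.update(len(p).to_bytes(4, "big"))
        m.update(p)
    return m.digest()


def leaf_hash(value: bytes) -> bytes:
    return h(b"leaf", value)


def node_hash(left: bytes, right: bytes) -> bytes:
    return h(b"node", left, right)


def recompute_digest(value: bytes, proof: Proof) -> bytes:
    """Fold the sibling hashes in the proof from the leaf up to the root."""
    acc = leaf_hash(value)
    for side, sibling in proof:
        acc = node_hash(sibling, acc) if side == "L" else node_hash(acc, sibling)
    return acc


class Client:
    """Hypothetical client unit: its only persistent state is the digest."""

    def __init__(self, digest: bytes) -> None:
        self.digest = digest

    def verify_query(self, answer: bytes, proof: Proof) -> bool:
        # The answer is accepted only if the proof digest matches the stored one.
        return recompute_digest(answer, proof) == self.digest

    def verify_update(self, old: bytes, new: bytes, proof: Proof) -> bool:
        # First authenticate the pre-update state against the stored digest,
        # then derive the second digest from the same proof and store it.
        if recompute_digest(old, proof) != self.digest:
            return False
        self.digest = recompute_digest(new, proof)
        return True
```

In this sketch an update that replaces one leaf leaves the sibling hashes unchanged, so the same proof serves both to authenticate the old state and to compute the second digest. Updates that change the shape of the tree (e.g., mkdir or rm) would additionally need the structural information mentioned above, which the sketch does not model.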
(2) In another exemplary embodiment, a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions to interact with a file system stored by an untrusted server unit, the actions comprising: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure; issuing to the untrusted server unit an operation to be performed on the file system; and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit.
A program storage device as above, the actions further comprising: verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence. A program storage device as in the previous, wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued only if the proof for the first operation is verified to be authentic. A program storage device as in any above, the actions further comprising: in response to an unsuccessful verification, obtaining information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system.
A program storage device as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, the actions further comprising: in response to determining that the proof is authentic, determining that the answer is authentic. A program storage device as in any above, wherein the digest comprises a first digest and the operation comprises an update operation, the actions further comprising: in response to determining that the proof is authentic, using the proof to compute a second digest and storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system. A program storage device as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. A program storage device as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
A program storage device as in any above, wherein the digest comprises only the cryptographic hash value. A program storage device as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. A program storage device as in any above, wherein the tree structure comprises a skip list or a dynamic tree. A program storage device as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. A program storage device as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
A program storage device as in any above, wherein the client unit arbitrates communication with the untrusted server unit on behalf of a plurality of users. A program storage device as in any above, wherein the memory comprises a shared storage unit that is also accessible by at least one other client unit. A program storage device as in any above, wherein a storage requirement of the client unit for storing the digest remains substantially the same over time. A program storage device as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
(3) In another exemplary embodiment, an apparatus comprising: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the transceiver a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor.
An apparatus as above, wherein the data processor is further configured to verify the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence. An apparatus as in the previous, wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed by the data processor for a second operation, wherein the second operation is issued by the data processor only if the proof for the first operation is verified to be authentic. An apparatus as in any above, wherein the data processor is further configured, in response to an unsuccessful verification, to obtain information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system.
An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, wherein the data processor is further configured, in response to determining that the proof is authentic, to determine that the answer is authentic. An apparatus as in any above, wherein the digest comprises a first digest and the operation comprises an update operation, wherein the data processor is further configured, in response to determining that the proof is authentic, to use the proof to compute a second digest and to store said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system. An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. An apparatus as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
An apparatus as in any above, wherein the digest comprises only the cryptographic hash value. An apparatus as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. An apparatus as in any above, wherein the tree structure comprises a skip list or a dynamic tree. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
An apparatus as in any above, wherein the apparatus arbitrates communication with the untrusted server unit on behalf of a plurality of users. An apparatus as in any above, wherein the memory comprises a shared storage unit that is also accessible by at least one other apparatus. An apparatus as in any above, wherein a storage requirement of the memory for storing the digest remains substantially the same over time. An apparatus as in any of the above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. An apparatus as in any above, wherein the apparatus comprises a client electronic device.
(4) In another exemplary embodiment, an apparatus comprising: means for storing a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; means for communicating; and means for issuing to the untrusted server unit via the means for communicating an operation to be performed on the file system, wherein the means for communicating is further for receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the apparatus. An apparatus as above, further comprising: means for re-computing the digest using the received proof. An apparatus as in any above, wherein the means for storing comprises a memory, the means for communicating comprises a transceiver, the means for issuing comprises a data processor. An apparatus as in the previous, wherein the means for re-computing comprises the data processor.
An apparatus as in any above, further comprising means for verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence. An apparatus as in the previous, wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued by the means for issuing only if the proof for the first operation is verified to be authentic. An apparatus as in any above, wherein the means for verifying comprises the data processor. An apparatus as in any above, further comprising means for obtaining, in response to an unsuccessful verification, information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system. An apparatus as in any above, wherein the means for obtaining comprises the data processor.
An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, wherein the means for verifying is further for, in response to determining that the proof is authentic, determining that the answer is authentic. An apparatus as in any above, wherein the digest comprises a first digest and the operation comprises an update operation, the apparatus further comprising: means for, in response to determining that the proof is authentic, using the proof to compute a second digest and means for storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system. An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. An apparatus as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
An apparatus as in any above, wherein the digest comprises only the cryptographic hash value. An apparatus as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. An apparatus as in any above, wherein the tree structure comprises a skip list or a dynamic tree. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
An apparatus as in any above, wherein the apparatus arbitrates communication with the untrusted server unit on behalf of a plurality of users. An apparatus as in any above, wherein the memory comprises a shared storage unit that is also accessible by at least one other apparatus. An apparatus as in any above, wherein a storage requirement of the means for storing the digest remains substantially the same over time. An apparatus as in any of the above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. An apparatus as in any above, wherein the apparatus comprises a client electronic device.
(5) In another exemplary embodiment, and as illustrated in FIG. 6, a method comprising: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system (601); receiving from a client unit an instruction to perform an operation on the file system (602); and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure (603).
A method as above, further comprising: performing the operation on the stored file system. A method as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. A method as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation. A method as in any above, wherein transmission of the proof to the client unit is performed by an authentication service stored in a second memory accessible by the untrusted server unit.
A method as in any above, wherein the digest comprises only the cryptographic hash value. A method as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. A method as in any above, wherein the tree structure comprises a skip list or a dynamic tree. A method as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. A method as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name. A method as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. A method as in any above, wherein the method is implemented by a computer program. A method as in any above, wherein the method is implemented by a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising the steps of performing the method.
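As a non-limiting illustration of the server-side steps of embodiment (5), the following is a minimal sketch under stated assumptions. It again substitutes a plain binary hash tree over a power-of-two number of entries for the skip list or dynamic tree described herein, and it rebuilds the whole tree on each update for brevity, whereas an authenticated skip list would touch only a logarithmic number of nodes. The names `Server`, `query`, `update` and `recompute` are hypothetical.

```python
import hashlib


def h(tag: bytes, *parts: bytes) -> bytes:
    """SHA-256 with a fixed-length domain tag and length-prefixed parts."""
    m = hashlib.sha256(tag)
    for p in parts:
        m.update(len(p).to_bytes(4, "big") + p)
    return m.digest()


class Server:
    """Hypothetical untrusted server unit: stores the entries and the hash
    tree, and answers each instruction with a (result, proof) pair."""

    def __init__(self, entries):
        n = len(entries)
        if n == 0 or n & (n - 1):
            # Simplifying assumption of this sketch only.
            raise ValueError("number of entries must be a power of two")
        self.entries = list(entries)
        self._rebuild()

    def _rebuild(self):
        # levels[0] holds the leaf hashes; levels[-1] holds the single root.
        level = [h(b"leaf", e) for e in self.entries]
        self.levels = [level]
        while len(level) > 1:
            level = [h(b"node", level[i], level[i + 1])
                     for i in range(0, len(level), 2)]
            self.levels.append(level)

    def digest(self):
        return self.levels[-1][0]

    def _proof(self, index):
        # Collect the sibling hash at every level on the leaf-to-root path.
        proof = []
        for level in self.levels[:-1]:
            sib = index ^ 1
            proof.append(("L" if sib < index else "R", level[sib]))
            index //= 2
        return proof

    def query(self, index):
        return self.entries[index], self._proof(index)

    def update(self, index, new_value):
        # The proof is taken against the state before the update.
        old, proof = self.entries[index], self._proof(index)
        self.entries[index] = new_value
        self._rebuild()
        return old, proof


def recompute(value, proof):
    """What a client would do with the proof: fold it up to a digest."""
    acc = h(b"leaf", value)
    for side, sib in proof:
        acc = h(b"node", sib, acc) if side == "L" else h(b"node", acc, sib)
    return acc
```

For example, `Server([b"/", b"/etc", b"/home", b"/tmp"]).query(2)` returns the entry `b"/home"` together with a two-element proof, and `recompute` applied to that pair yields the same value as `digest()`.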
(6) In another exemplary embodiment, a program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing actions, the actions comprising: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system; receiving from a client unit an instruction to perform an operation on the file system; and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure. A program storage device as above, the actions further comprising: performing the operation on the stored file system. A program storage device as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. A program storage device as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation. A program storage device as in any above, wherein transmission of the proof to the client unit is performed by an authentication service stored in a second memory accessible by the untrusted server unit.
A program storage device as in any above, wherein the digest comprises only the cryptographic hash value. A program storage device as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. A program storage device as in any above, wherein the tree structure comprises a skip list or a dynamic tree. A program storage device as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. A program storage device as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name. A program storage device as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
(7) In another exemplary embodiment, an apparatus comprising: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure. An apparatus as above, wherein the data processor is further configured to perform the operation on the file system. An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. An apparatus as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation. An apparatus as in any above, wherein transmission of the proof to the client unit is performed via an authentication service stored in a second memory accessible by the data processor. An apparatus as in any above, further comprising: a memory configured to store the file system.
An apparatus as in any above, wherein the digest comprises only the cryptographic hash value. An apparatus as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. An apparatus as in any above, wherein the tree structure comprises a skip list or a dynamic tree. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name. An apparatus as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. An apparatus as in any above, wherein the apparatus comprises an untrusted server.
(8) In another exemplary embodiment, an apparatus comprising: means for receiving from a client unit an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system; means for transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure. An apparatus as above, wherein the means for receiving comprises a receiver and the means for transmitting comprises a transmitter. An apparatus as in any above, wherein the means for receiving and the means for transmitting comprise a transceiver.
An apparatus as in any above, further comprising: means for performing the operation on the file system. An apparatus as in the previous, wherein the means for performing comprises a data processor. An apparatus as in any above, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation. An apparatus as in any above, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation. An apparatus as in any above, wherein transmission of the proof to the client unit is performed via an authentication service stored in a means for storage accessible by the apparatus. An apparatus as in any above, further comprising: means for storing the file system. An apparatus as in the previous, wherein the means for storing comprises a memory.
An apparatus as in any above, wherein the digest comprises only the cryptographic hash value. An apparatus as in any above, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv. An apparatus as in any above, wherein the tree structure comprises a skip list or a dynamic tree. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number. An apparatus as in any above, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name. An apparatus as in any above, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value. An apparatus as in any above, wherein the apparatus comprises an untrusted server.
(9) In another exemplary embodiment, a system comprising: a client unit comprising: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor; and an untrusted server unit comprising: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
A system as above, further comprising one or more additional aspects of the exemplary embodiments of the invention as further described herein.
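The exchange between the client unit and the untrusted server unit for a query operation can be sketched as follows. This is a simplified stand-in, not the structure described in the exemplary embodiments: it uses a plain binary hash tree over a sorted list of files instead of a skip list or dynamic tree, and the helper names (`H`, `leaf`, `build_levels`, `prove`, `recompute_digest`) and file names are hypothetical.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def leaf(name: str, data: bytes) -> bytes:
    # File names cannot contain NUL, so it is a safe separator here.
    return H(b"\x00", name.encode(), b"\x00", data)

def build_levels(leaves):
    # levels[0] is the leaf layer; the last level holds the single root hash.
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # pad odd-sized layers by repeating the last hash
            cur = cur + [cur[-1]]
        levels.append([H(b"\x01", cur[i], cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    # Server side: collect the sibling hash at every layer below the root.
    proof = []
    for layer in levels[:-1]:
        sib = index ^ 1
        sibling = layer[sib] if sib < len(layer) else layer[index]
        proof.append((sibling, index % 2 == 1))  # True: sibling is on the left
        index //= 2
    return proof

def recompute_digest(name, data, proof):
    # Client side: rebuild the root hash from the result and the proof.
    h = leaf(name, data)
    for sibling, sibling_is_left in proof:
        h = H(b"\x01", sibling, h) if sibling_is_left else H(b"\x01", h, sibling)
    return h

files = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma", "d.txt": b"delta"}
names = sorted(files)
levels = build_levels([leaf(n, files[n]) for n in names])
client_digest = levels[-1][0]  # the only state the client keeps

# The server answers a read of "c.txt" with a result and a proof.
result = files["c.txt"]
proof = prove(levels, names.index("c.txt"))

ok = recompute_digest("c.txt", result, proof) == client_digest
tampered_ok = recompute_digest("c.txt", b"GAMMA", proof) == client_digest
```

In this sketch the proof holds one sibling hash per tree level, so its length grows with the logarithm of the number of files, while the client stores a single hash value.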
8. ADDITIONAL CONSIDERATIONS
Some exemplary embodiments of the invention provide an efficient way to authenticate an outsourced untrusted file system. In various exemplary embodiments, skip lists and dynamic trees are applied and an efficient hashing scheme may be used to represent the entire file system with a small, constant-size digest. That is, the client maintains a constant-size state of the entire file system. This state is an efficient representation both of the contents of the files and the hierarchy of the file system. This allows for applications where low-computing-power and/or low-storage devices (e.g., sensors, smartcards, portable storage devices such as flash cards) are used to access an outsourced file system in a secure way. In addition, there is no authentication information on the client's side that is proportional to the size of the file system, something that is the case for most previous techniques. As further described, in various exemplary embodiments the set of query and update operations on the file system can efficiently be authenticated. The client authenticates these operations by receiving the verification or consistency proof from the server. Common and important file system operations, such as cd and ls, can be authenticated in logarithmic time.
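The constant-size client state can be illustrated for an update operation with the sketch below, which assumes a fixed four-leaf binary hash tree and an in-place overwrite of one block. The exemplary embodiments use skip lists and dynamic trees, where an update may also change the structure; that case is not modeled here. The names `H`, `fold` and `root_of` and the block contents are hypothetical.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def fold(leaf_hash: bytes, path) -> bytes:
    """Hash from a leaf up to the root along (sibling, sibling_is_left) pairs."""
    h = leaf_hash
    for sibling, sibling_is_left in path:
        h = H(b"\x01", sibling, h) if sibling_is_left else H(b"\x01", h, sibling)
    return h

def root_of(l):
    """Root of a fixed four-leaf tree (server side)."""
    return H(b"\x01", H(b"\x01", l[0], l[1]), H(b"\x01", l[2], l[3]))

# Server state: four file blocks. Client state: one 32-byte digest.
leaves = [H(b"\x00", d) for d in (b"alpha", b"beta", b"gamma", b"delta")]
client_digest = root_of(leaves)

# The client asks to overwrite block 1; the server returns the old content
# and the sibling hashes on the path from block 1 to the root.
old_content, new_content = b"beta", b"BETA"
path = [(leaves[0], True), (H(b"\x01", leaves[2], leaves[3]), False)]

# Step 1: the proof must reproduce the digest the client already holds.
if fold(H(b"\x00", old_content), path) != client_digest:
    raise ValueError("proof rejected: it does not match the stored digest")

# Step 2: the same path yields the digest of the updated file system,
# which replaces the old digest.
client_digest = fold(H(b"\x00", new_content), path)

# The server applies the write; its root should now equal the client's digest.
leaves[1] = H(b"\x00", new_content)
server_root = root_of(leaves)
```

The client never stores the leaves or the path. Before and after the update it holds exactly one hash value, which is the property the paragraph above describes.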
Although described above primarily in reference to various UNIX commands, the exemplary embodiments of the invention are not limited solely to a UNIX system nor the identified UNIX commands, and may be utilized in conjunction with other suitable systems, commands or architectures.
Any use of the terms "connected," "coupled" or variants thereof should be interpreted to indicate any such connection or coupling, direct or indirect, between the identified elements. As a non-limiting example, one or more intermediate elements may be present between the "coupled" elements. The connection or coupling between the identified elements may be, as non-limiting examples, physical, electrical, magnetic, logical or any suitable combination thereof in accordance with the described exemplary embodiments. As non-limiting examples, the connection or coupling may comprise one or more printed electrical connections, wires, cables or any suitable combination thereof.
Generally, various exemplary embodiments of the invention can be implemented in different mediums, such as software, hardware, logic, special purpose circuits or any combination thereof. As a non-limiting example, some aspects may be implemented in software which may be run on a computing device, while other aspects may be implemented in hardware.
The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the best method and apparatus presently contemplated by the inventors for carrying out the invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. All such and similar modifications will nevertheless still fall within the scope of the teachings of the exemplary embodiments of the invention.
Furthermore, some of the features of the preferred embodiments of this invention could be used to advantage without the corresponding use of other features. As such, the foregoing description should be considered as merely illustrative of the principles of the invention, and not in limitation thereof.

Claims

What is claimed is:
1. A method for a client unit to interact with a file system stored by an untrusted server unit, comprising: storing in a memory accessible by the client unit a digest representative of the file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure; issuing to the untrusted server unit an operation to be performed on the file system; and receiving a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the client unit.
2. A method as in claim 1, further comprising: verifying the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
3. A method as in claim 2, wherein the operation comprises a first operation, wherein the steps of issuing, receiving and verifying are performed for a second operation, wherein the second operation is issued only if the proof for the first operation is verified to be authentic.
4. A method as in claim 2 or 3, further comprising: in response to an unsuccessful verification, obtaining information comprising at least one of the operation that led to the unsuccessful verification and a current integrity state of the file system.
5. A method as in any one of claims 2-4, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation, the method further comprising: in response to determining that the proof is authentic, determining that the answer is authentic.
6. A method as in any one of claims 2-4, wherein the digest comprises a first digest and the operation comprises an update operation, the method further comprising: in response to determining that the proof is authentic, using the proof to compute a second digest and storing said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
7. A method as in any one of claims 1-4, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
8. A method as in any one of claims 1-4, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
9. A method as in any one of claims 1-8, wherein the digest comprises only the cryptographic hash value.
10. A method as in any one of claims 1-9, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
11. A method as in any one of claims 1-10, wherein the tree structure comprises a skip list or a dynamic tree.
12. A method as in any one of claims 1-10, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
13. A method as in any one of claims 1-10, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
14. A method as in any one of claims 1-13, wherein the client unit arbitrates communication with the untrusted server unit on behalf of a plurality of users.
15. A method as in any one of claims 1-14, wherein the memory comprises a shared storage unit that is also accessible by at least one other client unit.
16. A method as in any one of claims 1-15, wherein a storage requirement of the client unit for storing the digest remains substantially the same over time.
17. A method as in any one of claims 1-16, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
18. A method as in any one of claims 1-17, wherein the method is implemented by a computer program.
19. An apparatus comprising: a memory configured to store a digest representative of a file system, wherein a tree structure corresponds to the file system, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure, wherein the file system is stored by an untrusted server unit; a transceiver; and a data processor configured to issue to the untrusted server unit via the transceiver an operation to be performed on the file system, wherein the data processor is further configured to receive via the communication component a result and a proof in response to the operation, wherein the proof comprises information that enables re-computation of the digest by the data processor.
20. An apparatus as in claim 19, wherein the data processor is further configured to verify the proof to determine if the proof is authentic, wherein said verification comprises utilizing the proof to compute a proof digest and comparing the computed proof digest with the stored digest to determine a correspondence.
21. An apparatus as in claim 20, wherein the digest comprises a first digest and the operation comprises an update operation, wherein the data processor is further configured, in response to determining that the proof is authentic, to use the proof to compute a second digest and to store said second digest in place of the first digest, wherein the second digest is representative of an updated file system comprising the file system after the update operation has been performed, wherein said second digest comprises a second cryptographic hash value over the updated file system that includes structural and balancing information for a second tree structure corresponding to said updated file system.
22. An apparatus as in any one of claims 19-21, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
23. An apparatus as in any one of claims 19-22, wherein the digest comprises only the cryptographic hash value.
24. An apparatus as in any one of claims 19-23, wherein the tree structure comprises a skip list or a dynamic tree.
25. An apparatus as in any one of claims 19-24, wherein a storage requirement of the memory for storing the digest remains substantially the same over time.
26. An apparatus as in any one of claims 19-25, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
27. An apparatus as in any one of claims 19-26, wherein the apparatus comprises a client electronic device.
28. A method comprising: storing in a memory accessible by an untrusted server unit a file system, wherein a tree structure corresponds to the file system; receiving from a client unit an instruction to perform an operation on the file system; and transmitting to the client unit, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
29. A method as in claim 28, further comprising: performing the operation on the stored file system.
30. A method as in claim 28 or 29, wherein the operation comprises a query operation and wherein the result comprises an answer to the query operation.
31. A method as in claim 28 or 29, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
32. A method as in any one of claims 28-31, wherein transmission of the proof to the client unit is performed by an authentication service stored in a second memory accessible by the untrusted server unit.
33. A method as in any one of claims 28-32, wherein the digest comprises only the cryptographic hash value.
34. A method as in any one of claims 28-33, wherein the operation comprises at least one UNIX command, wherein the at least one UNIX command comprises at least one of: cd, read, write, rm, mkdir, touch, ls, rm, rmdir and mv.
35. A method as in any one of claims 28-34, wherein the tree structure comprises a skip list or a dynamic tree.
36. A method as in any one of claims 28-34, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding i-node number.
37. A method as in any one of claims 28-34, wherein the tree structure comprises a skip list having a plurality of nodes, wherein each node is identified by a corresponding path name.
38. A method as in any one of claims 28-37, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
39. A method as in any one of claims 28-38, wherein the method is implemented by a computer program.
40. An apparatus comprising: a transceiver; and a data processor configured to receive from a client unit via the transceiver an instruction to perform an operation on a file system, wherein the apparatus is configured to access the file system, wherein a tree structure corresponds to the file system, wherein the data processor is further configured to transmit to the client unit via the transceiver, in response to the instruction, a result and a proof, wherein the proof comprises information that enables re-computation of a digest by the client unit, wherein the digest comprises a cryptographic hash value over the tree structure that includes structural and balancing information for the tree structure.
41. An apparatus as in claim 40, wherein the data processor is further configured to perform the operation on the file system.
42. An apparatus as in claim 40 or 41, further comprising: a memory configured to store the file system.
43. An apparatus as in any one of claims 40-42, wherein the operation comprises an update operation and wherein the proof further comprises structural information necessary to perform the update operation.
44. An apparatus as in any one of claims 40-43, wherein the digest comprises only the cryptographic hash value.
45. An apparatus as in any one of claims 40-44, wherein the tree structure comprises a skip list or a dynamic tree.
46. An apparatus as in any one of claims 40-45, wherein the cryptographic hash value comprises a collision-resistant cryptographic hash value.
47. An apparatus as in any one of claims 40-46, wherein the apparatus comprises an untrusted server.
PCT/US2007/024642 2006-11-30 2007-11-30 Authentication for operations over an outsourced file system stored by an untrusted unit WO2008147400A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US86179406P 2006-11-30 2006-11-30
US60/861,794 2006-11-30

Publications (1)

Publication Number Publication Date
WO2008147400A1 (en) 2008-12-04

Family

ID=40075406

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/024642 WO2008147400A1 (en) 2006-11-30 2007-11-30 Authentication for operations over an outsourced file system stored by an untrusted unit

Country Status (1)

Country Link
WO (1) WO2008147400A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030088783A1 (en) * 2001-11-06 2003-05-08 Dipierro Massimo Systems, methods and devices for secure computing
US20040243816A1 (en) * 2003-05-30 2004-12-02 International Business Machines Corporation Querying encrypted data in a relational database system
US20040250113A1 (en) * 2003-04-16 2004-12-09 Silicon Graphics, Inc. Clustered filesystem for mix of trusted and untrusted nodes
US20050091261A1 (en) * 2003-10-02 2005-04-28 Agency For Science, Technology And Research Method for incremental authentication of documents

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8594329B2 (en) 2010-12-17 2013-11-26 Microsoft Corporation Non-interactive verifiable, delegated computation
GB2529246A (en) * 2014-08-15 2016-02-17 Ibm Method for securing integrity and consistency of a cloud storage service with efficient client operations
US9589153B2 (en) 2014-08-15 2017-03-07 International Business Machines Corporation Securing integrity and consistency of a cloud storage service with efficient client operations
WO2018005403A1 (en) * 2016-06-30 2018-01-04 Microsoft Technology Licensing, Llc Controlling verification of key-value stores
US10396991B2 (en) 2016-06-30 2019-08-27 Microsoft Technology Licensing, Llc Controlling verification of key-value stores
EP4244750A4 (en) * 2022-07-01 2024-03-13 Space And Time Labs Inc Methods for verifying database query results and devices thereof

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07875045; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 07875045; Country of ref document: EP; Kind code of ref document: A1)