15    Serialization/Deserialization

15.1   Overview

Occasionally, bulk data movement is needed between, into, or out of clouds. Cloud serialization operations provide a means to normalize data to a canonical, self-describing format, which includes:

   data migration between clouds,

   data migration during upgrades (or replacements) of cloud implementations, and

   robust backup.

The canonical format of serialized data describes how the data is to be represented in a byte stream. As long as this byte stream is not altered during the transfer from source to destination, the data may be reconstituted on the destination system.

15.2   Exporting Serialized Data

A canonical encoding of the data is obtained by creating a new data object and specifying that the source for the creation is to serialize a given CDMI™ data object, container, or queue. On a successful serialization, the result shall be a data object that is created with the serialized data as its value. If a container has an exported block protocol, the serialized data may contain the block-by-block contents of that container along with its metadata.

The resulting data object that is produced is the canonical representation of the selected data object, container and children, or queue.

   If the source specified is a data object, the canonical format shall contain all data object fields, including the value, valuetransferencoding, and metadata fields.

   If the source being specified is a queue, the canonical format shall contain all queue fields, including the value and valuetransferencoding fields of enqueued items, along with the metadata of the queue itself.

   If the source being specified is a container, the canonical format shall contain all container fields, recursively, including all children of the container. If a user attempts to serialize a container that includes children that the user, who is performing the serialization operation, does not have permission to read, these objects shall not be included in the resulting serialized object.

When performing a serialization operation, objects shall only be included if the principal initiating the serialization has sufficient permissions to read those objects.

15.3   Importing Serialized Data

Canonical data may be deserialized back into the cloud by creating a new data object, container object, or queue object and by specifying that the source for the creation is to deserialize a given CDMI data object or by specifying the serialized data in base 64 encoding in the deserializevalue field.

The destination may or may not exist previously. If not, a create operation is performed. If a container already exists, an update operation with serialized children shall update the container and all children. If the serialized container object does not contain children, only the container object is updated. Data objects are recreated as specified in the canonical format, including all metadata and the data object ID.

   If the user who is deserializing a serialized data object has the "cross_domain" privilege and has not specified a domainURI as part of the deserialize operation, the original domainURIs from the serialized object shall be used. If any of the specified domainURIs are not valid in the context of the storage system on which the deserialization operation is being performed, the entire deserialize operation shall fail.

   If the user who is deserializing a serialized object specifies a domainURI as part of the deserialize operation, the domainURI of every object being deserialized shall be set to the specified domainURI. To specify a domainURI other than the domainURI of the parent, the user shall have the cross_domain privilege. If the user does not have the cross_domain privilege and specifies a domainURI other than the domainURI of the parent, a 400 Bad Request response shall be returned.

   If the user who is deserializing a serialized object does not specify a domainURI and does not have the "cross_domain" privilege, then the deserialization operation shall only be successful if all objects have the same domainURI as the parent object on which the deserialization operation is being performed.

Deserialization operations shall restore all metadata from the specified source. If the original provider of the serialized data-supported vendor extensions is through custom metadata keys and values, then these customized requirements shall be restored when deserialized. However, the custom metadata keys and values may be treated as user metadata (preserved, but not interpreted) by the destination provider. Preservation allows custom data requirements to move between clouds without losing this information.

15.3.1   Canonical Format

The canonical format shall represent specified data objects and containers, as they exist within the storage system. Each object shall be represented by the metadata for the object, identifiers, and the data stream contents of the data object. Because metadata is inherited from enclosing containers, all parent metadata shall be represented in the canonical format (essentially flattening the hierarchy). To preserve the actual metadata values that apply to the data object that is being serialized, the non-overridden metadata is included from both the immediate parent container of the specified object and from the parent of each higher-level container.

The canonical format shall have the following characteristics:

   recursive JSON for the data object, consistent with the rest of CDMI;

   user and data system metadata for each data object/container;

   data stream contents for each data object and queue;

   binary data represented using escaped JSON strings; and

   typing of data values consistent with CDMI JSON representations.

15.3.2   Example JSON Canonical Serialized Format

EXAMPLE   In this example, a data object and a queue in a container have been selected for serialization:

{

   "objectType": "application/cdmi-container",

   "objectID": "00007E7F00102E230ED82694DAA975D2",

   "objectName": "MyContainer/",

   "parentURI": "/",

   "parentID": "00007E7F0010128E42D87EE34F5A6560",

   "domainURI": "/cdmi_domains/MyDomain/",

   "capabilitiesURI": "/cdmi_capabilities/container/",

   "completionStatus": "Complete",

   "metadata": {},

"exports" : {

       "OCCI/iSCSI": {

       "identifier": "00007E7F00104BE66AB53A9572F9F51E",

       "permissions": [

           "http://example.com/compute/0/",

           "http://example.com/compute/1/"

       ]

   },        "Network/NFSv4" : {

           "identifier" : "/users",

            "permissions" : "domain"

        }

   },

 

"childrenrange" : "0-1",

   "children" : [

       {

           "objectType" : "application/cdmi-object",

           "objectID" : "0000706D0010B84FAD185C425D8B537E",

           "objectName" : "MyDataObject.txt",

           "parentURI" : "/MyContainer/",

            "parentID" : "00007E7F00102E230ED82694DAA975D2",

           "domainURI" : "/cdmi_domains/MyDomain/",

           "capabilitiesURI" : "/cdmi_capabilities/dataobject/",

           "completionStatus" : "Complete",

           "mimetype" : "text/plain",

           "metadata" : {

                

           },

           "valuerange" : "0-36",

           "valuetransferencoding": "utf-8",

            "value" : "This is the Value of this Data Object"

       },

{

           "objectType" : "application/cdmi-queue",

           "objectID" : "00007E7F00104BE66AB53A9572F9F51E",

           "objectName" : "MyQueue",

           "parentURI" : "/MyContainer/",

            "parentID" : "00007E7F00102E230ED82694DAA975D2",

           "domainURI" : "/cdmi_domains/MyDomain/",

           "capabilitiesURI" : "/cdmi_capabilities/queue/",

           "completionStatus" : "Complete",

           "metadata" : {

                

           },

 

           "queueValues" : "0-1",

           "mimetype": [

              "text/plain",

              "text/plain"

          ],

          "valuetransferencoding": [

              "utf-8",

              "utf-8"

          ],"valuerange" : [

               "0-2",

                "0-3"

           ],

           "value" : [

               "red",

                "blue"

            ]

        }

    ]

}

To allow efficient deserialization in stream mode when serializing containers to JSON, the children array should be the last item in the canonical serialized JSON format.