Cost-Efficient Big Intermediate Data Placement in a Collaborative Cloud Storage Environment

Conference: Paper with proceedings at an international conference

A collaborative cloud storage environment, which shares the resources of multiple geographically distributed datacenters owned by different providers, enables scientific workflows in different locations to process large-scale intermediate data over the Internet. The distributed datacenters are federated, and each member can collaborate with the others to efficiently share and process the intermediate data produced by distributed workflow instances. This paper focuses on minimizing the storage cost of intermediate data placement in federated cloud datacenters. Using collaboration and federation mechanisms, we propose an exact federated data placement algorithm based on an integer linear programming (ILP) model to help multiple datacenters host the intermediate data files generated by a scientific workflow. Subject to the constraints of the problem, the proposed algorithm finds an optimal intermediate data placement that reduces cost across the federated cloud datacenters, taking into account scientific user requirements, data dependencies, and data size. Experimental results show the cost-efficiency of the proposed cloud storage federation algorithm.
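
To make the idea of exact, cost-minimizing placement concrete, the following is a minimal illustrative sketch and not the paper's ILP model. It assumes a simplified problem in which each intermediate file is stored in exactly one datacenter, each datacenter has a per-GB storage price and a capacity limit, and data dependencies and user requirements are ignored. The function name `optimal_placement` and all numbers are hypothetical. Exhaustive enumeration stands in for an ILP solver here; it yields the same optimum on tiny instances but does not scale.

```python
from itertools import product


def optimal_placement(sizes, prices, capacities):
    """Exhaustively find the cheapest feasible file-to-datacenter assignment.

    sizes[i]      -- size of intermediate file i in GB (hypothetical values)
    prices[j]     -- storage price of datacenter j in cents per GB (hypothetical)
    capacities[j] -- free capacity of datacenter j in GB (hypothetical)

    Returns (cost, placement), where placement[i] is the index of the
    datacenter hosting file i, or (None, None) if nothing fits.
    """
    n_dc = len(prices)
    best_cost, best_placement = None, None
    # Try every assignment of files to datacenters (n_dc ** n_files candidates).
    for placement in product(range(n_dc), repeat=len(sizes)):
        used = [0] * n_dc
        for i, j in enumerate(placement):
            used[j] += sizes[i]
        # Capacity constraint: skip assignments that overfill any datacenter.
        if any(used[j] > capacities[j] for j in range(n_dc)):
            continue
        # Objective: total storage cost of this assignment.
        cost = sum(sizes[i] * prices[j] for i, j in enumerate(placement))
        if best_cost is None or cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement


# Toy instance: three files, two datacenters; the cheaper one holds only 50 GB.
cost, placement = optimal_placement(
    sizes=[40, 25, 10], prices=[2, 3], capacities=[50, 100]
)
print(cost, placement)
```

In this toy instance the cheaper datacenter cannot hold all three files, so the search has to decide which file to move to the more expensive one. An ILP formulation expresses the same trade-off with binary assignment variables and linear capacity constraints.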