A novel approach for large-scale environmental data partitioning on cloud and on-premises storage for compute continuum applications

Gennaro Mellone, Ciro Giuseppe De Vita, Dante D. Sanchez-Gallegos, Genaro Sanchez-Gallegos, Catherin A. Torres-Charles, Javier Garcia-Blas, Jesús Carretero Pérez, J. L. Gonzalez-Compean and Giuliano Laccetti.

 

Abstract

Cloud-based services have proved useful in several research fields, such as engineering, health science, and astrophysics, to mention a few examples. The computational environmental science community has developed a strong need for cloud facilities to store, process, and manage data from observations and from numerical models used for simulations and forecasts. Weather forecast models and global sensor networks deal with multidimensional geo-referenced data sets. However, environmental data consumer applications usually require only a relatively small slice of the multidimensional input data to analyze a specific area or time interval. Hence, reducing the data dimension for information retrieval is mandatory. This paper presents a twofold solution: a technique to load the sliced multidimensional data set onto, and to retrieve it from, different cloud services such as Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure. Experiments performed on these cloud services show that the proposed method can significantly speed up loading and retrieving the data slices compared to working with the entire data set in bulk or through an OPeNDAP server.
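The idea of serving only the slice a consumer needs can be illustrated with a small sketch. This is not the method evaluated in the paper: it is a toy, assuming a regular (time, lat, lon) grid held as nested lists, a fixed tile shape, and a plain dictionary standing in for a cloud object store. The names `partition`, `tiles_for`, `retrieve`, and `TILE` are hypothetical.

```python
from itertools import product


def partition(grid, tile):
    """Split a 3-D nested list (time, lat, lon) into tiles keyed by tile index."""
    shape = (len(grid), len(grid[0]), len(grid[0][0]))
    counts = [-(-n // s) for n, s in zip(shape, tile)]  # ceiling division
    store = {}  # assumption: a dict stands in for an object store (key -> blob)
    for key in product(*(range(c) for c in counts)):
        t0, y0, x0 = (k * s for k, s in zip(key, tile))
        store[key] = [
            [row[x0:x0 + tile[2]] for row in plane[y0:y0 + tile[1]]]
            for plane in grid[t0:t0 + tile[0]]
        ]
    return store


def tiles_for(window, tile):
    """Keys of the tiles overlapping a half-open window ((t0,t1),(y0,y1),(x0,x1))."""
    ranges = [range(lo // s, (hi - 1) // s + 1) for (lo, hi), s in zip(window, tile)]
    return list(product(*ranges))


def retrieve(store, window, tile):
    """Rebuild the requested window by reading only the overlapping tiles."""
    (t0, t1), (y0, y1), (x0, x1) = window
    out = [[[None] * (x1 - x0) for _ in range(y1 - y0)] for _ in range(t1 - t0)]
    for key in tiles_for(window, tile):
        bt, by, bx = (k * s for k, s in zip(key, tile))  # tile origin in the grid
        for i, plane in enumerate(store[key]):
            for j, row in enumerate(plane):
                for k, value in enumerate(row):
                    t, y, x = bt + i, by + j, bx + k
                    if t0 <= t < t1 and y0 <= y < y1 and x0 <= x < x1:
                        out[t - t0][y - y0][x - x0] = value
    return out


# Toy data: each value encodes its own (t, y, x) position.
grid = [[[t * 100 + y * 10 + x for x in range(8)] for y in range(6)] for t in range(4)]
TILE = (2, 3, 4)
store = partition(grid, TILE)

# One time step over a small area: only 4 of the 8 tiles are read.
window = ((1, 2), (2, 4), (3, 6))
subset = retrieve(store, window, TILE)
```

In this toy setup the request touches half of the stored tiles instead of the whole grid; with a real object store, each tile would be a separate object fetched on demand.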

 

https://doi.org/10.1002/cpe.7893

Presentation order (text): 2023, 08
Cinvestav © 2024