Supporting Autonomic Management of Clouds: Service Clustering With Random Forest

A promising solution for the management of services in clouds, as fostered by autonomic computing, is to resort to self-management. However, the obfuscation of underlying details of services in cloud computing, also due to privacy requirements, affects the effectiveness of autonomic managers. Data-driven approaches, in particular those relying on service clustering based on machine learning techniques, can assist the autonomic management and support decisions concerning, e.g., the scheduling and deployment of services. Unfortunately, applying such approaches is further complicated by the coexistence of different types of data within the information provided by the monitoring of cloud systems: both continuous (e.g., CPU load) and categorical (e.g., VM instance type) data are available. Current approaches deal with this problem in a heuristic fashion. In this paper, instead, we propose an approach that uses all types of data, and learns in a data-driven fashion the similarities and patterns among the services. More specifically, we design an unsupervised formulation of random forest to calculate service similarities and provide them as input to a clustering algorithm. For the sake of efficiency and to meet the dynamism requirement of autonomic clouds, our methodology consists of two steps: 1) off-line clustering and 2) on-line prediction. Using datasets from real-world clouds, we demonstrate the superiority of our solution with respect to others and validate the accuracy of the on-line prediction. Moreover, to show applicability of our approach, we devise a service scheduler that uses similarity among services, and evaluate its performance in a cloud test-bed using realistic data.

Uriarte, Rafael Brundo;TIEZZI, Francesco;Tsaftaris, Sotirios A.

2016-01-01

Publication year: 2016
Journal: IEEE Transactions on Network and Service Management
DOI: https://dx.doi.org/10.1109/TNSM.2016.2569000
loginMiur type ID: 262
Appears in categories: 1.1 Article

Files in this item:

File	Size	Format
tiezzi.pdf (open access; type: post-print document; license: not specified)	1.6 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11581/400045
