We describe a decentralized and online mechanism for the proactive self-monitoring of distributed systems. This research targets networked systems in which individual nodes can monitor their operational status and represent it using sets of globally known attributes. Further, nodes periodically publish this status as semantic events that contain a sequence of attribute-value pairs.
Each of these events is represented as a point in a multidimensional space, as shown in Fig. 2. We refer to this space as an information space. Each of its dimensions corresponds to one of the event attributes, and the location of a point within the space is determined by the values of the attributes in the corresponding event. A distance function can be defined for each attribute, and by extension for the multidimensional information space, in order to measure the similarity between elements (i.e., similarity is inversely related to distance). Conceptually, a cluster is a set of points whose mutual distances are smaller than their distances to other points in the space. Our approach to cluster detection, however, is based on evaluating the relative density of points within the space (in this case, point similarity is directly related to point density).
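To make the mapping from events to points concrete, the following minimal sketch computes a distance between two events. It is an illustration under stated assumptions rather than the mechanism described here: the attribute names (`cpu_load`, `memory_used`), their value ranges, the range-based normalization, and the Euclidean combination of per-attribute distances are all hypothetical choices.

```python
import math

# Hypothetical attributes and value ranges (assumptions for illustration only).
# Each attribute is one dimension of the information space.
ATTRIBUTE_RANGES = {
    "cpu_load": (0.0, 100.0),    # percent
    "memory_used": (0.0, 64.0),  # gigabytes
}

def attribute_distance(name, a, b):
    """Per-attribute distance, normalized by the attribute's range to [0, 1]."""
    low, high = ATTRIBUTE_RANGES[name]
    return abs(a - b) / (high - low)

def event_distance(event_a, event_b):
    """Combine per-attribute distances into one distance over the whole space.

    A Euclidean combination is assumed here; other combinations are possible.
    """
    return math.sqrt(
        sum(
            attribute_distance(name, event_a[name], event_b[name]) ** 2
            for name in ATTRIBUTE_RANGES
        )
    )

# Two events, each a set of attribute-value pairs published by a node.
event_1 = {"cpu_load": 20.0, "memory_used": 8.0}
event_2 = {"cpu_load": 25.0, "memory_used": 10.0}
print(event_distance(event_1, event_2))
```

Normalizing each attribute by its range keeps one dimension from dominating the combined distance simply because it is measured in larger units.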
To evaluate point density, the information space is divided into regions, and the number of points within each region is observed by an individual processing node. If the total number of points in the information space is known, then a baseline density for a uniform distribution of points can be calculated and used to estimate the expected number of points per region (Fig. 3a). A cluster is recognized within a region if the region's point count is larger than this expected value, as in Fig. 3b. Conversely, if the point count is smaller than expected, then the points in that region may be anomalies. However, clusters may cross region boundaries, as shown in the lower left quadrant of Fig. 3b, and this must be taken into account when considering potential anomalies.
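The density test can be sketched as follows. This is a simplified, centralized illustration, not the decentralized mechanism itself, in which each region is observed by its own processing node. It assumes points normalized to the unit hypercube, a uniform grid of equally sized regions, and a hypothetical threshold `factor` that decides how far a count must deviate from the baseline. It does not handle clusters that cross region boundaries.

```python
from collections import Counter

def region_of(point, cells_per_dim):
    """Map a point in [0, 1]^d to the index of the grid region that contains it."""
    return tuple(min(int(x * cells_per_dim), cells_per_dim - 1) for x in point)

def classify_regions(points, cells_per_dim, dims, factor=2.0):
    """Compare each occupied region's point count with the uniform baseline.

    `factor` is an assumed threshold: a region is labeled a cluster when its
    count is at least factor * expected, and a possible anomaly when its
    count is at most expected / factor.
    """
    counts = Counter(region_of(p, cells_per_dim) for p in points)
    # Baseline: total points spread uniformly over all regions.
    expected = len(points) / (cells_per_dim ** dims)
    labels = {}
    for region, count in counts.items():
        if count >= factor * expected:
            labels[region] = "cluster"
        elif count <= expected / factor:
            labels[region] = "possible anomaly"
        else:
            labels[region] = "baseline"
    return expected, labels

# Seven nearby points and one isolated point in a 2 x 2 grid.
sample_points = [
    (0.10, 0.10), (0.15, 0.12), (0.20, 0.20), (0.12, 0.18),
    (0.30, 0.10), (0.40, 0.40), (0.10, 0.30), (0.90, 0.90),
]
expected_count, region_labels = classify_regions(sample_points, cells_per_dim=2, dims=2)
print(expected_count, region_labels)
```

Because a sparse region next to a dense one may simply hold the edge of a cluster, a fuller version would consult neighboring regions before reporting its points as anomalies.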