

Using Interaction Clustering to Refine Project Scope

Cluster analysis is a technique for grouping things based on some set of common characteristics.

In analysis, clustering is used to group elementary processes based on their use of data. The main technique is clustering based on expected effects, using CRUD (Create, Read, Update, and Delete) values for the interactions between processes and entity types.

After clustering, each cluster represents a set of data and activities that should be handled by one integrated system because it includes all the processes that create or modify a set of tightly related data.
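As an illustration only, the following Python sketch groups elementary processes that create, update, or delete the same entity types. It is not the algorithm CA Gen uses. The process names, entity type names, and the rule "link a process to every entity type it modifies, then take connected groups" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical expected effects: process -> {entity type: CRUD letters}.
INTERACTIONS = {
    "Take Order": {"Order": "C", "Order Line": "C", "Customer": "R", "Product": "R"},
    "Cancel Order": {"Order": "U", "Order Line": "D"},
    "Register Customer": {"Customer": "C"},
    "Change Customer Address": {"Customer": "U"},
    "Add Product": {"Product": "C"},
    "Reprice Product": {"Product": "U"},
}

MODIFYING = set("CUD")  # Reads alone do not pull a process into a cluster.


def cluster_by_modified_data(interactions):
    """Group processes and entity types linked by create/update/delete effects."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # Path halving.
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    for process, effects in interactions.items():
        find(("process", process))  # Register read-only processes too.
        for entity, crud in effects.items():
            if MODIFYING & set(crud):
                union(("process", process), ("entity", entity))

    clusters = defaultdict(lambda: {"processes": set(), "entities": set()})
    for kind, name in list(parent):
        key = "processes" if kind == "process" else "entities"
        clusters[find((kind, name))][key].add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c["processes"]))


clusters = cluster_by_modified_data(INTERACTIONS)
for cluster in clusters:
    print(sorted(cluster["processes"]), "maintain", sorted(cluster["entities"]))
```

With this sample data, the order-related processes fall into one cluster with Order and Order Line, while the customer and product processes form their own clusters. Take Order only reads Customer and Product, so under the assumed rule those reads do not merge the clusters.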

This technique is particularly suited to developing a set of procedures that together can maintain closely related data as a service to a wide range of applications that can then exploit that data.

Confirming and refining the scope of the business systems to be implemented can be critical in a large development project that implements many systems or that must progressively implement support for groups of processes. Cluster analysis helps ensure that each group of elementary processes has the entity types it needs available when the group is implemented.
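The availability check described above can be sketched as follows. This is a minimal illustration, not a CA Gen feature. It assumes that an entity type counts as "available" once some process in the current or an earlier implementation stage creates it. The function name, the stage structure, and the sample data are hypothetical.

```python
def missing_entity_types(stages):
    """For each stage, list entity types its processes use but nothing has created yet.

    stages: ordered list of {process: {entity type: CRUD letters}} dictionaries,
    one dictionary per implementation stage (hypothetical structure).
    """
    available = set()
    report = []
    for number, stage in enumerate(stages, start=1):
        # Entity types created in this stage are available to it and to later stages.
        for effects in stage.values():
            available.update(e for e, crud in effects.items() if "C" in crud)
        needed = {entity for effects in stage.values() for entity in effects}
        report.append((number, sorted(needed - available)))
    return report


# Hypothetical plan: order taking is implemented before customer registration.
plan = [
    {"Take Order": {"Order": "C", "Customer": "R"}},
    {"Register Customer": {"Customer": "C"}},
]
report = missing_entity_types(plan)
print(report)
```

In this sample plan, stage 1 reads Customer before any implemented process creates it, which is the kind of gap that a revised grouping or implementation order would need to close.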

If you follow the principles of parallel decomposition of data and activities, you should find a close correspondence between the grouping of entity types into subject areas and the grouping of elementary processes into higher-level activities (see the chapter "Building the Analysis Model"). In that case, cluster analysis may not be necessary, because the groupings already provide the scope that the development project requires. If you need to explore boundary issues, such as where to place a particular entity type or elementary process, you can still perform cluster analysis as a confirmation technique.

CA Gen supports techniques for scoping in analysis based on grouping entity types and activities into clusters of tightly interacting model objects in a matrix. The result of this grouping is represented in the clustered Entity Type/Elementary Process Matrix.

CA Gen supports both recording and clustering this matrix. The cells of the matrix are populated with the interactions between activities and data. These values are either derived from the expected effects defined for elementary processes or entered directly into the matrix. CA Gen produces the best clustering it can from the interactions currently recorded.
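To make the two sources of cell values concrete, here is a small Python sketch of a matrix populated from expected effects and from direct entries. It does not reproduce CA Gen's storage or behavior. The function names and sample data are hypothetical, and the rule that a directly entered cell overrides a derived one is an assumption made for the example.

```python
def build_matrix(expected_effects, direct_entries=None):
    """Return {(entity type, process): CRUD letters} for the matrix cells."""
    matrix = {}
    for process, effects in expected_effects.items():
        for entity, crud in effects.items():
            matrix[(entity, process)] = crud
    # Assumption for this sketch: directly entered cells override derived ones.
    matrix.update(direct_entries or {})
    return matrix


def render(matrix):
    """Format the matrix as text, entity types as rows and processes as columns."""
    entities = sorted({entity for entity, _ in matrix})
    processes = sorted({process for _, process in matrix})
    width = max(len(entity) for entity in entities)
    lines = [" " * width + " | " + " | ".join(processes)]
    for entity in entities:
        cells = [matrix.get((entity, p), "").ljust(len(p)) for p in processes]
        lines.append(entity.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


# Hypothetical data: one process with defined expected effects, one direct entry.
matrix = build_matrix(
    {"Take Order": {"Order": "C", "Customer": "R"}},
    {("Customer", "Register Customer"): "C"},
)
print(render(matrix))
```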

You often have additional background information that the recorded interactions do not capture. Repeated cycles of validation and correction, using parameters that you select, let you explore alternatives and identify and correct errors, both in the definition of interactions and in the handling of the Entity/Activity Matrix.

The final step requires human judgment to tidy up and validate the clustered result. The technique therefore ends with the manual definition of business areas and business systems, after the initial automatic clustering.