Olusola:

BACKGROUND TO THE STUDY

Washizaki et al.[1] defined a software component as a unit of composition with contractually specified interfaces and explicit context dependencies only. They added that a software component can be deployed independently and is subject to composition by third parties. A component can also be viewed as a coherent package of software that can be independently developed and delivered as a unit and that offers interfaces by which it can be connected, unchanged, with other components to compose a larger system.[2] A software component is a reusable piece of code or software in binary form, which can be plugged into components from other vendors with relatively little effort. Components are black-box entities that encapsulate services behind well-defined interfaces. These interfaces tend to be very restricted in nature, reflecting a particular model of plug compatibility supported by a component framework, rather than being very rich and reflecting real-world entities of the application domain.[3]

Component-based software development (CBSD) is a development approach in which systems are built from well-defined, independently produced pieces by combining the pieces with self-made components.[4] CBSD is a paradigm that aims at constructing and designing systems using a predefined set of software components explicitly created for reuse [Figure 1]. Component-based systems achieve flexibility by clearly separating the stable parts of the system (that is, the components) from the specification of their composition.[3]

Figure 1: Schematic view of component-based software development approach (Kaur and Singh, 2013)

Reusability is the degree to which a software component can be reused.[1,5] Reuse reduces software development cost and time because it involves less writing and more assembly. Reusability plays an important role in CBSD and also acts as the basic criterion for evaluating components [Figure 2]. Kumar et al.[6] asserted that the reusability of a component is an important aspect that informs the decision to reuse an existing component, thereby reducing the risk, cost, and time of software development. If a component is not reusable, then the whole concept of CBSD fails.[7] Reusability is one of the quality attributes of CBSD. It measures the degree to which features/components are reused in building similar or different new software with minimal change.[8] To realize the reuse of components effectively, reusability estimation has to be carried out. For a systematic reuse process, the use of metrics is essential. Without metrics, evaluating the quality and qualification of the components selected for reuse becomes an uphill task.[8]

Goel and Sharma[9] defined reusability as the quality of any software component to be used again with slight or no modification. Software reuse is the process of creating software systems from existing software assets rather than building them from scratch. Reusability was also viewed as the quality factor of software that qualifies it to be used again in another application, be it partially modified or completely modified. In other words, software reusability is a measure of the ease with which previously acquired concepts and objects can be used in new contexts. Kumar et al.[10] likewise saw reusability as an important aspect that informs the decision to reuse an existing component. Singh and Tomar[8] described a reusable component as a physical, replaceable part of a system that adds functionality to the system through the realization of a set of interfaces. Components that have well-defined interfaces can be considered good for reuse, and interfaces are highly significant for the reusability of components [Figure 3].

Figure 2: Component reusability tree – based on the study’s identified metrics, where: COCU: Component customizability, COIC: Component interface complexity, CORE: Completeness of component return, COUS: Component understandability, COST: Component stability

Figure 3: Study methodology

Metrics play an indispensable role in the successful evaluation of software component reusability. According to Washizaki et al.,[1] it is necessary to measure the reusability of software components to realize the effective reuse of such components. According to the authors, metrics are used to determine the quality factors that affect reusability. A component on its own has certain characteristics that tend to affect its reusability. Quality factors are chosen to provide an analysis of the reusability of a component, and the choice of factors is based on the activities carried out while reusing the components.

Earlier researchers employed statistical methods to predict reusability,[1,11] but more recent interdisciplinary techniques such as fuzzy logic, artificial neural network (ANN), and neuro-fuzzy have taken the lead owing to their predictive power.[6,9,12,13] This work investigates the works of Kumar et al.,[6] Sharma et al.,[12] Sagar et al.,[13] and Goel and Sharma,[9] who all adopted soft computing approaches to predict the reusability of software components, but with varying degrees of accuracy. The problem of identifying the method that yields the best accuracy, and the need to establish stability (in the context of volatility) as a factor for determining component reusability, motivated this study. This research presents a genetic-fuzzy system (GFS) with stability in the context of volatility. The result is compared with that obtained using the adaptive neuro-fuzzy inference system (ANFIS) method, since research has shown that ANFIS predicts more accurately than ANN and a fuzzy inference system (FIS).[6,9]

RELATED WORK

Researchers have adopted the use of statistical approaches like correlation analysis, while some made use of soft computing techniques such as ANN and fuzzy logic to evaluate component reusability.

Washizaki et al.[1] applied a statistical method to the assessment of component reusability and developed a metric suite for measuring the reusability of software components. In implementing the work, a component overall reusability model was developed to assess and evaluate Java web components. The study proposed three quality factors as criteria for measuring reusability and deployed five metrics for the measurement. The factors are understandability, adaptability, and portability. The metrics are Existence of Meta-Information and Rate of Component Observability (for measuring understandability), Rate of Component Customizability (for measuring adaptability), and Self-Completeness of Component’s Return Value and Self-Completeness of Component’s Parameter (for measuring portability). The analysis, conducted using 125 Java web components from www.jars.com, showed that the proposed metrics were suitable. However, the empirical study was limited to JavaBeans components; other component technologies such as .NET and ActiveX were not explored for further validation.

Rotaru and Dobre[14] addressed reusability from the perspective of adaptability, composability, and complexity metrics. The work aimed to cover the main aspects of reusable software components, which in their opinion are composability and adaptability. Both factors were evaluated based on the complexity of the component interface. The major contribution of the work, which adopted a qualitative approach, was the formulation of metrics and the design of a mathematical model for practical assessment of the specified software component characteristics. The proposed model, however, still needs to be validated by applying it to several software components.

Sharma et al.[12] contributed substantially to work on software component reusability by proposing an ANN-based soft computing approach to assess the reusability of software components. The work aimed at helping developers to select the best component in terms of its reusability. Four factors on which the reusability of components depends were identified: customizability, interface complexity, portability, and understandability. The empirical work was carried out with 40 components collected from www.jars.com and www.elegantjbeans.com. The network was trained on the training data with different numbers of hidden neurons for two training functions, trainlm and trainbr, to get the best results, and was then validated on test data. The adaptation learning function selected for the experiment was “learngdm.” The performance function was root-mean-square error (RMSE), with “tan-sigmoid” as the transfer function in both layers. The results showed that the network was able to predict the reusability of components, with its best result, an RMSE of 0.1348, obtained using trainlm as the training function. The limitation of the work was the small amount of data used to train the network; the authors submitted that using more components may produce better accuracy in training and testing.
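Because RMSE is the yardstick used to compare every approach discussed in this paper, a minimal sketch of how it is computed may help. The function name and the sample scores below are illustrative assumptions and are not data from the cited study.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reusability scores in [0, 1]; not taken from any study.
predicted_scores = [0.60, 0.80, 0.40]
actual_scores = [0.50, 0.90, 0.40]
print(round(rmse(predicted_scores, actual_scores), 4))
```

A lower value means the predicted reusability scores lie closer to the reference scores, which is why the studies reviewed here report it as their accuracy measure.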

Sagar et al.[13] discussed reusability in relation to component-based development (CBD) and proposed a reusability metric for black-box components. The study identified the factors affecting reusability as customizability, interface complexity, portability, and document quality. A fuzzy logic-based approach with triangular membership functions was used to estimate the reusability of components. The authors used two classroom-based JavaBeans components, namely Calculator and Chart B, for validation. Reusability values of 0.71 and 0.3124 were obtained, which the authors took as evidence that an FIS is able to predict the reusability of components with an acceptable level of accuracy. It was further submitted that the adopted approach can be validated against other approaches for estimating the reusability of components.
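For readers unfamiliar with triangular membership functions, the following is a minimal sketch of the standard definition. The breakpoints used for the "medium" term are assumptions chosen for illustration and are not the values used by Sagar et al.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising linearly to 1 at the peak b."""
    if not a < b < c:
        raise ValueError("require a < b < c")
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical "medium" term for a factor normalized to [0, 1].
for value in (0.25, 0.5, 0.75):
    print(value, triangular(value, 0.0, 0.5, 1.0))
```

Each input factor would typically carry several such terms (for example low, medium, and high), and the degree of membership in each term drives the fuzzy rules.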

Singh and Toora[15] applied a neuro-fuzzy technique to a case study, taken from a reputed journal, that was concerned with the reusability of software components. The reusability attributes were coupling, complexity, volume regularity, and reuse frequency. They showed that the neuro-fuzzy model yields a lower average percentage error than standalone fuzzy logic and neural network models, and therefore greater accuracy for software reusability than FIS and ANN.

Kamalraj et al.[16] proposed a “stability-based clustering” method that combines the “stability” metric of components with the “clustering analysis” technique of data mining. The aim was a data mining technique that helps to maintain a reuse repository of quality reusable components. Data mining was used to analyze bulk data and extract knowledge from it; applying it to software engineering simplifies data handling and reduces effort and cost. Clustering analysis, in particular, was presented as an effective way to group elements by a required data item. Stability is an essential factor for representing the kind of dependency among components and the communication among components and their interior elements. Hence, a “stability-based reuse component repository” can support overall system development with higher productivity in a shorter period. In that research, however, stability was introduced only to track the type of dependency among components and the communication among them and their interior elements; it was not used to determine the level of reusability of components.

Jatain and Gaur[17] emphasized the estimation of component reusability by identifying quality attributes of components that influence reusability. The five identified factors are customizability, configurability, interface complexity, portability, and compatibility. Fuzzy logic was the soft computing approach adopted, and it was tested on four components and used to estimate the reusability of some real-time projects. The result showed that, to enhance the reusability of a component, its customizability, configurability, compatibility, and portability should be high, whereas its interface complexity should be low. However, the approach still requires further validation.

Ravichandran et al.[18] developed an automated, ANFIS-based process of component selection using 14 parameters of reusable components. The neuro-fuzzy approach was adopted to select optimal reusable components efficiently. The approach was validated with three data sets for three proposed software architectures, and the results showed that it was able to predict the reusability of these components with acceptable accuracy. However, stability was used as a fuzzy input with the linguistic values low, medium, and high in the ANFIS structure, without reference to the porting of components as suggested in their definition.

Christopher and Chandra[19] proposed a multicriteria fuzzy-based approach for predicting software requirement stability based on complexity point measurement. The approach derives a complexity weight from requirement complexity attributes such as functional requirement complexity, non-functional requirement complexity, input-output complexity, and interface and file complexity. The paper discussed the importance of measuring requirement changes as a way of addressing instability in the requirements, and the prediction model measures those changes using the complexity point measurement model. The work, however, neither justified nor demonstrated the applicability of the model to development, maintenance, and transition projects based on different complexity attributes and different adjustment factors.

Aversano[20] identified the subset of the architectural components of a software project that could actually be reused. The paper presented an empirical study aimed at assessing software architecture stability and its evolution along the software project history. The study entailed the gathering and analysis of relevant information from several open source projects. The stability of the core architecture during the development cycle of each project was evaluated using two metrics defined in the initial stage of the process. The analysis considered software systems developed using different paradigms, with different evolution trends, and concerning different application domains. The work was devoted to the stability of the architectural core of a software project, with the aim of understanding the potential reusability of its software components. The study handled stability measurement only at the architectural level of software, leaving out other aspects to which stability can be applied.

Kumar et al.,[6] however, adopted the multidisciplinary technique of ANFIS in the assessment of component reusability. In the study, four factors, namely customizability, interface complexity, understandability, and portability, were used to estimate the reusability of software components. The result obtained using the ANN approach with data from Sharma et al. (2009) was an RMSE of 0.1852. Applying the ANFIS approach to the same set of data yielded an RMSE of 0.1695, which shows that neuro-fuzzy gives a more accurate reusability result. The comparative analysis of the proposed ANFIS and the existing ANN was carried out on 48 Java components. It was, however, opined that the accuracy of the method depends on the availability of a substantial number of components.

Goel and Sharma[9] took into account three factors for determining the reusability of software components and proposed a model for reusability assessment using ANFIS. The quality factors used were coupling, complexity, and portability. The experiment, which used 338 records retrieved from open source, produced an RMSE of 0.042482. It was suggested that new factors such as understandability, cohesion, clarity, and generality could be added and their cumulative effect on future predictions observed. Furthermore, techniques other than ANFIS, such as support vector machines, could be used to predict reusability. Finally, it was submitted that a much better generalized approach is expected if real-time data are considered.

Singh and Tomar[8] identified four attributes for estimating reusability of black-box components. The reusability metric was parameterized using component interface complexity, component understandability, component customizability, and component reliability. The project made use of file upload component of the Apache Commons project. The work proved that the proposed metrics were able to determine reusability. It was, however, submitted that the work requires further validation, suggesting that the weight values for the estimation of reusability be adjusted using neural networks.

Ekanem and Woherem[21] presented a technique for assessing the stability of components extracted from legacy applications using the software maturity index. The practical demonstration of the approach was based on maintenance data generated with the RANDBETWEEN function of a spreadsheet package for three legacy applications. The research was designed as experimental research with the following processes: (i) review of relevant documentation, (ii) randomization of the needed research data using the RANDBETWEEN function in a spreadsheet program, (iii) data coding and analysis, and (iv) interpretation and discussion of results. The ranking scheme comprises the following ordered items: highly stable, fairly stable, stable, unstable, fairly unstable, and highly unstable. However, the stability of legacy components was measured using the maturity index with no recourse to the reusability of the components.
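The software maturity index mentioned above has a standard textbook form (from IEEE Std 982.1): the fraction of modules in the current release that were not added, changed, or deleted relative to the previous release. The sketch below assumes that standard form; whether Ekanem and Woherem used exactly this variant, and how they mapped values onto their ranking scheme, is not stated here, so no thresholds are shown. The module counts are hypothetical.

```python
def software_maturity_index(total_modules, added, changed, deleted):
    """SMI = (M_T - (F_a + F_c + F_d)) / M_T; values near 1 suggest a stable product."""
    if total_modules <= 0:
        raise ValueError("total_modules must be positive")
    if min(added, changed, deleted) < 0:
        raise ValueError("module counts cannot be negative")
    return (total_modules - (added + changed + deleted)) / total_modules


# Hypothetical release: 50 modules, of which 2 were added and 5 changed,
# with 1 module deleted from the preceding release.
print(software_maturity_index(total_modules=50, added=2, changed=5, deleted=1))
```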

The works of Kumar et al.[6] and Goel and Sharma[9] surpassed the others in accuracy as a result of the approach used (ANFIS). Goel and Sharma,[9] however, called for the validation of these results using different approaches and for experiments with components other than Java components. This work, therefore, investigates these gaps as a contribution to research on component reusability assessment.

METHODOLOGY FOR THE STUDY

This study adopts:

CBD approach: This methodology helps to build a component analysis tool for accessing common software components;
Metric-based approach: This methodology helps to measure the degree to which a component is reusable [Table 1];
Soft-computing approach: This methodology predicts the certainty of reusability.

Table 1: Metrics and the quality factors they measure

The following procedures were followed in ensuring a successful implementation of the work:

Commercial off-the-shelf software components were extracted from third-party software vendors. According to Sharma et al.,[4] the key to the success of CBSD is its ability to use software components that are often developed by and purchased from third parties.

Component Data Extraction

Sixty-nine software components were obtained from four different third-party component development organizations (www.elegantjbeans.com, www.jidesoft.com, www.math.hws.edu, and www.codeproject.com). Table 2 shows the sources, nature, and numbers of the components.
Appropriate metrics were applied for each quality factor that qualifies the characteristic, reusability. We consider the same quality factors as used by Sharma et al.[12] and Kumar et al.,[6] with stability (in the context of volatility) as an addition [Table 2].
A genetic-fuzzy soft computing approach was deployed for evaluating the level of reusability of the selected components. GFS is a system that exploits genetic algorithms to automatically generate or optimize the knowledge base of a fuzzy system, since a fuzzy system cannot learn on its own. Research has shown that hybridizing with a genetic algorithm gives more accurate predictions.[22-25]

Table 2: Components used

DESIGN

Adapting the approach of Kumar et al. (2013) and establishing the need for stability as a factor for component reusability measurement,

Let R_cn = F_cn[X_n, Y_n, Z_n, J_n, K_n] (1)

Where:

R_cn is the reusability of the component.

F_cn is implemented using genetic fuzzy with X_n, Y_n, Z_n, J_n, and K_n as input-dependent variables, representing customizability (component customizability), interface complexity (component interface complexity), portability (completeness of component return), understandability (component understandability), and stability (component stability), respectively.

In the proposed model, GFS is developed, trained, and tested using MATLAB software. The steps involved in the development of the system [Figure 4] are as follows:

Figure 4: Component reusability prediction model

Extract component data
Compute the metric value of X_n, Y_n, Z_n, J_n, and K_n
Represent the variables in Fuzzy format
Load values of X_n, Y_n, Z_n, J_n, and K_n into fuzzy toolbox
Apply the Genetic Optimizer to tune the knowledge base
Compute the fitness value until the threshold/termination is reached.
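The steps above can be illustrated with a deliberately small sketch. It is not the MATLAB system developed in this study: it assumes a toy zero-order Sugeno-style fuzzy model with two rules per factor ("high" and "low"), lets a simple genetic algorithm tune the ten rule consequents, uses RMSE as the fitness value, and terminates after a fixed number of generations. The synthetic training data, all function names, and all GA settings are hypothetical.

```python
import math
import random


def predict(consequents, factors):
    """Weighted average of rule consequents.

    Each factor x in [0, 1] fires a "high" rule with strength x and a
    "low" rule with strength 1 - x, so the strengths sum to len(factors).
    """
    n = len(factors)
    high, low = consequents[:n], consequents[n:]
    total = sum(x * h + (1 - x) * l for x, h, l in zip(factors, high, low))
    return total / n


def fitness(consequents, samples):
    """RMSE of the fuzzy model over (factors, target) pairs; lower is better."""
    errors = [(predict(consequents, f) - t) ** 2 for f, t in samples]
    return math.sqrt(sum(errors) / len(errors))


def make_samples(count=40, seed=1):
    """Synthetic data: reusability rises with every factor except the second
    (standing in for interface complexity), which lowers it."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        f = [rng.random() for _ in range(5)]
        samples.append((f, (f[0] + (1 - f[1]) + f[2] + f[3] + f[4]) / 5))
    return samples


def tune(samples, n_factors=5, pop_size=30, generations=60, seed=0):
    """Genetic optimizer: elitism, tournament selection, uniform crossover,
    and clipped Gaussian mutation. Returns the best chromosome and the
    best fitness recorded per generation."""
    rng = random.Random(seed)
    genes = 2 * n_factors
    population = [[rng.random() for _ in range(genes)] for _ in range(pop_size)]
    history = []

    def pick_parent():
        return min(rng.sample(population, 3), key=lambda c: fitness(c, samples))

    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, samples))
        history.append(fitness(population[0], samples))
        next_generation = population[:2]  # keep the two best unchanged
        while len(next_generation) < pop_size:
            first, second = pick_parent(), pick_parent()
            child = [a if rng.random() < 0.5 else b for a, b in zip(first, second)]
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.05))) if rng.random() < 0.2 else g
                for g in child
            ]
            next_generation.append(child)
        population = next_generation

    best = min(population, key=lambda c: fitness(c, samples))
    history.append(fitness(best, samples))
    return best, history


training = make_samples()
best_rules, rmse_history = tune(training)
print(f"RMSE first generation: {rmse_history[0]:.4f}, final: {rmse_history[-1]:.4f}")
```

Because the two best chromosomes are carried over unchanged, the recorded RMSE can never get worse from one generation to the next. A production system would also tune membership function shapes and the rule base, which this sketch leaves fixed.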

The detailed model is presented in Figure 5

Figure 5: Detailed genetic-fuzzy model for component reusability prediction

ORGANIZATIONAL STRUCTURE

The operational structure of the GFS for component reusability prediction was constructed using UML (use case, sequence, and activity diagrams) to describe the logical, implementation-independent design of the system. This shows the system’s components and their relationships, from the user’s inputs to the processing of the tasks.

Use Case

Figure 6 shows the major users of the system, namely the software developers, the component developers, the component library administrator, and the component users. They all have possible access to six major operations, which are LOGIN, POPULATE DATA, RUN REUSABILITY TEST, GUIDE, FEEDBACK, and EXIT.

Figure 6: Use case diagram of the proposed system

Sequence Diagram

Figure 7 shows the sequence diagram of the proposed system.

Figure 7: Sequence diagram of the proposed system

Activity Diagram

This is used to model the procedural flow of actions/events/activities that occur in a system. It describes the use case and the sequence models. Figure 8 shows the system’s activity diagram.

Figure 8: Activity diagram depicting the proposed system

Keys: CDC: Check data compatibility, DNC: Data not compatible, DC: Data compatible, R: Report on DNC, CA: Check availability, A: Available, NA: Not available, CSS: Check submission status, SS: Submission successful, SF: Submission failed, I: Iterate? Y: Yes, N: No

SYSTEM IMPLEMENTATION

System implementation refers to the system life cycle phase in which the constructed system is tested and put into operation. It is the actualization of a specified, designed, and modeled system.

Implementation Approach

Figure 9 is the adapted agile (feature driven) development model.

Figure 9: Adapted FDD model

Implementation Flow

Figure 10 shows the flow diagram of the system implementation pattern (adaptive neuro-fuzzy inference system and genetic-fuzzy system).

Figure 10: Implementation flow

Algorithm (ANFIS)

Select Loader
  If loader = ANFIS, load cipus-run.m
    browse to retrieve training data
    load training data
    if fileext = ‘*.csv’, ‘load successful’
    else ‘load unsuccessful’, reload
    endif
    browse to retrieve testing data
    load testing data
    if fileext = ‘*.csv’, ‘load successful’
    else ‘load unsuccessful’, reload
    endif
End Select
RUN Reusability RMSE
VIEW Reusability RMSE

Algorithm (GFS)

Select Loader
  If loader = GFS, load myga.m
    load fuzzy-excel formatted file (loaddata.m)
    if fileext = ‘*.csv’, ‘load successful’
    else ‘load unsuccessful’, reload
    endif
    load/call/invoke ga_fitfunc.m
    if load_status = ‘correct’, proceed
    else re-load/re-call/re-invoke fitness function
    endif
End Select
RUN Reusability RMSE
VIEW Reusability RMSE
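The loader selection and file-extension check shared by both algorithms can be sketched in Python as follows. This is an illustrative translation and not the MATLAB implementation; only the script names cipus-run.m and myga.m come from the algorithms above, while the function names, the error message wording, and the sample row are assumptions.

```python
import csv
import tempfile
from pathlib import Path

# Script names as given in the two algorithms.
SCRIPTS = {"ANFIS": "cipus-run.m", "GFS": "myga.m"}


def select_script(loader):
    """Map the chosen loader to the script it should run."""
    if loader not in SCRIPTS:
        raise ValueError(f"unknown loader: {loader}")
    return SCRIPTS[loader]


def load_csv_rows(path):
    """Accept only .csv files and return their rows as lists of floats."""
    path = Path(path)
    if path.suffix.lower() != ".csv":
        raise ValueError(f"load unsuccessful: {path.name} is not a .csv file")
    with path.open(newline="") as handle:
        return [[float(cell) for cell in row] for row in csv.reader(handle) if row]


# Hypothetical row: five factor values followed by a reusability score.
with tempfile.TemporaryDirectory() as folder:
    training_file = Path(folder) / "training.csv"
    training_file.write_text("0.7,0.2,0.9,0.8,0.6,0.75\n")
    rows = load_csv_rows(training_file)

print(select_script("GFS"), rows)
```

Raising an error on a wrong extension corresponds to the "load unsuccessful, reload" branch; a caller would catch it and prompt for another file.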

EXPERIMENTAL EVALUATION

The FIS Properties

Table 3 presents the details/structure of the FIS design properties.

Table 3: FIS structure/properties

The ANFIS Evaluation Parameters

Table 4 shows the specifications of the ANFIS evaluation parameters.

Table 4: Adaptive neuro-fuzzy inference system specifications

The GA Optimization Parameters and Algorithm

Table 5 shows the specifications of the parameters used for the GA.

Table 5: GA specifications

Statistical Representation and Comparative Analysis

Table 6 shows the RMSE values of the two approaches (ANFIS and GFS) for the selected components.

Table 6: Components’ RMSE values for ANFIS and GFS

Figure 11 presents the comparative chart of the RMSE values for ANFIS and GFS. GFS had the lower RMSE (0.0019), implying that it is the better predictor.

Figure 11: Adaptive neuro-fuzzy inference system and genetic-fuzzy system root-mean-square error

Table 7 shows the aggregate values of Table 6 for the three component types selected and for the five quality factors in use.

Table 7: Computed aggregate values of component types

The aggregate values were analyzed in SPSS using analysis of variance (ANOVA); the result is shown in Table 8.

From Table 8, Java components proved more reusable, as they recorded the least standard error (0.07935) compared with 0.26680 for .NET components and 0.30975 for web components. Figure 12 shows the reusability prediction level of the various software components used.

Table 8: ANOVA analysis of component types’ aggregated values

Figure 12: Components’ reusability prediction level
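The comparison above rests on standard error values. The study does not spell out how they were computed; the sketch below assumes the usual standard error of the mean (sample standard deviation divided by the square root of the group size), which is what descriptive output accompanying an ANOVA normally reports. The function name and the sample values are hypothetical.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical aggregate values for one component type.
print(round(standard_error([0.2, 0.4, 0.6, 0.8]), 4))
```

Under this assumption, a smaller standard error means that the aggregate values of a component type are less dispersed around their mean.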

FINDINGS

The following are the findings from the study:

GFS, with an RMSE of 0.0019, provides better reusability prediction accuracy than ANFIS, with an RMSE of 0.1480.
The experiments showed that Java components, with an S.E. of 0.07935, proved more reusable than web components (S.E. of 0.30975) and .NET components (S.E. of 0.26680).
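As a worked check on the first finding, the relative reduction in RMSE implied by the two reported values can be computed directly. The symbol \Delta_{\mathrm{rel}} is introduced here only for illustration and does not appear elsewhere in the study.

```latex
\[
\Delta_{\mathrm{rel}}
  = \frac{\mathrm{RMSE}_{\mathrm{ANFIS}} - \mathrm{RMSE}_{\mathrm{GFS}}}{\mathrm{RMSE}_{\mathrm{ANFIS}}}
  = \frac{0.1480 - 0.0019}{0.1480}
  = \frac{0.1461}{0.1480}
  \approx 0.987
\]
```

That is, on the reported figures, the GFS error is roughly 98.7% lower than the ANFIS error.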

CONTRIBUTIONS TO KNOWLEDGE

The study established:

A GFS for the evaluation of software component reusability, with the results proving the new system a better predictor than the most commonly used system (ANFIS).
Stability (in the context of volatility) as a factor that also determines reusability. This study has shown that, aside from the commonly deployed attributes such as customizability, interface complexity, portability, and understandability (documentability), stability is a factor worthy of consideration when measuring reusability.
Software component assessment with component types other than Java components. Whereas most studies on the reusability of software components experimented only with Java components, this study assessed component reusability using Java, web, and .NET components. The research went further to evaluate the level of reusability of each component type, with Java components proving more reusable than the other two. The study, therefore, adds to the body of knowledge the finding that Java components are more reusable than the other component types examined.

CONCLUSION

Software component reusability no doubt reduces software development cost and time; of greater necessity, however, is measuring the level of reusability of the software components selected for reuse. Many researchers agree and have deployed different evaluation techniques to assess the level of reusability of software components.

Consequently, this work presented an evaluation of software component reusability using GFS. The study utilized five quality factors in measuring the reusability of 69 software components. The metric values for the five selected quality factors were computed using the data extracted from the components. The design and detailed analysis of the proposed system were elaborated upon. The system structure and the visible activities that take place within the system were also presented using appropriate UML diagrams. For the implementation, GFS was developed and deployed using MATLAB as the software tool.

The result of the evaluation shows that GFS predicts more accurately with an RMSE of 0.0019 as against the commonly used method, ANFIS, with an RMSE of 0.1480, adjudging GFS as a better predictor.

DIRECTION FOR FURTHER STUDIES

The designed architecture presented in this study is simplified such that it can easily be modified to enable adaptation and application to other research domains such as monitoring system, decision support system, data mining system, and control system. The hybridized power of the system can also be extended to solve other related and more advanced intelligent applications.

Five quality factors were used to determine the reusability of the selected components. Other quality factors related to software components (e.g., operability and statelessness) can also be considered in future research on the prediction of software component reusability.

Evaluating software components reusability using genetic-fuzzy soft computing approach

O. Ajayi Olusola*