The introduction and evaluation of object orientation in
a company developing real-time embedded systems.
Key words and areas:
Contributors:
Professor Colin J Theaker
Email: cjt@terrafix.co.uk
Neil Blackwood
Email: nb@terrafix.co.uk
Dr Robert Mason
Email: rjm@terrafix.co.uk
Terrafix Limited
23C Newfield Industrial Estate
High Street
Tunstall
Stoke-on-Trent
Staffordshire
ST6 5PD
England
Tel: +44 1782 577015
Fax: +44 1782 835667
Abstract
This paper considers the practical experiences of a commercial company when undertaking the move to an object oriented paradigm, and the impact that the paradigm shift has entailed, both in terms of the product quality, and the process for software development.
The context for the work is outlined, in particular identifying the demanding nature of the company's product development. A significant aspect of the move to object orientation was the selection of appropriate technologies and tools to support the development, and the adaptation of the toolsets to suit the company context. A rigorous evaluation of the move was undertaken as part of an ESSI Process Improvement Experiment, and the experiences of collecting process and product metrics are described.
Key words
Object oriented system development techniques, Java, UML, Configuration Management, Metrics, Evaluation of tools.
Terrafix is a UK company that produces leading edge command and control systems, specifically involving vehicle location, data communication and control room management. The main application areas are for the emergency services and other organisations that require a command and control capability, including facilities for mapping and vehicle tracking. These systems are highly software dependent, with tight constraints that involve complex real-time, multi-tasking, distributed and communications intensive requirements spanning diverse platforms.
The customer base is also distinctive, and this affects the specific functionality of the systems, which are tailored to individual user requirements and must support thousands of individual mobile/portable units. This has obvious implications for the maintainability of the system components, and particularly for the ability to upgrade as new technologies are introduced. Clients also often request software related changes to the functionality at short notice.
To meet these market needs, the company must be responsive to this very specialised market, and consequently the ease of software production and change is of significant importance to the business. At the same time, high quality, reliability and performance in the most cost-effective way are all expected, particularly as many of the systems are safety-critical, and system failures in areas such as the ambulance services have a very high visibility [London Ambulance Service, in Flowers 1996]. This imposes particular constraints on the quality of the delivered product, and the needs of the company to provide systems of auditable quality.
Historically, the software development process at Terrafix has been defined and managed by written procedures, which the software engineers are required to observe. Quality assurance is also very paper based and manual. It is obviously in the interests of the company to move towards more automated forms of quality management.
Most existing code is written in C and has been incrementally modified for over a decade. This has resulted in highly functional but difficult to maintain and modify software modules. Portability across platforms is also a problem. The company has recognised the need for a radical change in its software approach and to this end, the company decided to review its software engineering processes, and to adopt object oriented techniques within its development programmes.
It was envisaged that a move to an object oriented paradigm for both design capture and development would have an impact in terms of both the development process (particularly with respect to module reuse and distribution) and to the products themselves. New design techniques and languages would have to be introduced, and this would entail staff training and also a learning curve as the software engineers gained experience in the techniques.
The choice of language and design notation was relatively straightforward. Java was seen as the most attractive language, particularly as many of the company's products are based on mobile computing applications and platforms. The rationale was that Java is designed to be modular (due to its enforced object oriented structure), multi-tasking (due to its user controllable multi-threading capability), platform independent (due to a fixed, strict binary interface and virtual machine approach) and tightly structured (being defined in such a way that some of the vagaries of C and C++ are not allowed). It can be compiled for diverse platforms for speed advantages, and processors that run Java byte-code directly are available and could be incorporated into the company's products. It is becoming the de facto standard for mobile and smart card applications and, most importantly, provides the capability to integrate simply with communications networks. The potential for code reuse between the mobile systems and the workstation-based command and control stations was of particular interest and is one of the aspects of the paradigm change that is being measured.
In terms of the design notations, the scope of the techniques in UML [Rumbaugh, Jacobson and Booch 1999] was attractive, particularly as it includes mechanisms such as State Transition Diagrams and Sequence Diagrams, which are of particular interest for real-time software development. Extensions to provide timing behaviour, as described in [Douglass 1998] were also positive points. The greatest weakness was seen as the development process itself, and many of the techniques appeared disjoint and created the impression that the process lacked cohesion. Better documented processes are now starting to emerge [Jacobson, Booch and Rumbaugh 1999]. A further strong point in support of UML was the availability of industry strength development tools at affordable prices. The Artisan toolset [Artisan 1999] is now being used within the company for the design process.
Rather than just introducing the new technology and effectively following the hype that has accompanied object orientation, it was decided that the migration to the new paradigms should essentially be treated as a tightly controlled experiment for which the impact could be measured. As the company falls within the SME category (approx. 40 employees), a rigorous approach to this was feasible.
Two further pieces of the jigsaw puzzle needed to be in place. The first was related to the quality assurance processes and the reliance on manual and paper-based techniques. This was addressed by the introduction of a configuration management system. After evaluation of a number of systems, the one chosen was Perforce [Perforce 1999]. The second issue was concerned with how the measurement process would be addressed, and in particular, what would be measured and what tools would be used. This is considered within the later sections of this paper.
The evaluation process had to address the two dimensions that would result from the paradigm shift. Firstly, the 'quality' of the product was expected to change with the introduction of object orientation. Secondly, the productivity of the staff would change, which would in turn have a longer-term impact on their cost effectiveness. This required that the staff involved in the development would have to assimilate object-oriented concepts and learn the new languages and tools used in the software development.
Specific objectives were therefore identified:
From a Company perspective, these factors affect both the technical performance of the software (a, b and c above) and business performance in terms of product costs, time to market and market share (a, b, c and d above). The results would determine if the measurable gains in better software structure arising from the use of object orientation do indeed also have an impact on development costs and in the longer term such aspects as reliability and maintainability. In addition, it was anticipated that issues regarding the performance of Java in an intensive real-time and mobile environment would be highlighted.
A set of measures was identified that would contribute to an understanding of four important software engineering characteristics of the software. These are:
Structural. This is sometimes referred to as Structural Complexity. Although the primary traditional use of such metrics is in cost estimation, such as within Putnam's Model [Putnam 1978], COCOMO [Boehm 1981], Jensen's Model [Jensen 1984] and COPMO [Conte 1986], there is also a quality dimension to this, as brought out in [Pfleeger 1991].
Module Complexity. This is generally accepted as a major factor in software quality, as it affects the testability and overall manageability of the software components. In contrast with the structural complexity, which is primarily concerned with the external relationships of components, the module complexity focuses on the internal characteristics of the software.
The Cyclomatic Complexity metric proposed by McCabe [McCabe 1976] is the de-facto standard, and is one of the few metrics for which accepted norms are published (< 10 is considered good).
Cohesion. This property is concerned with the functional relatedness of software units. It is acknowledged that software may exhibit increasing strengths of cohesiveness: seven categories have been identified in [Constantine and Yourdon 1979] (and reproduced in many software engineering texts), increasing from coincidental through logical, temporal, procedural, communicational and sequential to functional. It is generally accepted that high cohesiveness is a desirable property.
Coupling. This is a measure of interconnection amongst modules. Loose coupling is generally regarded as a desirable property. Again subcategories of coupling have been identified, covering content, common data, control, and stamp/data coupling. These are described in [Pfleeger 1991].
The monitoring is taking place over a period of approximately 18 months. During this time, a new baseline product for the company is being designed and implemented. This involves the development of an advanced command, control and communication system, addressing the mobile/portable element used for vehicle and person location, data messaging and database functions. Metrics are being collected at all stages of this development. The measures have also been applied prior to the commencement of the baseline project to existing company products with similar functionality, thereby providing a reference datum. This has led to an experimental structure, as illustrated in the following figure:
Figure 1 Experimental Structure
The experimental process therefore includes both a comparison with a 'control' (Reference Data Set) and also monitoring of the development over time.
There were a number of practical considerations that potentially could have had a significant impact. In particular, the introduction of a metrics programme accompanying the paradigm shift could have been viewed with scepticism by the personnel in the company. However, as the ethos within the company was that the programme was aimed at process and product improvement, rather than culpability, this turned out not to be an issue.
As probably the most significant impact of the paradigm shift is on the developers themselves, staff training would play an increasingly important role as product development takes place. This provides an opportunity to monitor the impact on the developers of experience gained at different stages of the development. That is, one would expect observable differences in 'quality' between components developed immediately after initial training and those produced after, say, 12 months' experience.
As the new baseline product is being generated using UML for design notation and Java as the implementation language, it was decided that metrics will be collected, from the outset, for both the design and implementation processes in as automated a way as possible. Transparency of data collection was considered important to ensure that the measurement process did not interfere with the development activity, thereby perturbing any experimental results.
The development of the baseline product would be performed under the strict regime of the configuration management system identified earlier. This would be used for all aspects of the new product (documentation, design, implementation, etc.). As high integrity of the Company's products is a major factor, the traceability of the design and implementation is of high priority from a commercial and marketing point of view. A further benefit of this approach was that it provided an opportunity for configuring and extending the chosen suite of configuration management software to enable the collection of metrics on the development process. The primary extensions were to date-stamp all system developments, complete with a (constrained) rationale of the nature of all changes and an identification of the development staff involved.
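As an illustration only, the information captured by these extensions (a date-stamp, a constrained rationale and the identity of the developer) could be modelled as a small immutable value type. The sketch below is not the Perforce configuration used in the project; the class name, field names and rationale categories are all hypothetical.

```java
import java.time.Instant;
import java.util.Objects;

/** Hypothetical sketch of one change entry; names are illustrative, not taken from Perforce. */
final class ChangeRecord {
    /** "Constrained" rationale: a closed set of categories (assumed values). */
    public enum Rationale { NEW_FEATURE, DEFECT_FIX, REFACTORING, DOCUMENTATION }

    private final String developerId;
    private final String component;
    private final Rationale rationale;
    private final Instant timestamp;

    public ChangeRecord(String developerId, String component,
                        Rationale rationale, Instant timestamp) {
        // Reject incomplete entries so every change stays traceable.
        this.developerId = Objects.requireNonNull(developerId, "developerId");
        this.component = Objects.requireNonNull(component, "component");
        this.rationale = Objects.requireNonNull(rationale, "rationale");
        this.timestamp = Objects.requireNonNull(timestamp, "timestamp");
    }

    public String getDeveloperId() { return developerId; }
    public String getComponent() { return component; }
    public Rationale getRationale() { return rationale; }
    public Instant getTimestamp() { return timestamp; }

    @Override
    public String toString() {
        return developerId + " " + rationale + " " + component + " @ " + timestamp;
    }

    public static void main(String[] args) {
        ChangeRecord r = new ChangeRecord("dev01", "example/Module.java",
                Rationale.DEFECT_FIX, Instant.parse("1999-06-01T09:30:00Z"));
        System.out.println(r);
    }
}
```

Restricting the rationale to an enumeration, rather than free text, is what would make the entries straightforward to count and compare over time.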
There is an obvious danger, of which the authors are very aware, of trying to compare very different systems using metrics, and ultimately arriving at a conclusion that chalk is different to cheese, or that apples are different to pears. There is inevitably a problem when comparing software developed with two very different paradigms and languages, in that metrics appropriate for one paradigm are inappropriate for another. This is the case when comparing the Reference Data Set software with the new Baseline System being developed in Java.
As one objective was to identify the characteristics of the software under the broad categories of structural complexity, module complexity, cohesion and coupling, it was decided that metrics and metric collection tools should be identified that could give indicative function or method measures for all the sample code (C or Java), and class oriented metrics that could be applied to the Java code. The rationale and justification for these is presented elsewhere [PIOJAVA 1999].
The metrics chosen for functions or methods were:
Structural complexity
LOC Lines Of Code per Method;
EXEC Number of executable statements;
Module complexity
V(G) Cyclomatic Complexity;
OC Operational Complexity;
CONTROL Number of control statements;
BRANCH Number of Branching Nodes;
NEST Maximum Number of Levels;
Cohesion
NION Number of Input / Output Nodes;
CALLS Number of Calls;
Coupling
CALLS Number of Calls;
NP Number of Parameters per Method;
It is recognised that some metrics appear under more than one category. Detailed descriptions of these appear as an appendix to this paper.
The class oriented metrics were identified as:
Structural complexity
LOC Lines Of Code comprising a Class;
CSA Class Size in Attributes;
NOCC Number Of Child Classes;
DIT Depth of Class Inheritance Tree;
NOAC Number of Operations Added by a Class;
Module complexity
CSO Class Size in Operations (Methods);
WMC Weighted Methods per Class;
Cohesion
NOCC Number Of Child Classes;
LOCM Lack Of Cohesion of Methods;
NOOC Number of Operations Overridden by a Class;
Coupling
PPPC Percentage of Package, Public and Protected members in a Class;
CBO Coupling Between Objects;
PA Public Accessors;
A number of tools from different tool suppliers were assessed for their suitability for this task. Considerable variability was observed, often with different tool sets yielding widely variable results for supposedly the same measures applied to the same piece of software. Eventually a reliable, coherent and consistent tool set was identified, being based on the Krakatau metrics package produced by Powersoftware [Krakatau 1999].
As this is an on-going programme, results are emerging as the development proceeds. The current state of findings, particularly with respect to the object-oriented metrics, is a snapshot taken at the time of writing this paper (to be updated for copy).
The Reference Data Set comprises software from existing Terrafix products, including both application level software and deeply embedded software. The six software products which make up this set have been developed in the C or C++ programming languages and comprise 159 modules and over 1300 functions or methods.
The software has been implemented in a number of ways, in a mixture of formal and informal approaches. The approaches taken have been found, pragmatically, to be appropriate to the mixture of applications and interfaces being used. There is little data concerned with the design process of the reference software, as it has evolved over many years. Measurement of appropriate metrics for the design, while not impossible, is difficult to carry out consistently on non-computer-system-based data. Consequently the main metrics to be derived from the reference data relate to the implementation of the software components.
Although the main role of the Reference Data Set is to provide a benchmark for comparison, an initial analysis of the software characteristics has been performed in order to identify whether the different categories of software (embedded or workstation based) exhibit distinctive characteristics. It is a hypothesis, drawn from the philosophy that underpins languages and support environments, such as for Java, that such distinctions should no longer exist within modern software system environments. This dimension is being investigated as part of the experimental evaluation.
The first four products contain embedded code; whilst the remaining two are non-embedded applications. Table 1 summarises the basic content of these software products.
| PRODUCT | OPERATING SYSTEM | LANGUAGE | NUMBER OF SOURCE FILES | NUMBER OF METHODS |
| EMBEDDED | | | | |
| IIU | RTOS | C | 20 | 197 |
| MIP | RTOS | C | 10 | 63 |
| CIU-CHANNEL | RTOS | C | 6 | 34 |
| CIU-CONTROL | RTOS | C | 8 | 78 |
| NON-EMBEDDED | | | | |
| DGPSSERVER | Windows NT | C/C++ | 17 | 157 |
| BASESTAT | DOS | C/C++ | 98 | 811 |
Table 1. Software products used for reference data evaluation
Software metrics were applied to this data to check the accuracy of function or method metrics and to provide reference metric measures to compare with the corresponding metrics of the Java files.
A number of different analyses were performed on the data collected, based on standard statistical calculations, using mean values of each metric on a module, on a group wide basis, and comparisons between and within module groups.
Some software modules from published Java products were also evaluated. This allowed the investigation of the application of metrics to Java based software, as a forerunner to the evaluation of the new software products. The method and class metrics applicable to object oriented code were applied to these products. These products are summarised in Table 2.
| PRODUCT | NUMBER OF SOURCE FILES | NUMBER OF METHODS | NUMBER OF CLASSES |
| JCVS | 16 | 130 | 17 |
| CVCS | 21 | 432 | 19 |
Table 2. Software products used for Java data evaluation
Further evaluation of Java software modules generated within Terrafix as part of the Baseline System development yielded the following (to be updated for publication):
| PRODUCT | NUMBER OF SOURCE FILES | NUMBER OF METHODS | NUMBER OF CLASSES |
| J-TAVLS | 6 | 75 | 7 |
Table 3. Software products developed in baseline project
For each of the 6 software products of the Reference Data Set, the ten measurements identified in section 4 have been taken. These have been aggregated according to the groups identified earlier. In all cases, a significant difference was observed between the embedded and the non-embedded aggregate groups. (It should be noted that the values for all of the metrics except CALLS and BRANCH were highly significant statistically.) Within the embedded aggregate group, variations in significance were observed at varying levels which depended on the software product and metric.
The same ten measures have also been taken for each of the 2 published Java products. These measurements from individual software products have been aggregated into the Java group, and the mean values calculated. Figure 2a shows the mean values of the method metrics for the embedded, non-embedded and Java groups, and Figure 2b shows these normalised with respect to the mean values of the embedded group; the group that appears to exhibit the dominant factors.
Figure 2a Absolute values of Method Metrics
Figure 2b Normalised Method Metrics
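The normalisation used for Figure 2b can be sketched as follows: for each metric, the mean of every group is divided by the mean of the reference (embedded) group, so that the embedded group always has the value 1. This is a minimal sketch; the class and method names are hypothetical, and the numbers in `main` are illustrative only and are not measurements from the study.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: per-group means of one metric, normalised to a reference group. */
final class MetricNormaliser {
    /** Arithmetic mean of the sampled values of one metric. */
    public static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Divides each group's mean by the reference group's mean for the same metric. */
    public static Map<String, Double> normalise(Map<String, Double> groupMeans,
                                                String referenceGroup) {
        Double ref = groupMeans.get(referenceGroup);
        if (ref == null || ref == 0.0) {
            throw new IllegalArgumentException("reference mean missing or zero");
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : groupMeans.entrySet()) {
            out.put(e.getKey(), e.getValue() / ref);
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative values only, NOT data from the paper.
        Map<String, Double> means = new LinkedHashMap<>();
        means.put("embedded", mean(new double[] {40.0, 60.0}));
        means.put("non-embedded", mean(new double[] {20.0, 30.0}));
        means.put("java", mean(new double[] {10.0, 15.0}));
        // prints {embedded=1.0, non-embedded=0.5, java=0.25}
        System.out.println(normalise(means, "embedded"));
    }
}
```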
To evaluate differences between and within the groups, pair-wise Student's t-tests were applied in the following manner:
for each individual metric, the values were compared against the corresponding non object oriented measurements;
within the Java aggregate group, for each individual metric, the values for each of the 2 software products were compared.
The results of these comparisons are shown in Table 4 for significance levels of p < 0.05 and p < 0.1 respectively. Differences that are significant at these levels are shown as "D"; "N" indicates no significant difference.
| METRIC | Between C CODE and JAVA | | Between EMBEDDED and JAVA | | Between NONEMBEDDED and JAVA | | Within JAVA GROUP | |
| | p<0.05 | p<0.1 | p<0.05 | p<0.1 | p<0.05 | p<0.1 | p<0.05 | p<0.1 |
| LOC | D | D | D | D | D | D | N | N |
| NP | D | D | N | N | D | D | D | D |
| V(G) | D | D | D | D | D | D | N | N |
| OC | D | D | D | D | D | D | N | N |
| NEST | D | D | D | D | N | D | N | N |
| CALLS | D | D | D | D | D | D | N | N |
| BRANCH | D | D | D | D | N | N | N | N |
| NION | N | N | D | D | N | D | N | N |
| CONTROL | D | D | D | D | N | N | N | N |
| EXEC | D | D | D | D | D | D | N | N |
Table 4 Significant difference at p<0.05 and p<0.1 levels
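To make the comparison procedure concrete, the sketch below computes a two-sample Student's t statistic for one metric measured over two groups. It assumes the independent-samples form with pooled variance; the paper does not state which variant was used, and the class and method names are hypothetical. The decision step compares |t| with a critical value that the caller takes from standard t tables for the chosen significance level and degrees of freedom.

```java
/** Hypothetical sketch: pooled-variance two-sample Student's t statistic. */
final class StudentTSketch {
    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    public static double sampleVariance(double[] x) {
        double m = mean(x);
        double ss = 0.0;
        for (double v : x) {
            ss += (v - m) * (v - m);
        }
        return ss / (x.length - 1);
    }

    public static int degreesOfFreedom(double[] a, double[] b) {
        return a.length + b.length - 2;
    }

    /** t = (mean(a) - mean(b)) / sqrt(sp^2 * (1/n1 + 1/n2)), with sp^2 the pooled variance. */
    public static double tStatistic(double[] a, double[] b) {
        int n1 = a.length;
        int n2 = b.length;
        if (n1 < 2 || n2 < 2) {
            throw new IllegalArgumentException("each group needs at least two values");
        }
        double pooled = ((n1 - 1) * sampleVariance(a) + (n2 - 1) * sampleVariance(b))
                / degreesOfFreedom(a, b);
        return (mean(a) - mean(b)) / Math.sqrt(pooled * (1.0 / n1 + 1.0 / n2));
    }

    /** True ("D") when |t| exceeds the two-tailed critical value supplied by the caller. */
    public static boolean differs(double[] a, double[] b, double criticalValue) {
        return Math.abs(tStatistic(a, b)) > criticalValue;
    }

    public static void main(String[] args) {
        // Illustrative values only, NOT data from the paper.
        double[] groupA = {1.0, 2.0, 3.0};
        double[] groupB = {4.0, 5.0, 6.0};
        System.out.println("t = " + tStatistic(groupA, groupB)
                + ", df = " + degreesOfFreedom(groupA, groupB));
        // 2.776 is the two-tailed critical value for p < 0.05 at 4 degrees of freedom.
        System.out.println(differs(groupA, groupB, 2.776) ? "D" : "N");
    }
}
```

For the illustrative data the statistic is about -3.67 with 4 degrees of freedom, so the pair would be marked "D" at the p < 0.05 level. The sketch does not guard against a pooled variance of zero, which would need separate handling in practice.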
Figure 3a Mean Values of Class Metrics
For the class metrics, 12 measures have been taken and mean values calculated. Figure 3a shows the mean values of the class metrics for the modules of the Java group, including the J-TAVLS modules. The normalised mean, shown in Figure 3b, was calculated by dividing the mean values for each module by the corresponding mean values for the CVCS module (the largest).
Figure 3b Normalised Values of Class Metrics
Although the Process Improvement Experiment is still progressing and data on the software development is being gathered, a number of conclusions can be drawn from the metrics collected so far.
In most cases, a significant difference was observed between the embedded and non-embedded aggregate groups for the Reference Data software. The structural complexity and module complexity for the embedded software was noticeably higher, and one must conclude that this relates to the programming style of the software engineers developing the embedded applications. The measures revealed little difference with respect to the cohesion and coupling, and further investigation, focussing on data coupling, is in hand.
Within the Java group, in most cases no significant difference was observed between the different software components. However, the Java software exhibited further simplicity with respect to the complexity measures over both the embedded and non-embedded Reference Data Sets. One could infer that this should lead to more reliable software systems.
The collection of metrics over a time period, using the configuration management system to monitor the software development, has yet to yield sufficient data for analysis. It is anticipated that the design and implementation styles of the software engineers would change as they become more experienced with object oriented concepts and practices. This has still to be verified. However, it has been observed that the Java software developed at the start of this project is exhibiting similar characteristics to other (off the shelf) Java code, with corresponding desirable complexity characteristics.
Acknowledgements
This work was supported by the UK Department of Trade and Industry under the auspices of the Teaching Company Directorate, via a collaborative programme in conjunction with Staffordshire University.
The evaluation work was undertaken as a European Systems and Software Initiative (ESSI), Process Improvement Experiment, No 27719 PIOJAVA.
The authors would also like to acknowledge the efforts of David Leigh in the collection and evaluation of the metrics, and of the staff at Powersoftware for assistance with the Krakatau metrics tool set.
References
Artisan 1999, http://www.artisansw.com
Boehm BW, Software Engineering Economics, Prentice-Hall 1981.
Constantine L L and Yourdon E, Structured Design, Prentice-Hall, 1979.
Conte S D, Dunsmore H E and Shen V Y, Software Engineering Metrics and Models, Benjamin-Cummings 1986.
Douglass B P, "Real-Time UML Developing Efficient Objects for Embedded Systems", Addison Wesley, Object Technology Series 1998, ISBN 0-201-32579-9.
Flowers S, "Software Failure - Management Failure", John Wiley and Sons 1996, ISBN 047195137.
Jacobson I, Booch G, Rumbaugh J, "The Unified Software Development Process", Addison Wesley, Object Technology Series 1999, ISBN 0-201-57169-2.
Jensen R W, A comparison of the Jensen and COCOMO schedule and cost estimation models, Proceedings International Society of Parametric Analysis, 1984.
Krakatau 1999, http://www.powersoftware.com
McCabe T, A Complexity Measure, IEEE Transactions on Software Engineering, Vol 2, No 4, 1976.
Perforce 1999, http://www.perforce.com
Pfleeger S L, Software Engineering: The production of quality Software, Macmillan 1991.
PIOJAVA 1999, PIOJAVA Experimental Plan, ESSI project report, 1999.
Putnam L H, A General Empirical Solution to the Macro Software Sizing and Estimating Problem, IEEE Transactions on Software Engineering, Vol 4, No 4, 1978.
Rumbaugh J, Jacobson I, Booch G, "The Unified Modeling Language Reference Manual", Addison Wesley, Object Technology Series 1999, ISBN 0-201-30998-X.
Appendices
Method Metrics from the Powersoftware Krakatau Package
LOC Method Lines of Code
LOC is a primitive metric to measure the size of a method.
EXEC Number of Executable Statements
This is a measure of the number of executable statements in a method or function.
NEST Maximum Number of Levels
Cognitive sciences have shown that groups that contain more than seven pieces of
information are increasingly hard for people to understand in problem solving. To
measure this, the number of IF-THEN or IF-THEN-ELSE constructs in a nest is counted.
Logical units with a large number of nested levels may need implementation simplification
and process improvement.
V(G) Cyclomatic Complexity
Cyclomatic Complexity measures the number of linearly independent paths through an
algorithm by counting the number of distinct regions on a flowgraph. This is taken as an
indicator of the cognitive complexity of the method.
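For a single structured method, V(G) equals the number of decision points plus one, which gives a simple way to approximate it. The sketch below counts decision keywords and short-circuit operators in a fragment of Java source using a regular expression. It is a rough illustration only, not the algorithm used by the Krakatau tool: it does not strip comments or string literals, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: approximates V(G) as (decision points + 1). */
final class CyclomaticSketch {
    // Decision points counted: branching keywords, short-circuit operators, ternary operator.
    private static final Pattern DECISION = Pattern.compile(
            "\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    public static int estimate(String methodSource) {
        Matcher m = DECISION.matcher(methodSource);
        int decisions = 0;
        while (m.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        String src =
                "int clamp(int v, int lo, int hi) {\n"
              + "  if (v < lo) return lo;\n"
              + "  if (v > hi) return hi;\n"
              + "  return v;\n"
              + "}";
        // Two if statements, so this prints: V(G) ~ 3
        System.out.println("V(G) ~ " + estimate(src));
    }
}
```

A method with no decisions scores 1, and the published norm of V(G) below 10 corresponds to fewer than nine decision points under this approximation.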
OC Operational Complexity
This metric assigns weights to operations which can occur in expressions. The values of
the weights for all the expressions in a method are summed to provide a value for OC. This
is complementary to V(G) since it looks at the complexity of the expressions which are
being evaluated rather than the number of decision points in the method.
CONTROL Number of Control Statements
This is a measure of the number of control statements (selection, iteration) in a method
or function.
BRANCH Number of Branching Nodes
Higher values indicate possible use of GOTOs and / or abnormal exits from control
structures such as loops. This is an indicator of unstructured design and increases the
testing difficulty. However, it may also indicate exception handling conditions.
NION Number of Input / Output Nodes
NION is a measure of the number of input / output nodes in a given method / function.
Programming practices today state that there should be one way into a module and one way
out. This measures the difficulty of testing the control logic of software. Logical units
with a large number of input / output nodes may need implementation simplification and
process improvement.
CALLS Number of Calls
CALLS is the number of calls from a method or function to subordinate logical units
(methods or functions). This is a measure of the degree of FAN-OUT.
NP Number of Parameters
NP is simply a measure of the number of parameters that a method accepts. High values for
NP can mean that a method will require extensive testing (since the range of possible
inputs may be greater). As a rule of thumb, methods with many parameters also tend to be
more specialised and so are less likely to be reusable.
Class Metrics from the Powersoftware Krakatau Package
The following metrics are collected by the Krakatau software, and provide a method for interpreting the object oriented properties of the software.
CSA Class Size in Attributes
CSA measures class size by counting the number of attributes of a class (not including
inherited attributes).
CSO Class Size in Operations
CSO measures class size by counting the number of operations (methods) in a class (not
including inherited methods).
NOCC Number of Child Classes
This metric counts the number of classes which inherit from a particular class.
PPPC Percentage of Package, Public and Protected members in a Class
Members which have package level protection are visible to other classes in the same
package. Public members are available to classes in all packages and protected members are
available to subclasses.
DIT Depth in Class Inheritance Tree
This metric reports how deeply a class resides in the class inheritance tree.
CBO Coupling between Object Classes
The value for CBO is the number of classes to which a particular class refers. References
can be uses of classes as member types, parameter types, method local variable types or
casts.
LOCM Lack of Cohesion of Methods
LOCM measures the percentage of methods that do not access a specific attribute averaged
over all attributes in the class.
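Read literally, this definition can be computed from a table of which attributes each method accesses. The sketch below is one interpretation of the wording above, not the Krakatau implementation; the input representation (a map from method name to the set of attribute names it uses), the handling of empty classes and all identifiers are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of LOCM as defined in the text above. */
final class LocmSketch {
    /**
     * For each attribute, takes the percentage of methods that do NOT access it,
     * then averages those percentages over all attributes.
     */
    public static double locm(Set<String> attributes,
                              Map<String, Set<String>> accessesByMethod) {
        if (attributes.isEmpty() || accessesByMethod.isEmpty()) {
            return 0.0; // assumed convention for classes with no attributes or methods
        }
        double sum = 0.0;
        for (String attr : attributes) {
            int notAccessing = 0;
            for (Set<String> used : accessesByMethod.values()) {
                if (!used.contains(attr)) {
                    notAccessing++;
                }
            }
            sum += 100.0 * notAccessing / accessesByMethod.size();
        }
        return sum / attributes.size();
    }

    public static void main(String[] args) {
        // Illustrative class with attributes x and y and three methods.
        Set<String> attrs = new HashSet<>(Arrays.asList("x", "y"));
        Map<String, Set<String>> uses = new LinkedHashMap<>();
        uses.put("getX", new HashSet<>(Arrays.asList("x")));
        uses.put("getY", new HashSet<>(Arrays.asList("y")));
        uses.put("norm", new HashSet<>(Arrays.asList("x", "y")));
        System.out.println("LOCM = " + locm(attrs, uses) + " %");
    }
}
```

In the illustrative class each attribute is ignored by one method in three, giving a LOCM of about 33 per cent; a class in which every method touches every attribute would score 0.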
NOOC Number of Operations Overridden by a Class
NOOC measures the number of inherited operations which a class overrides. High values for
NOOC tend to indicate design problems; subclasses should generally add to and extend the
functionality of the parent classes rather than overriding them.
NOAC Number of Operations Added by a Class
This metric measures the number of operations added by a class.
WMC Weighted Methods per Class
WMC is a count of the methods in a class (weighted according to complexity).
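A common choice of weight is the cyclomatic complexity V(G) of each method, in which case WMC is simply the sum of the per-method values. The sketch below assumes that weighting; the source does not state which weights the Krakatau package applies, and the names used are hypothetical.

```java
/** Hypothetical sketch: WMC as the sum of per-method complexity weights. */
final class WmcSketch {
    /** Each entry is the complexity weight of one method, assumed here to be its V(G). */
    public static int wmc(int[] methodComplexities) {
        int total = 0;
        for (int c : methodComplexities) {
            if (c < 1) {
                throw new IllegalArgumentException("V(G) of a method is at least 1");
            }
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        // A class with three methods of complexity 1, 3 and 2; prints WMC = 6
        System.out.println("WMC = " + wmc(new int[] {1, 3, 2}));
    }
}
```

If every method is given a weight of 1, WMC reduces to the method count, that is, to CSO.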
PA Public Accessors
This measures class coupling as the number of other classes which access / use this class.