ONTOCOM LITE Cost Drivers

Similar to the cost drivers of the original ONTOCOM model, the cost drivers for ONTOCOM LITE are separated into several groups. Most modifications in ONTOCOM LITE can be found in the product cost drivers: some of the original cost drivers have been adapted to the complexity of the taxonomy development task, and some have been removed because they do not apply to taxonomy development.

1. Product Factors

Domain Complexity/Requirements Complexity/Information Sources: DCPLX/RCPLX/IS

The domain complexity driver combines three factors: the complexity of the domain from which the taxonomy is built, the number and complexity of the requirements, and the availability of information sources. As is the case with ontologies, the domain in which the taxonomy is built can be wide or narrow and might require expert knowledge or just common-sense knowledge. The requirements in taxonomy building can cover several aspects, such as design and technical aspects, which cover the issues related to determining the scope, the purpose and content object of the taxonomy, and user-related requirements. We also account for the impact that the availability of information sources can have on the taxonomy development process. The rating scales for domain complexity, requirements complexity and information sources are given in the following tables:

Cost Driver DCPLX
Rating Rating Scale
Very High wide scope, expert knowledge, high connectivity
High moderate to wide scope, common-sense or expert knowledge, high connectivity
Nominal moderate to wide scope, common-sense or expert knowledge, moderate connectivity
Low narrow to moderate scope, common-sense or expert knowledge, low connectivity
Very Low narrow scope, common-sense knowledge, low connectivity
Domain complexity (DCPLX)
Cost Driver RCPLX
Rating Rating Scale
Very High very high number of requirements with a high degree of conflict, high number of usability requirements
High high number of usability requirements, few conflicting requirements
Nominal moderate number of requirements, with few conflicts, few usability requirements
Low small number of non-conflicting requirements
Very Low few simple requirements
Requirements complexity (RCPLX)
Cost Driver IS
Rating Rating Scale
Very High high number of information sources and structured data, high use of a generic taxonomy
High high number of information sources, some modifications to a generic taxonomy
Nominal good quality and number of information sources
Low some information sources availability
Very Low none
Information Sources (IS)

 


Classification Complexity: CCPLX

The Classification Complexity (CCPLX) cost driver measures the effort associated with establishing the hierarchical relationships between the concepts. Classification complexity is kept separate from the Concept Derivation Complexity (CDCPLX) cost driver so that the model can also cover simpler structures without hierarchical relationships, such as lists. In such a case this cost driver should be discarded.

Cost Driver CCPLX
Rating Rating Scale
Very High Difficulty in establishing hierarchical relationships for almost every concept, high number of multi-inheritance relationships
High High difficulty in establishing hierarchical relationships, some number of multi-inheritance relationships
Nominal Moderate difficulty in establishing hierarchical relationships
Low Some cases of difficulty in establishing hierarchical relationships
Very Low Nearly no difficulty in establishing hierarchical relationships
Classification complexity (CCPLX)

 


Concept Derivation Complexity: CDCPLX

As we saw in the previous section, one of the biggest challenges in taxonomy development is to derive the concepts for the taxonomy. The Concept Derivation Complexity cost driver CDCPLX accounts for the impact that problems such as synonyms and ambiguity have on the process of deriving the concepts. In addition, it factors in any machine assistance associated with this stage of the taxonomy development.

Cost Driver CDCPLX
Rating Rating Scale
Very High Mostly manual work, very high number of synonyms, ambiguity cases
High High amount of manual work, some automatic processing, high level of synonyms, ambiguity cases
Nominal Manual work done in combination with some automatic processing, some instances of synonym and ambiguity cases
Low Some manual work, high use of automatic processing, rare instances of synonym and ambiguity cases
Very Low A few manual steps mostly reviewing, high use of automatic processing, nearly no instances of synonym and ambiguity cases
Concept derivation complexity (CDCPLX)

 


Documentation Needs: DOCU

Similar to ontology development and all other complex engineering processes, additional costs may arise as a consequence of documentation requirements during the life cycle (LC) process of development.

Cost Driver DOCU
Rating Rating Scale
Very High comprehensive documentation for every stage in the LC process
High comprehensive documentation only for some stages in the LC process
Nominal right-sized documentation for every stage in the LC process
Low some stages omitted from documentation needs
Very Low many stages omitted from documentation needs
Documentation needs (DOCU)

 


Classification of Data: CDATA

Classification of data in a taxonomy involves mapping the content to the concepts of the taxonomy. This means that the content has to be mapped using some language, schema or database. The effort associated with this cost driver corresponds to the degree of automation that is possible.

Cost Driver CDATA
Rating Rating Scale
Very High unstructured data in natural language, free form, manual mapping required, no automation
High semi-structured data in natural language, e.g. similar web pages, some automation available
Nominal semi-structured data, automation available but needs manual review
Low structured data, high degree of automation, manual intervention minimal
Very Low structured data, process completely automated
Classification of Data (CDATA)

 


Taxonomy Evaluation: TE

Taxonomy evaluation can involve extensive reviews with domain experts and users, as well as testing. To determine the effort associated with evaluation, the TE cost driver rates the scope and intensity of the evaluation process.

Cost Driver TE
Rating Rating Scale
Very High Extensive reviews and testing
High Considerable number of reviews and testing
Nominal Moderate number of reviews and testing
Low Small number of reviews and tests
Very Low Almost no reviews and tests
Taxonomy Evaluation (TE)

 


Taxonomy Maintenance: TM

The Taxonomy Maintenance cost driver TM captures the impact on effort of the modifications made during reviews and of the number of new concepts and data added.

Cost Driver TM
Rating Rating Scale
Very High Substantial reorganization of the hierarchy and reclassification of data
High Significant reorganization of the hierarchy and reclassification of data
Nominal Some reorganization of the hierarchy and reclassification of data
Low Minor reorganization of the hierarchy and reclassification of data
Very Low No reorganization of the hierarchy and reclassification of data
Taxonomy Maintenance (TM)

 

2. Personnel Factors

Taxonomy / Domain Expert Capability: TECAP/DCAP

Similar to the ontology development process, the taxonomy development process requires the collaboration of personnel with a background in building taxonomies and of domain experts and stakeholders who possess the knowledge about the domain. As in ONTOCOM, the cost drivers account for the perceived ability of the actors involved in the process as well as their performance as a team.

Cost Driver TECAP/DCAP
Rating TECAP/DCAP
Very High 90%
High 75%
Nominal 55%
Low 35%
Very Low 15%
Capability Ratings of the Engineering Team (TECAP/DCAP)

 


Taxonomy / Domain Expert Experience: TXEXP / DEEXP

These factors try to measure the experience of the two teams as a whole in their respective areas, taxonomy development and domain conceptualization. Like the respective cost driver in ONTOCOM, this cost driver tries to capture the experience of the taxonomy developers with the tools and/or languages at their disposal, e.g. markup languages such as XML or languages such as OWL. This also includes knowledge of any representation languages used during the processes of identifying and classifying concepts.

Cost Driver TXEXP/DEEXP
Rating TXEXP/DEEXP
Very High 7 years
High 5 years
Nominal 3 years
Low 1 year
Very Low 6 months
Experience Ratings for the Team (TXEXP/DEEXP)
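
The experience table above can be read as a threshold lookup. The following is a minimal sketch, assuming that each figure in the table is the minimum experience needed for its level and that anything below six months still counts as Very Low; the function name `experience_rating` is hypothetical and not part of ONTOCOM LITE.

```python
from bisect import bisect_right

# Years of experience per rating, copied from the TXEXP/DEEXP table.
# Assumption: each figure is a lower bound for its rating level.
THRESHOLDS = [
    (0.5, "Very Low"),
    (1.0, "Low"),
    (3.0, "Nominal"),
    (5.0, "High"),
    (7.0, "Very High"),
]


def experience_rating(years: float) -> str:
    """Return the highest rating whose threshold the given experience meets."""
    bounds = [bound for bound, _ in THRESHOLDS]
    index = bisect_right(bounds, years) - 1
    # Assumption: less than six months of experience is still rated Very Low.
    return THRESHOLDS[max(index, 0)][1]


if __name__ == "__main__":
    for years in (0.25, 1, 4, 7):
        print(years, "->", experience_rating(years))
```

Other readings of the table are possible, for example choosing the nearest level instead of the highest level reached.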

 


Personnel Continuity: PCON

The personnel continuity cost driver factors in changes in personnel during the process life cycle in a time- and resource-constrained project. Since we believe that the size of the engineering team is likely to be the same in ontology development and taxonomy development, we use the values proposed in the original ONTOCOM model.

Cost Driver PCON
Rating PCON
Very High 10%
High 15%
Nominal 25%
Low 35%
Very Low 50%
Personnel Continuity (PCON)
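
Note that the PCON scale runs in the opposite direction to the experience scales: a smaller percentage yields a higher rating. The sketch below makes that explicit. It assumes that the percentages denote personnel turnover and that each figure is the upper bound of its level; neither assumption is stated in the table, and the function name `pcon_rating` is hypothetical.

```python
# Percentages per rating, copied from the PCON table.
# Assumptions: the values denote personnel turnover, and each value is
# the largest turnover that still qualifies for the rating.
PCON_LEVELS = [
    (10.0, "Very High"),
    (15.0, "High"),
    (25.0, "Nominal"),
    (35.0, "Low"),
    (50.0, "Very Low"),
]


def pcon_rating(turnover_percent: float) -> str:
    """Map a turnover percentage to a personnel continuity rating."""
    if turnover_percent < 0:
        raise ValueError("turnover must not be negative")
    for upper_bound, rating in PCON_LEVELS:
        if turnover_percent <= upper_bound:
            return rating
    # Assumption: turnover above the last table entry is also Very Low.
    return "Very Low"


if __name__ == "__main__":
    for turnover in (5, 12, 30, 60):
        print(turnover, "->", pcon_rating(turnover))
```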

 


Language and Tool Experience: TEXP / LEXP

As in ONTOCOM, this cost driver tries to evaluate the experience of the taxonomy developers with the tools and/or languages at their disposal, e.g. markup languages like XML or languages like OWL. This also includes knowledge of any representation languages used during the processes of identifying and classifying concepts.

Cost Driver TEXP/LEXP
Rating TEXP/LEXP
Very High 3 years
High 2 years
Nominal 1 year
Low 6 months
Very Low 2 months
Tool and Language Experience (TEXP/LEXP)

 

3. Project Factors

Support Tools for Taxonomy Development: TOOL

Support tools used during all stages of the development life cycle have a great impact on effort. While we expect most taxonomy development teams to have a variety of tools at their disposal, we would like to account for their effectiveness in all stages of development. A tool set can include tools for design, implementation and mapping of data to the taxonomy, as well as tools that help in concept identification.

Cost Driver TOOL
Rating Rating Scale
Very High High quality and availability of tools, manual intervention minimal
High Little manual processing required
Nominal Basic manual intervention needed
Low Some tool support
Very Low Minimal tool support, mostly manual processing
Tool Support (TOOL)

 


Multi-site Development: SITE

As in ontology development, taxonomy development might require extensive communication between the various parties. The SITE cost driver assesses the communication support tools.

Cost Driver SITE
Rating Rating Scale
Very High frequent F2F meetings
High teleconference, occasional meetings
Nominal email
Low phone, fax
Very Low mail
Multisite Development (SITE)

 


Required Development Schedule: SCED

The SCED cost driver accounts for the impact of schedule constraints on the taxonomy development process. Processes with tight schedule constraints (under 100%) tend to produce more effort in the later stages of the taxonomy development life cycle, such as refinement and evolution. Stretched-out schedules usually produce more effort early on, for example in the concept identification stage.

Cost Driver SCED
Rating SCED
Very High 160%
High 130%
Nominal 100%
Low 85%
Very Low 75%
Required Development Schedule (SCED)
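
As an illustration of how the SCED percentages might be applied, the sketch below expresses the available schedule as a percentage of a nominal schedule and picks the closest level in the table. Both the notion of a nominal schedule estimate and the nearest-level rule are assumptions made for this example; the function name `sced_rating` is hypothetical.

```python
# Schedule percentages per rating, copied from the SCED table.
SCED_LEVELS = {
    75.0: "Very Low",
    85.0: "Low",
    100.0: "Nominal",
    130.0: "High",
    160.0: "Very High",
}


def sced_rating(nominal_months: float, available_months: float) -> str:
    """Rate the schedule as a percentage of the nominal schedule.

    Assumption: the rating is the table level closest to the computed
    percentage; the source does not say how intermediate values are handled.
    """
    if nominal_months <= 0 or available_months <= 0:
        raise ValueError("schedules must be positive")
    percent = 100.0 * available_months / nominal_months
    nearest = min(SCED_LEVELS, key=lambda level: abs(level - percent))
    return SCED_LEVELS[nearest]


if __name__ == "__main__":
    print(sced_rating(12, 9))   # 75% of nominal, a tight schedule
    print(sced_rating(10, 14))  # 140% of nominal, a stretched-out schedule
```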

 

4. Reuse Factors

Taxonomy Understanding: TU

This factor takes into account the level of ease with which the taxonomy to be reused can be understood. The rating scales for taxonomy understanding are given in the following table:

Cost Driver TU
Rating Rating Scale
Very High Representation language tool, 90% comments in natural language
High Representation language tool, 60% comments in natural language
Nominal Representation language tool, 30% comments in natural language
Low Representation language know-how, few comments in natural language
Very Low Representation language know-how, no comments in natural language
Taxonomy Understanding (TU)

 


Taxonomy Developer and Domain Expert Unfamiliarity with Reused Taxonomy: UNFM

This factor increments the impact of the TU cost driver and expresses the level of unfamiliarity of the engineering team w.r.t. the taxonomy to be reused (Did the team build it themselves? Does the team work with it every day? Does the team have only little experience with this taxonomy?). The rating scales for UNFM are given in the following table:

Cost Driver UNFM
Rating Rating Scale
1.0 Completely unfamiliar
0.8 Little experience
0.6 Occasional usage
0.4 Every day usage
0.2 Team built
0.0 Self built
TU Increment for Unfamiliarity (UNFM)

 


Taxonomy modification: TMD

This measure reflects the complexity of the modifications required by the reuse process after the evaluation phase has been completed. The rating scales for TMD are given in the following table:

Rating Rating Scale
Very High Excessive modifications
High Considerable modifications
Nominal Some, moderate modifications
Low Some simple modification
Very Low Few, simple modifications
Taxonomy Modification (TMD)

 


Taxonomy Translation: TT

This factor reflects the impact of translating taxonomies between knowledge representation languages. The rating scales for the TT driver are given in the following table:

Rating Rating Scale
Very High Manual effort
High Considerable manual effort
Nominal Some manual effort
Low Low manual effort
Very Low Direct
Taxonomy Translation (TT)

 

5. Usage Factors

Deployment type: DEPLOY

This factor reflects the effort to deploy a new taxonomy version to applications which use it. The rating scales for the deployment type driver are given in the following table:

Rating Rating Scale
Very High Excessive changes in the taxonomy; applications are interconnected.
High Excessive changes in the taxonomy; applications are independent.
Nominal Moderate changes in the taxonomy; application interconnection is irrelevant.
Low Few changes in the taxonomy; applications are interconnected.
Very Low Few changes in the taxonomy; applications are independent.
Deployment type (DEPLOY)
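
The DEPLOY scale combines two dimensions: the extent of the changes in the taxonomy and whether the applications using it are interconnected. The sketch below encodes the table as a small decision function. The labels `"few"`, `"moderate"` and `"excessive"` follow the table wording, while the function name `deploy_rating` and its parameters are hypothetical.

```python
# Decision table derived from the DEPLOY rating scale.
# Key: (extent of taxonomy changes, applications interconnected?)
_DEPLOY_TABLE = {
    ("excessive", True): "Very High",
    ("excessive", False): "High",
    ("few", True): "Low",
    ("few", False): "Very Low",
}


def deploy_rating(change_extent: str, interconnected: bool) -> str:
    """Return the DEPLOY rating for a new taxonomy version."""
    if change_extent == "moderate":
        # For moderate changes the table treats interconnection as irrelevant.
        return "Nominal"
    try:
        return _DEPLOY_TABLE[(change_extent, interconnected)]
    except KeyError:
        raise ValueError(f"unknown change extent: {change_extent!r}") from None


if __name__ == "__main__":
    print(deploy_rating("excessive", interconnected=True))
    print(deploy_rating("moderate", interconnected=False))
    print(deploy_rating("few", interconnected=False))
```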

 


Complexity of the Software Adaptation: SADAP

This measure expresses the adaptation that needs to be conducted on the existing corporate software after a taxonomy update and deployment. The rating scales for the SADAP driver are given in the following table:

Rating Rating Scale
Very High Existing software must be replaced.
High Existing software must be extensively adapted to ensure it conforms to the requirements of the current taxonomy version.
Nominal Existing software must be slightly adapted to ensure it conforms to the requirements of the current taxonomy version.
Low Existing software must be configured towards the current taxonomy version (e.g. changed path or URI).
Very Low No software adaptation is needed.
Complexity of the Software Adaptation (SADAP)

 


Visibility to the User: VISIBIL

This measure expresses the visibility of the taxonomy to the user and costs connected with the user training. The rating scales for the VISIBIL driver are given in the following table:

Rating Rating Scale
Very High Regular training with well prepared training material is necessary.
High More intensive training with some training material is necessary.
Nominal Brief training is sufficient.
Low Some basic documents as assistance are sufficient.
Very Low No training is needed.
Visibility of the Taxonomy in Use for the User (VISIBIL)

 


Complexity of the Instantiation: UDATA

This driver addresses the instance generation during the usage of the taxonomy by employees in the target applications as part of their work. The rating scales for the UDATA driver are given in the following table:

Rating Rating Scale
Very High Manual instantiation as additional work, users work directly with the taxonomy and need knowledge about the taxonomy structure.
High Manual instantiation as additional work, users need basic knowledge about taxonomies and semantics (e.g. use of semantic wikis).
Nominal Manual instantiation as part of common workflows.
Low Semi-automatic instantiation, users do not need any knowledge about taxonomies.
Very Low Automatic instantiation.
Complexity of the Instantiation (UDATA)

 


Complexity of the user feedback: UFEED

This measure describes the effort that taxonomy users need to invest to return feedback about a taxonomy. The rating scales for the UFEED driver are given in the following table:

Rating Rating Scale
Very High Implicit feedback and explicit feedback (oral & written) via questionnaires, regular reports, and dialog (F2F) with taxonomy developer.
High Implicit feedback and explicit feedback via questionnaires & regular reports.
Nominal Implicit feedback and explicit feedback via questionnaires.
Low Implicit feedback and explicit feedback via brief comments.
Very Low Implicit feedback.
Complexity of the User Feedback (UFEED)

 


Complexity of the Software Engineer Feedback: SEFEED

This measure describes the effort that taxonomy software developers need to invest to return feedback about a taxonomy. The rating scales for the SEFEED driver are given in the following table:

Rating Rating Scale
Very High Implicit feedback and explicit feedback (oral & written) via questionnaires, regular reports, and dialog (F2F) with taxonomy developer.
High Implicit feedback and explicit feedback via questionnaires & regular reports.
Nominal Implicit feedback and explicit feedback via questionnaires.
Low Implicit feedback and explicit feedback via brief comments.
Very Low Implicit feedback.
Complexity of the Software Engineer Feedback (SEFEED)

 


Complexity of the Reporting: REP

This driver describes the effort of generating reports, which depends on the type of feedback received from users and software developers. The rating scales for the REP driver are given in the following table:

Rating Rating Scale
High Implicit feedback as well as oral and written explicit feedback has to be reported.
Low Implicit feedback and written explicit feedback has to be reported.
Very Low Implicit feedback has to be reported.
Complexity of the Reporting (REP)