ONTOCOM LITE Cost Drivers
As in the original ONTOCOM model, the cost drivers for ONTOCOM LITE are separated into several groups.
Most of the modifications in ONTOCOM LITE concern the product cost drivers: some of the original cost drivers
have been adapted to the complexity of the taxonomy development task, and some have been removed because
they do not apply to taxonomy development.
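Although the individual drivers are described qualitatively below, they are meant to act together as effort multipliers. Assuming ONTOCOM LITE keeps ONTOCOM's parametric form PM = A · Size^α · ∏ CD_i, the following is only a minimal sketch of how rated drivers could be turned into an effort estimate; the constants, the size measure (thousands of concepts) and the multiplier values are illustrative placeholders, not calibrated figures.

```python
# Minimal sketch of an ONTOCOM-style effort equation, assuming the form
# PM = A * Size^alpha * product(CD_i).  A, alpha, the size measure and the
# multiplier tables below are illustrative placeholders, not calibrated values.
A, ALPHA = 2.5, 1.0

MULTIPLIERS = {
    # product factor: a more complex domain increases effort
    "DCPLX": {"Very Low": 0.8, "Low": 0.9, "Nominal": 1.0, "High": 1.2, "Very High": 1.4},
    # personnel factor: a more capable team decreases effort
    "TECAP": {"Very Low": 1.4, "Low": 1.2, "Nominal": 1.0, "High": 0.9, "Very High": 0.8},
}

def effort_pm(size_kconcepts: float, ratings: dict[str, str]) -> float:
    """Estimated effort in person months for a taxonomy of the given size."""
    product = 1.0
    for driver, rating in ratings.items():
        product *= MULTIPLIERS[driver][rating]
    return A * size_kconcepts ** ALPHA * product

# e.g. a 1.2 kilo-concept taxonomy in a fairly complex domain built by a capable team
print(round(effort_pm(1.2, {"DCPLX": "High", "TECAP": "High"}), 2))  # 3.24
```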
1. Product Factors
Domain Complexity/Requirements Complexity/Information Sources: DCPLX/RCPLX/IS
The domain complexity driver combines three factors: the complexity
of the domain from which the taxonomy is built, the number and complexity of the requirements, and the
availability of information sources. As is the case with ontologies, the domain in which the taxonomy is built
can be wide or narrow and might require expert knowledge or just common-sense knowledge. The requirements
in taxonomy building can cover several aspects, such as design and technical aspects, which concern
determining the scope, purpose and content object of the taxonomy, as well as user-related
requirements. We also account for the impact that the availability of information sources has on the
taxonomy development process. The rating scales for domain complexity, requirements complexity and information
sources are given in the following tables:
Rating | Rating Scale
Very High | wide scope, expert knowledge, high connectivity
High | moderate to wide scope, common-sense or expert knowledge, high connectivity
Nominal | moderate to wide scope, common-sense or expert knowledge, moderate connectivity
Low | narrow to moderate scope, common-sense or expert knowledge, low connectivity
Very Low | narrow scope, common-sense knowledge, low connectivity
Table: Domain complexity (DCPLX)
Rating | Rating Scale
Very High | very high number of requirements with a high degree of conflict, high number of usability requirements
High | high number of usability requirements, few conflicting requirements
Nominal | moderate number of requirements with few conflicts, few usability requirements
Low | small number of non-conflicting requirements
Very Low | few simple requirements
Table: Requirements complexity (RCPLX)
Rating | Rating Scale
Very High | high number of information sources and structured data, high use of a generic taxonomy
High | high number of information sources, some modifications to a generic taxonomy
Nominal | good quality and number of information sources
Low | some information sources available
Very Low | no information sources available
Table: Information Sources (IS)
Classification Complexity: CCPLX
The Classification Complexity (CCPLX) cost driver measures the effort associated with establishing the hierarchical relationships between the concepts. Classification complexity is kept separate from the Concept Derivation Complexity (CDCPLX) cost driver so that our model can also cover simpler structures that have no hierarchical relationships, such as flat lists; in such a case this cost driver is simply discarded. A small illustration of the multi-inheritance aspect follows the table below.
Rating | Rating Scale
Very High | Difficulty in establishing hierarchical relationships for almost every concept, high number of multi-inheritance relationships
High | High difficulty in establishing hierarchical relationships, some multi-inheritance relationships
Nominal | Moderate difficulty in establishing hierarchical relationships
Low | Some cases of difficulty in establishing hierarchical relationships
Very Low | Nearly no difficulty in establishing hierarchical relationships
Table: Classification complexity (CCPLX)
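To make the multi-inheritance aspect concrete, here is a small illustrative sketch with made-up concept names: the taxonomy is stored as a mapping from each concept to its parents, and any concept with more than one parent is a multi-inheritance case of the kind that raises CCPLX.

```python
# Hypothetical sketch: a taxonomy as a concept -> parents mapping.
# Concepts with more than one parent are multi-inheritance cases, one of the
# properties the CCPLX driver rates.
from collections import defaultdict

parents: dict[str, set[str]] = defaultdict(set)

def add_broader(concept: str, broader: str) -> None:
    """Record that `concept` sits below `broader` in the hierarchy."""
    parents[concept].add(broader)

add_broader("Laptop", "Computer")
add_broader("Laptop", "Portable Device")  # second parent -> multi-inheritance
add_broader("Computer", "Electronics")

multi_inherited = [c for c, p in parents.items() if len(p) > 1]
print(multi_inherited)  # ['Laptop']
```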
Concept Derivation Complexity: CDCPLX
As we saw in the previous section, one of the biggest challenges in taxonomy development is deriving the concepts of the taxonomy. The concept derivation complexity cost driver CDCPLX accounts for the impact that problems such as synonyms and ambiguity have on the process of deriving the concepts. In addition, it factors in any machine assistance available at this stage of taxonomy development. A small sketch of synonym handling follows the table below.
Rating | Rating Scale
Very High | Mostly manual work, very high number of synonym and ambiguity cases
High | High amount of manual work, some automatic processing, high number of synonym and ambiguity cases
Nominal | Manual work combined with some automatic processing, some synonym and ambiguity cases
Low | Some manual work, high use of automatic processing, rare synonym and ambiguity cases
Very Low | A few manual steps (mostly reviewing), high use of automatic processing, nearly no synonym and ambiguity cases
Table: Concept derivation complexity (CDCPLX)
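As an illustration of the synonym problem and of simple machine assistance during concept derivation, the sketch below merges candidate terms into concepts using a small synonym table; the terms and the table are invented for the example, and real projects would typically rely on thesauri or NLP tooling.

```python
# Hypothetical sketch: merging synonymous candidate terms during concept
# derivation.  The candidate terms and the synonym table are made up.
candidate_terms = ["car", "automobile", "bicycle", "bike", "truck"]

# assumed synonym table: variant term -> preferred label
synonyms = {"automobile": "car", "bike": "bicycle"}

concepts = sorted({synonyms.get(term, term) for term in candidate_terms})
print(concepts)  # ['bicycle', 'car', 'truck'] - one concept per synonym group
```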
Documentation Needs: DOCU
As in ontology development and other complex engineering processes, additional costs may arise as a consequence of documentation requirements during the life cycle (LC) of the development process.
Rating | Rating Scale
Very High | comprehensive documentation for every stage in the LC process
High | comprehensive documentation only for some stages in the LC process
Nominal | right-sized documentation for every stage in the LC process
Low | some stages omitted from documentation needs
Very Low | many stages omitted from documentation needs
Table: Documentation needs (DOCU)
Classification of Data: CDATA
Classification of data in a taxonomy involves mapping the content to the concepts of the taxonomy. This means that the content has to be mapped in some way, using a language, a schema or a database. The effort associated with this cost driver corresponds to the degree of automation that is possible; a toy illustration follows the table below.
Rating | Rating Scale
Very High | unstructured data in natural language, free form, manual mapping required, no automation
High | semi-structured data in natural language (e.g. similar web pages), some automation available
Nominal | semi-structured data, automation available but manual reviewing needed
Low | structured data, high degree of automation, minimal manual intervention
Very Low | structured data, process completely automated
Table: Classification of Data (CDATA)
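As a toy illustration of what automated classification of data can look like, the sketch below maps text items to taxonomy concepts by naive keyword matching; the concepts, keywords and sample text are invented, and the CDATA rating effectively measures how far such automation can replace manual mapping.

```python
# Hypothetical sketch: naive automated mapping of content to taxonomy concepts
# via keyword matching.  Concepts, keywords and the sample text are made up.
concept_keywords = {
    "Laptop": {"laptop", "notebook"},
    "Smartphone": {"smartphone", "phone"},
}

def classify(text: str) -> list[str]:
    """Return the taxonomy concepts whose keywords occur in the text."""
    words = set(text.lower().split())
    return [concept for concept, kws in concept_keywords.items() if words & kws]

print(classify("new notebook and phone deals"))  # ['Laptop', 'Smartphone']
```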
Taxonomy Evaluation: TE
Taxonomy evaluation can involve extensive reviewing with domain experts and users, as well as testing. To determine the effort associated with evaluation, the TE cost driver scales the scope and intensity of the evaluation process.
Rating | Rating Scale
Very High | Extensive reviews and testing
High | Considerable number of reviews and tests
Nominal | Moderate number of reviews and tests
Low | Small number of reviews and tests
Very Low | Almost no reviews and tests
Table: Taxonomy Evaluation (TE)
Taxonomy Maintenance: TM
The Taxonomy Maintenance cost driver TM captures the impact on effort of the modifications made during reviews and of the number of new concepts and data added.
Rating | Rating Scale
Very High | Substantial reorganization of the hierarchy and reclassification of data
High | Significant reorganization of the hierarchy and reclassification of data
Nominal | Some reorganization of the hierarchy and reclassification of data
Low | Minor reorganization of the hierarchy and reclassification of data
Very Low | No reorganization of the hierarchy or reclassification of data
Table: Taxonomy Maintenance (TM)
2. Personnel Factors
Taxonomy / Domain Expert Capability: TECAP/DCAP
Similar to the ontology development process, the taxonomy development process also requires the collaboration of personnel with a background in building taxonomies and of domain experts and stakeholders who possess the knowledge about the domain. Like ONTOCOM, the cost drivers account for the perceived ability of the actors involved in the process as well as their performance as a team.
Rating | TECAP/DCAP
Very High | 90%
High | 75%
Nominal | 55%
Low | 35%
Very Low | 15%
Table: Capability Ratings of the Engineering Team (TECAP/DCAP)
Taxonomy / Domain Expert Experience: TXEXP / DEEXP
These factors measure the experience of the two teams as a whole in their respective areas, taxonomy development and domain conceptualization. As in ONTOCOM, the ratings are expressed in years of experience in the respective area.
Rating | TXEXP/DEEXP
Very High | 7 years
High | 5 years
Nominal | 3 years
Low | 1 year
Very Low | 6 months
Table: Experience Ratings for the Team (TXEXP/DEEXP)
Personnel Continuity: PCON
The personnel continuity cost driver tries to factor in changes in personnel during the process life cycle of a time- and resource-constrained project. Since we believe that the size of the engineering team in ontology development and in taxonomy development is likely the same, we use the same values proposed in the original ONTOCOM model.
Rating | PCON
Very High | 10%
High | 15%
Nominal | 25%
Low | 35%
Very Low | 50%
Table: Personnel Continuity (PCON)
Language and Tool Experience: TEXP / LEXP
Like ONTOCOM, this cost driver evaluates the experience of the taxonomy developers with the tools at their disposal and/or languages, e.g. markup languages such as XML or ontology languages such as OWL. This also includes knowledge of any representation languages used during the processes of identifying and classifying concepts.
Rating | TEXP/LEXP
Very High | 3 years
High | 2 years
Nominal | 1 year
Low | 6 months
Very Low | 2 months
Table: Tool and Language Experience (TEXP/LEXP)
3. Project Factors
Support Tools for Taxonomy Development: TOOL
Support tools for taxonomy development during all stages of the development life cycle surely have a great impact on effort. While we expect that most taxonomy development teams have a variety of tools at their disposal, we would like to account for their effectiveness in all stages of development. A set of tools can include tools for design, implementation and mapping of data to the taxonomy. It can also include tools which help in concept identification.
Rating | Rating Scale
Very High | High quality and availability of tools, minimal manual intervention
High | Little manual processing required
Nominal | Basic manual intervention needed
Low | Some tool support
Very Low | Minimal tool support, mostly manual processing
Table: Tool Support (TOOL)
Multi-site Development: SITE
As in ontology development, taxonomy development might require extensive communication between the various parties. The SITE cost driver assesses the available communication support tools.
Rating | Rating Scale
Very High | frequent face-to-face meetings
High | teleconference, occasional meetings
Nominal | email
Low | phone, fax
Very Low | mail
Table: Multi-site Development (SITE)
Required Development Schedule: SCED
The SCED cost driver accounts for the impact of schedule constraints on the taxonomy development process; the rating expresses the required schedule as a percentage of the nominal schedule, so a rating of 75%, for example, means the taxonomy has to be delivered in three quarters of the nominal time. Processes with tight schedule constraints (under 100%) tend to produce more effort in the later stages of the taxonomy development life cycle, such as refinement and evolution. Stretched-out schedules usually produce more effort early on, e.g. in the concept identification stage.
Rating | SCED
Very High | 160%
High | 130%
Nominal | 100%
Low | 85%
Very Low | 75%
Table: Required Development Schedule (SCED)
4. Reuse Factors
Taxonomy Understanding: TU
This factor takes into account the level of ease with which the taxonomy
to be reused can be understood. The rating scales for taxonomy understanding are given in the following table:
Rating | Rating Scale
Very High | Representation language tool, 90% comments in natural language
High | Representation language tool, 60% comments in natural language
Nominal | Representation language tool, 30% comments in natural language
Low | Representation language know-how, few comments in natural language
Very Low | Representation language know-how, no comments in natural language
Table: Taxonomy Understanding (TU)
Taxonomy Developer and Domain Expert Unfamiliarity with Reused Taxonomy: UNFM
This factor increments the impact of the TU cost driver and expresses the level of unfamiliarity of the engineering team with respect to the taxonomy to be reused (Did the team build it themselves? Does the team work with it every day? Does the team have only little experience with this taxonomy?). The rating scales for UNFM are given in the following table; a sketch of how the increment could be applied follows the table.
Rating | Rating Scale
1.0 | Completely unfamiliar
0.8 | Little experience
0.6 | Occasional usage
0.4 | Every day usage
0.2 | Team built
0.0 | Self built
Table: TU Increment for Unfamiliarity (UNFM)
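The model text does not prescribe how TU and UNFM are combined. One plausible reading, borrowed from COCOMO-style reuse models, is that UNFM scales the understanding penalty implied by the TU rating; the sketch below illustrates that reading with assumed penalty values.

```python
# Illustrative sketch only: assumes UNFM scales an understanding penalty
# derived from the TU rating.  The penalty values below are invented.
TU_PENALTY = {"Very High": 0.0, "High": 0.1, "Nominal": 0.2, "Low": 0.3, "Very Low": 0.4}

def understanding_increment(tu_rating: str, unfm: float) -> float:
    """Hypothetical extra effort share for understanding a reused taxonomy."""
    return TU_PENALTY[tu_rating] * unfm

# a poorly documented taxonomy (TU = Very Low) that the team only uses occasionally (UNFM = 0.6)
print(round(understanding_increment("Very Low", 0.6), 2))  # 0.24 -> about 24% extra effort
```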
Taxonomy Modification: TMD
This measure reflects the complexity of the modifications required by the reuse process after the evaluation phase has been completed. The rating scales for TMD are given in the following table:
Rating | Rating Scale
Very High | Excessive modifications
High | Considerable modifications
Nominal | Some moderate modifications
Low | Some simple modifications
Very Low | Few simple modifications
Table: Taxonomy Modification (TMD)
Taxonomy Translation: TT
This factor reflects the impact of translating taxonomies between knowledge representation languages. The rating scales for the TT driver are given in the following table:
Rating | Rating Scale
Very High | Entirely manual effort
High | Considerable manual effort
Nominal | Some manual effort
Low | Low manual effort
Very Low | Direct
Table: Taxonomy Translation (TT)
5. Usage Factors
Deployment Type: DEPLOY
This factor reflects the effort to deploy a new taxonomy version to applications which use it. The rating scales for the deployment type driver are given in the following table:
Rating | Rating Scale
Very High | Excessive changes in the taxonomy; applications are interconnected.
High | Excessive changes in the taxonomy; applications are independent.
Nominal | Moderate changes in the taxonomy; interconnection of applications is irrelevant.
Low | Few changes in the taxonomy; applications are interconnected.
Very Low | Few changes in the taxonomy; applications are independent.
Table: Deployment Type (DEPLOY)
Complexity of the Software Adaptation: SADAP
This measure expresses the adaptation that needs to be carried out on the existing corporate software after a taxonomy update and deployment. The rating scales for the SADAP driver are given in the following table:
Rating | Rating Scale
Very High | Existing software must be replaced.
High | Existing software must be extensively adapted to ensure it conforms to the requirements of the current taxonomy version.
Nominal | Existing software must be slightly adapted to ensure it conforms to the requirements of the current taxonomy version.
Low | Existing software must be configured towards the current taxonomy version (e.g. changed path or URI).
Very Low | No software adaptation is needed.
Table: Complexity of the Software Adaptation (SADAP)
Visibility to the User: VISIBIL
This measure expresses the visibility of the taxonomy to the user and the costs connected with user training. The rating scales for the VISIBIL driver are given in the following table:
Rating | Rating Scale
Very High | Regular training with well-prepared training material is necessary.
High | More intensive training with some training material is necessary.
Nominal | Brief training is sufficient.
Low | Some basic documents as assistance are sufficient.
Very Low | No training is needed.
Table: Visibility of the Taxonomy in Use for the User (VISIBIL)
Complexity of the Instantiation: UDATA
This driver addresses the instance generation during the usage of the taxonomy by employees in the target applications as part of their work. The rating scales for the UDATA driver are given in the following table:
Rating | Rating Scale
Very High | Manual instantiation as additional work; users work directly with the taxonomy and need knowledge about the taxonomy structure.
High | Manual instantiation as additional work; users need basic knowledge about taxonomies and semantics (e.g. use of semantic wikis).
Nominal | Manual instantiation as part of common workflows.
Low | Semi-automatic instantiation; users do not need any knowledge about taxonomies.
Very Low | Automatic instantiation.
Table: Complexity of the Instantiation (UDATA)
Complexity of the User Feedback: UFEED
This measure describes the effort that taxonomy users need to invest to return feedback about a taxonomy. The rating scales for the UFEED driver are given in the following table:
Rating | Rating Scale
Very High | Implicit feedback and explicit feedback (oral & written) via questionnaires, regular reports, and face-to-face dialog with the taxonomy developer.
High | Implicit feedback and explicit feedback via questionnaires & regular reports.
Nominal | Implicit feedback and explicit feedback via questionnaires.
Low | Implicit feedback and explicit feedback via brief comments.
Very Low | Implicit feedback.
Table: Complexity of the User Feedback (UFEED)
Complexity of the Software Engineer Feedback: SEFEED
This measure describes the effort that taxonomy software developers need to invest to return feedback about a taxonomy. The rating scales for the SEFEED driver are given in the following table:
Rating | Rating Scale
Very High | Implicit feedback and explicit feedback (oral & written) via questionnaires, regular reports, and face-to-face dialog with the taxonomy developer.
High | Implicit feedback and explicit feedback via questionnaires & regular reports.
Nominal | Implicit feedback and explicit feedback via questionnaires.
Low | Implicit feedback and explicit feedback via brief comments.
Very Low | Implicit feedback.
Table: Complexity of the Software Engineer Feedback (SEFEED)
Complexity of the Reporting: REP
This driver describes the effort of generating reports, which depends on the type of user and software developer feedback. The rating scales for the REP driver are given in the following table:
Rating | Rating Scale
High | Implicit feedback as well as oral and written explicit feedback has to be reported.
Low | Implicit feedback and written explicit feedback has to be reported.
Very Low | Implicit feedback has to be reported.
Table: Complexity of the Reporting (REP)