Tuesday, 13 December 2011


A term that is typically used to encapsulate the constructs of a data model, a database management system (DBMS), and a database.

  • Data Definition. Defining new data structures for a database, removing data structures from the database, and modifying the structure of existing data.
  • Data Maintenance. Inserting new data into existing data structures, updating data in existing data structures, and deleting data from existing data structures.
  • Data Retrieval. Querying existing data by end-users and extracting data for use by application programs.
  • Data Control. Creating and monitoring users of the database, restricting access to data in the database and monitoring the performance of databases.
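
The first three functions above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table and column names are hypothetical, and data control is left out because SQLite does not implement user accounts or GRANT/REVOKE.

```python
import sqlite3

# Hypothetical in-memory database; all names and values are illustrative.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Data definition: create a structure, then modify it.
cur.execute("CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT NOT NULL)")
cur.execute("ALTER TABLE gene ADD COLUMN organism TEXT")

# Data maintenance: insert, update, and delete rows.
cur.executemany(
    "INSERT INTO gene (symbol, organism) VALUES (?, ?)",
    [("BRCA1", "human"), ("TP53", "human"), ("lacZ", "E. coli")],
)
cur.execute("UPDATE gene SET organism = 'Homo sapiens' WHERE organism = 'human'")
cur.execute("DELETE FROM gene WHERE symbol = 'lacZ'")
conn.commit()

# Data retrieval: query the remaining rows.
rows = cur.execute("SELECT symbol, organism FROM gene ORDER BY symbol").fetchall()
print(rows)
```

In a multi-user DBMS, data control would typically be expressed with statements such as GRANT and REVOKE, which this sketch does not cover.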

Technical Heterogeneity

Different file formats, access protocols, query languages etc. Often called syntactic heterogeneity from the point of view of data.
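
As a rough sketch of syntactic heterogeneity, the same two records can arrive as CSV from one source and as JSON from another. The field names and values below are made up for illustration; only the reader for each format differs.

```python
import csv
import io
import json

# The same hypothetical records in two file formats.
csv_text = "symbol,length_bp\ngeneA,1200\ngeneB,950\n"
json_text = '[{"symbol": "geneA", "length_bp": 1200}, {"symbol": "geneB", "length_bp": 950}]'

def read_csv(text):
    # CSV carries no types, so numeric fields are converted explicitly.
    reader = csv.DictReader(io.StringIO(text))
    return [{"symbol": r["symbol"], "length_bp": int(r["length_bp"])} for r in reader]

def read_json(text):
    # JSON already distinguishes strings from numbers.
    return json.loads(text)

print(read_csv(csv_text) == read_json(json_text))
```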

Data Model Heterogeneity

Different ways of representing and storing the same data. Table decompositions may vary, column names (data labels) may differ while having the same semantics, and data encoding schemes may vary (e.g. a measurement scale may be included explicitly in a field or implied elsewhere). Also referred to as schematic heterogeneity.
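
A small sketch of how such differences might be reconciled, assuming two hypothetical sources: source A implies the unit in its column name, while source B embeds the unit in the field value and uses different labels. Both are mapped to one common record shape.

```python
def from_source_a(rec):
    # Source A (hypothetical): the unit is implied by the column name temp_c.
    return {"sample_id": rec["sample_id"], "temp_celsius": rec["temp_c"]}

def from_source_b(rec):
    # Source B (hypothetical): the unit is written inside the field value.
    value, unit = rec["temperature"].split()
    temp = float(value)
    if unit == "F":
        temp = (temp - 32.0) * 5.0 / 9.0
    elif unit != "C":
        raise ValueError(f"unknown unit: {unit}")
    # Rounding hides floating-point noise from the unit conversion.
    return {"sample_id": rec["id"], "temp_celsius": round(temp, 1)}

a = from_source_a({"sample_id": "s1", "temp_c": 37.0})
b = from_source_b({"id": "s1", "temperature": "98.6 F"})
print(a == b)
```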

Semantic Heterogeneity

Data across constituent databases may be related but different. For example, a database system may need to integrate genomic and proteomic data. The two are related (a gene may have several protein products), but the data differ: nucleotide sequences versus amino acid sequences, or sequences of hydrophilic/hydrophobic amino acids versus positively/negatively charged amino acids. There may be many ways of looking at semantically similar but distinct datasets.
The system may also be required to present 'new' knowledge to the user. Relationships may be inferred between data according to rules specified in domain ontologies.
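
The idea of inferring relationships from rules can be illustrated with a toy example. This is not an ontology reasoner; the facts, predicate names, and the single rule are all hypothetical stand-ins for what a domain ontology might specify.

```python
# Facts as (subject, predicate, object) triples; all names are made up.
facts = {
    ("geneA", "encodes", "proteinX"),
    ("geneA", "encodes", "proteinY"),
    ("proteinX", "participates_in", "pathwayP"),
}

def infer(facts):
    # Assumed rule: if a gene encodes a protein and that protein
    # participates in a pathway, the gene is associated with the pathway.
    inferred = set()
    for gene, pred1, protein in facts:
        if pred1 != "encodes":
            continue
        for subject, pred2, pathway in facts:
            if pred2 == "participates_in" and subject == protein:
                inferred.add((gene, "associated_with", pathway))
    # Return only relationships that were not already stated.
    return inferred - facts

print(infer(facts))
```

The inferred triple links geneA to pathwayP even though neither source states that relationship directly.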
