Journal of University of Science and Technology of China ›› 2016, Vol. 46 ›› Issue (1): 56-65. DOI: 10.3969/j.issn.0253-2778.2016.01.008

• Original Paper

Tabular-oriented data model and its query issues

HUANG Dongmei, SUN Le, SHI Shaohua, SU Cheng, ZHAO Danfeng   

  1. School of Information, Shanghai Ocean University, Shanghai 201306, China; 2. East Sea Forecast Center of Oceanic Administration of China, Shanghai 200136, China
  • Received: 2015-08-27 Revised: 2015-09-29 Accepted: 2015-09-29 Online: 2015-09-29 Published: 2015-09-29

Abstract: With the rapid development of information technologies, the storage and representation of data from various sources have become distributed and heterogeneous. These sources include not only traditional structured data, such as relational and object-oriented databases, but also unstructured data such as Excel and CSV documents. Such data are high-volume, continuously updated, and of low usability, and thus fall within the scope of Big Data. However, organizing and managing Excel and similar data with unstructured and semi-structured methods leads to a weakly controllable, weakly usable data structure with poor access efficiency. To solve this problem, this paper focuses on Excel data sources, proposes a new tabular-oriented relational data model, and discusses Tabular querying and optimization issues. First, a formal definition of Tabular form data is given. Second, a PartiPath tree is designed to achieve structural transformation through tabular division, together with its relation schema, and the data model is then presented. After that, four basic queries and their optimization by an improved DICE with user interest similarity are described. Finally, an experiment is conducted and a conclusion is drawn.

Key words: Tabular repository, query, data model, PartiPath tree, relation model
