
Enhancing Query Performance and Privacy on Relational Cloud Database

CHAPTER ONE

Aim and Objectives of the research

This dissertation aims to enhance privacy and query performance on relational cloud databases.

The objectives of this dissertation are to:

  1. Design an efficient technique that will enhance the privacy and performance of SQL queries on cloud databases.
  2. Implement a model that minimizes the running time of the client/cloud database.
  3. Analyze query performance using the conventional querying method versus the proposed system with the Transaction Processing Performance Council (TPC-H) benchmark.

CHAPTER TWO 

REVIEW OF LITERATURE

 Introduction

This chapter addresses the framework of database outsourcing (database in the cloud), data security, query efficiency and benchmarking, which form the background of the research. It goes further to review literature relating to the aforementioned.

Concept of Cloud Database

A database is an organized collection of data which enables easy access, update and management. It is a structured repository for information with links within the information that help make the data searchable. A Database Management System (DBMS) is a software package with programs that control the creation, maintenance and use of a database. It allows organizations to conveniently develop databases for various applications (Elmasri, 2008).

A cloud database, also referred to as Database-as-a-Service (DBaaS or DaaS), is a database that is accessible to clients from the cloud and delivered to users on demand via the Internet from a cloud database provider’s servers. It typically runs on a shared cloud computing platform such as Windows Azure, Amazon EC2, GoGrid or Google Cloud SQL, all of which comprise hardware and software (Mateljan et al., 2010). The cloud platform is structured to host multiple outsourced databases, either by providing the database as a specialized service or by providing virtual machines on which any database can be deployed.

Database services on the cloud are provided with automated features which enable optimized scaling, high availability, multi-tenancy and effective resource allocation. The services have the advantages of increased accessibility, automatic failover and fast automated recovery from failures, automated on-the-go scaling, load balancing, minimal investment in and maintenance of in-house hardware, and better performance than a traditional database. Potential drawbacks include security and privacy issues, and loss of or lack of access to data in the event of a disaster, an internet outage or the bankruptcy of the cloud database service provider (Phan, 2013).

The cloud database is constructed from a number of sites, also called nodes, which are interlinked by a communication network. Every node is a database class, and each class has its own database, terminals, central processor and local database management system (Donkena and Gannamani, 2012).

Types of Cloud Database

Cloud databases are categorized according to their data model. Some are SQL based, while others use the NoSQL data model.

SQL Databases

Structured Query Language (SQL) was developed at IBM in the 1970s, after Edgar Codd introduced the relational data model. It is a standard query language for relational database management systems (RDBMS) (Codd, 1970).

In the relational model, data are organized into relations, each represented by a table comprising rows and columns. Each relation also has a key, which is used to map its data to other relations. SQL is used to issue statements to the database for creating, reading, updating and deleting data. SQL also supports indexing mechanisms that speed up read operations, views that can join data from multiple tables, and other features for database optimization and maintenance (Elmasri, 2008).
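The features described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and values are hypothetical and chosen only for illustration; they are not taken from the dissertation's schema.

```python
import sqlite3


def demo_relational_basics():
    """Create two related tables, run CRUD statements, and read a view."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # Two relations; orders.customer_id is the key that maps to customers.id.
    cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id), "
        "total REAL)"
    )

    # Create
    cur.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    cur.executemany(
        "INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",
        [(10, 1, 25.0), (11, 1, 40.0)],
    )

    # Update and delete
    cur.execute("UPDATE orders SET total = 30.0 WHERE id = 10")
    cur.execute("DELETE FROM orders WHERE id = 11")

    # An index to speed up reads that filter on customer_id
    cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

    # A view that joins data from both tables
    cur.execute(
        "CREATE VIEW customer_orders AS "
        "SELECT c.name AS name, o.total AS total "
        "FROM customers c JOIN orders o ON o.customer_id = c.id"
    )

    # Read
    rows = cur.execute("SELECT name, total FROM customer_orders").fetchall()
    conn.close()
    return rows
```

Calling `demo_relational_basics()` returns the rows visible through the view after the insert, update and delete statements have run.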

One important attribute of SQL databases is that they follow the Atomicity, Consistency, Isolation and Durability (ACID) rules (Codd, 1970).

  1. Atomicity: A transaction is a logical unit of work which must either be completed with all of its data modifications or have none of them performed; that is, transactions are all-or-nothing.
  2. Consistency: Before and after a transaction, all data must be left in a stable state. Roll-back occurs in the event of failure.
  3. Isolation: Modifications of data performed by a transaction must be independent of other transactions (without interference). Otherwise, the outcome of a transaction may be erroneous.
  4. Durability: When a transaction is completed, the effects of its modifications must be permanent in the system (committed). Committed transactions cannot be lost. Changes are tracked in logs to ensure reliability of data at any point in time.
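As a small illustration of atomicity and roll-back, the sketch below uses Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the balances are hypothetical. The credit is applied before the debit on purpose, so that a failed debit leaves a partial change that the roll-back must undo.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo_atomicity():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    with conn:
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
        )
    ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
    failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return ok, failed, balances
```

After the failed transfer, the credit already applied to the destination account is rolled back, so the balances reflect only the committed transfer.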

CHAPTER THREE 

ARCHITECTURE AND DESIGN OF SECURESQL

 Introduction

This chapter starts by introducing the framework of the proposed design, known as SecureSQL. The proposed system model which comprises the design is explained. The architecture of the system, which gives details of the algorithms and the major components involved, is also presented. The tools and platform used in the implementation of the system are also highlighted.

System Model

At the most rudimentary level, a cloud storage system just needs one data server connected to the Internet. A subscriber copies files to the server over the Internet. When a client wants to retrieve the data, the client accesses the data server with a web-based interface, and the server then either sends the files back to the client or allows the client to access and manipulate the data itself.

Figure 3.1 gives an overview of the proposed system model. The model employs the client-server architecture. The client is trusted and the server is not; specifically, the server is assumed to be honest but curious. Most of the computations are therefore performed on the client side. Data is encrypted on the client side using selective attribute-based encryption before deployment to the cloud server. The client layer consists of the application, which is accessed through a web browser. It connects through the Internet to the Cloud Service Provider (CSP) and then to the data center, both of which make up the server side. When the owner is uploading or updating data, the application interface allows the owner to select the entity or relation to encrypt.
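The idea of encrypting only selected attributes on the client before upload can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SecureSQL implementation: the function names, the row layout and the choice of sensitive columns are hypothetical. Python's standard library has no AES, so `placeholder_encrypt` is a stand-in built from an HMAC-SHA256 keystream, used only to keep the example self-contained; it is not AES-128 and is not meant for production use.

```python
import base64
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from HMAC-SHA256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def placeholder_encrypt(key: bytes, plaintext: str) -> str:
    """Stand-in cipher (NOT AES): XOR the text with a keyed keystream."""
    nonce = os.urandom(16)
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    cipher = bytes(a ^ b for a, b in zip(data, stream))
    return base64.b64encode(nonce + cipher).decode("ascii")


def placeholder_decrypt(key: bytes, token: str) -> str:
    """Inverse of placeholder_encrypt."""
    raw = base64.b64decode(token)
    nonce, cipher = raw[:16], raw[16:]
    stream = _keystream(key, nonce, len(cipher))
    return bytes(a ^ b for a, b in zip(cipher, stream)).decode("utf-8")


def encrypt_selected(row: dict, sensitive: set, key: bytes) -> dict:
    """Encrypt only the columns named in `sensitive`; leave the rest as-is."""
    return {
        col: placeholder_encrypt(key, str(val)) if col in sensitive else val
        for col, val in row.items()
    }
```

For example, `encrypt_selected({"id": 7, "name": "Ada", "salary": 5000}, {"salary"}, key)` leaves `id` and `name` in plaintext and replaces `salary` with an opaque token, so only the chosen attribute incurs encryption and decryption cost.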

CHAPTER FOUR

 IMPLEMENTATION, RESULTS AND DISCUSSION

 Introduction

This chapter begins by generating the test data used for the database. It goes on to the software and system development, explaining the implementation of the client-side application and its connection to the cloud database. Furthermore, the queries are tested by analyzing and evaluating their performance using statistical tools. Finally, the experimental results are critically discussed.

CHAPTER FIVE

SUMMARY, CONCLUSION AND RECOMMENDATION

  Summary

The efficiency of data retrieval in large outsourced databases, especially as it relates to the privacy of such data, has been an open challenge, primarily because traditional query languages cannot work with encrypted data. The reviewed literature showed that most of the architectural models and encryption techniques proposed to ensure efficient query performance as well as privacy of outsourced data have limitations which range from high computational overhead to restricted query computation.

Based on these limitations, this research proposed an architectural framework which focused on enhancing server-side data retrieval and query efficiency through the use of a hash map and the AES 128-bit encryption algorithm. The implementation of this framework, known as SecureSQL, was built on the client side without any alteration to the DBMS structure. Server-side data retrieval was made possible through a simple graphical user interface that encrypts data before uploading it to a cloud server, performs queries and decrypts encrypted values on the fly. A query processing engine was also developed to process encrypted data. Selective attribute-based encryption, which was incorporated into the encryption algorithm, ensures that computational overhead is minimized.

Observation of the results obtained (refer to section 4.9) shows that there is a significant difference between the execution time using the traditional method of decrypting entire records before performing queries and the execution time using the proposed method. Also, as the number of records increases, the proposed method maintains a roughly constant execution time, thereby supporting the O(1) time complexity assertion for the use of the hash map data structure (refer to section 2.13 and section 2.6). A tabular and graphical comparison (refer to section 4.10) with some related works shows that the proposed method scores more points.
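The O(1) behaviour attributed to the hash map can be illustrated with a short sketch. It assumes, hypothetically, that the trusted client keeps a map from a keyed hash of each plaintext value to the identifiers of the rows holding that value, so an equality query needs one hash computation and one map access instead of decrypting every record. The function names, the use of HMAC-SHA256 for the tags and the sample rows are assumptions made for illustration; the dissertation's actual structure may differ.

```python
import hashlib
import hmac
from collections import defaultdict


def blind_tag(key: bytes, value: str) -> str:
    """Keyed hash of a plaintext value; equal values produce equal tags."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, rows: dict, column: str) -> dict:
    """Map each value's tag to the ids of the rows that contain the value."""
    index = defaultdict(list)
    for row_id, row in rows.items():
        index[blind_tag(key, row[column])].append(row_id)
    return dict(index)


def lookup(key: bytes, index: dict, value: str) -> list:
    """Equality lookup in average-case O(1): one hash, one dict access."""
    return index.get(blind_tag(key, value), [])


def scan(rows: dict, column: str, value: str) -> list:
    """Baseline for comparison: inspect every record, O(n) in the row count."""
    return [row_id for row_id, row in rows.items() if row[column] == value]
```

With rows `{1: {"city": "Kano"}, 2: {"city": "Lagos"}, 3: {"city": "Kano"}}`, both `lookup` and `scan` find the same row identifiers for "Kano", but only `scan` has to touch every record, which is why its cost grows with the number of records while the map lookup stays roughly constant.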

The SecureSQL model delivers efficient query execution and is able to execute 20 of the 22 TPC-H benchmark queries while ensuring privacy. This shows that it is not restricted to simple query constructs but can also handle complex queries involving nested subqueries and joins.

In a nutshell, this research has dealt with relational databases that are hosted on the cloud; how they are secured, the effects of the security on data retrieval and further proposed methods to reduce these effects to a minimal level.

Conclusion

The recent explosion of digital content ownership has both increased the popularity of data outsourcing and fueled concerns over data security. The need to facilitate storage and processing of large amounts and types of sensitive data is of particular importance in modern enterprise, especially where the server is not trusted and client resources are limited. This research worked at eliminating the tradeoff between security and efficient query processing through its SecureSQL design, as is evidenced in the results. The first objective, the design of an efficient technique to enhance the privacy and performance of SQL queries on cloud databases, has been achieved through the combined use of selective attribute-based encryption, the hash map structure and the AES 128-bit encryption algorithm. The model was implemented using all the implementation tools in section 3.7; thus, the second objective is also achieved.

Finally, the observations in section 4.9 show a wide margin between the query performance of the conventional method and that of the proposed method. The third objective is therefore also achieved.

As long as the performance degradation due to encryption is under control, users would choose to use outsourced databases to store private information rather than traditional databases.

Recommendation for future work

Some aspects related to this research could not be investigated because they were out of scope, while others were investigated but not implemented due to time constraints. These aspects are therefore recommended as future work, as outlined below:

  1. Explore further techniques of data retrieval security and efficiency that deal with multimedia databases (images, music and videos), because of the recent explosion of mobile technology and rapid expansion of online media.
  2. Design an intelligent system that determines which columns are sensitive and automatically encrypts them.
  3. Make the encryption/decryption keys dynamic so that they can be set from the application user interface.
  4. With advances in web computing and mobile technology, XML is widely being used as a standard to exchange data over inter-networked heterogeneous systems and web-based applications. The hash map method could be extended to develop query execution techniques over encrypted XML.
  5. Investigate and analyze techniques similar to those implemented in this work for non-relational databases.
  6. Incorporate a data partitioning function which defines the conditions for data sensitivity, to determine the sensitivity of data for certain attributes before applying the encryption function.

 

REFERENCES

  • Aderounmu G.A. (2012). “Cloud Computing: A Definition, Challenges and Opportunities”. [PowerPoint slides]. Available from: http://www.cpn.gov.ng/?page=show&cat=6&subc=20
  • Aggarwal G., Bawa M., Ganesan P., Garcia-Molina H, Kenthapadi K., Motwani R. and Xu Y. (2005). Two can keep a Secret: A Distributed Architecture for Secure Database Services. Retrieved August 20, 2014 from: https://database.cs.wisc.edu/cidr/cidr2005/papers/P16.pdf
  • Agrawal R., Evfimievski A. and Srikant R. (2003). Information sharing across private databases. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SIGMOD ’03, pages 86–97, New York, NY, USA, 2003.
  • Agrawal R., Kiernan J., Srikant R. and Xu Y. (2004). Order Preserving Encryption for Numeric Data. In Proceedings of the 2004 ACM SIGMOD international conference on Management of data, SIGMOD ’04, pages 563–574, New York, NY, USA, 2004.
  • Al Tamimi A. (2003). “Performance Analysis of Data Encryption Algorithms”. Retrieved April 23, 2015 from: http://www.cse.wustl.edu/~jain/cse567-06/encryption_perf.htm
  • Alhanjouri M. and Al Derawi A. (2012). “A New Method of Query over Encrypted Data in Database using Hash Map”, International Journal of Computer Applications (IJCA) 41(4): 46-51, March 2012. Published by Foundation of Computer Science, NY, USA. Retrieved from: http://research.ijcaonline.org/volume41/number4/pxc3877580.pdf
  • Badger L., Patt-Corner R., Grance T. and Voas J. (2011). “DRAFT Cloud Computing Synopsis and Recommendations. Recommendations of the National Institute of Standards and Technology” (NIST). May, 2011. Retrieved from: http://csrc.nist.gov/publications/drafts/800-146/Draft-NIST-SP800-146.pdf
  • Bell R. (2009). A Beginners guide to Big O notation. [Web log post]. Accessed August 2, 2015 from: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
  • Bellovin S.M. (2009). Modes of Operation [PowerPoint slides]. Retrieved September 12, 2014 from: https://www.cs.columbia.edu/~smb/classes/s09/l05.pdf