406706 Physical Database Design
This assignment aims to develop your skills optimizing query.
It involves identifying the suitable methods for structuring the data and writing the query.
First, you will need to create a query on some unstructured data. Unstructured data can be thought of in this context as a table which is NOT clustered or hashed. In addition the table should NOT contain any indexes at all. The query can be an equi-join or a sub-query.
The time a query with unstructured data takes to execute (execution time) will be used as the baseline.
Plan how to set up indexes, clusters, and hashed clusters for the same query. Use different numbers of rows and graph these for each type of structure. You need to create enough rows to indicate a trend in the performance (Number of rows should be enough).
You should then run the query and note the execution times for EACH of these structures.
Record the results. When doing this you could alter the way you set up some of these structures by, for instance, altering the hashkey or what you cluster together and note any differences in execution time.
You should by now have determined the optimal data structure in terms of performance. To be confident that your data will make things difficult for Oracle, ensure that:
- the data is as unordered as possible
- you retrieve a number of rows
You now have some choices about how you might improve the SQL statement. Take the best optimisation and try to improve it by inserting hints and swapping the order of the tables in the 'from' clause of your query.
The best of these 'improved' statements then needs to be tried out using Oracle's Parallel Query Option.
Please note:
You need to write explanation (Brief explanation) on each event. Just handing in a list of a TRACE without saying what it all means will make you lose your mark. Your comment on each event is as important as the experiment itself.
In your report as the conclusion part, you need to write (max length: 300 words) that discusses the effectiveness of Oracle’s query optimizer.
Assignment 2 Marking Schedule
Points to note when designing the tests
- The baseline query must take a sufficiently long time to allow improvements to occur later when tuned.
- The query will need to involve a join (s) of tables so that all data structures can be used.
- You must rebuild the tables and reload the data between tests when altering the data structures.
- Tests can vary in their total time and it is safer to run each test 2-3 times to ensure the execution time is consistent.
- The data must stay the same for all the tests otherwise it will not be possible to make comparisons.
- You may need to make design changes to your tables before all the tests will work.
- If you leave the assignment to the last minute you will find the server responses slow because of all the other students also completing assignments at this time
Name: | |||
Item |
Max Mark |
Your Mark |
Comments |
Preliminary work: | |||
SQL statement complexity |
10 | ||
Structures: | |||
nil setup |
4 | ||
Design of experiment |
2 | ||
observations/conclusions |
2 | ||
indexed setup |
10 | ||
Design of experiment |
6 | ||
observations/conclusions |
6 | ||
clustered setup |
12 | ||
Design of experiment |
6 | ||
observations/conclusions |
6 | ||
hash cluster setup |
12 | ||
Design of experiment |
6 | ||
observations/conclusions |
6 | ||
Statement Tuning: | |||
Hints |
9 | ||
Design of experiment |
5 | ||
observations/conclusions |
5 | ||
Table Order |
9 | ||
Design of experiment |
5 | ||
observations/conclusions |
5 | ||
Parallel Query Option |
9 | ||
Setup/ Design of experiment |
5 | ||
observations/conclusions |
5 | ||
Overall Reflection on Oracle’s Query Optimiser |
12 | ||
General Documentation |
8 | ||
Total |
100 |