Time has been going so fast!!! There is already just a week
left until the courses end. I cannot say whether that is a good thing or
not; I want to go home so badly, yet I do not want to study for my finals. :(
However, it is sorting and efficiency that I will write
about for this week's entry. Starting in week 9, we began looking into
sorting. We started with binary search trees:
"A binary search tree (BST), sometimes also called an ordered
or sorted binary tree, is a node-based binary tree data structure
which has the following properties:
- The left subtree of a node contains only nodes with keys
less than the node's key.
- The right subtree of a node contains only nodes with keys
greater than the node's key.
- The left and right subtree each must also be a binary search tree.
- There must be no duplicate nodes. "
-- (http://en.wikipedia.org/wiki/Binary_search_tree)
I think this is a good, reasonable start for learning about
sorting with recursion. In CSC165, I always had this question of when
we would encounter a function with a worst-case runtime of O(log n) (the log base
is usually 2). Now I know: the binary search tree is a really standard
example. And, yes, it saves a lot of effort; instead of going
through every element (all the leaves and parents of a tree, every item of a list, etc.), I could just go
down a single branch. This way is much more efficient. For example, if I have
100 items in total in a tree and I just go through each of them, it would cost
me O(n) runtime (which means 100 steps), but if I have the tree organized as a binary
search tree, I could finish my search in log2 100 steps (about 6.64, so 7 steps at most).
The numbers definitely show the improved efficiency.
(This is a graph I found online that illustrates the slopes of different runtimes.)
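To make that lookup concrete, here is a minimal sketch of an O(log n) search on a binary search tree. The BSTNode class and bst_search function are names I made up for illustration (not the exact code from lecture), and the log n bound assumes the tree is balanced:

```python
class BSTNode:
    """A node in a binary search tree (hypothetical class for illustration)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(node, key):
    """Return True iff key is in the BST rooted at node.

    Each comparison throws away an entire subtree, so on a
    balanced tree of n nodes this takes about log2(n) steps.
    """
    if node is None:
        return False
    if key == node.key:
        return True
    if key < node.key:
        return bst_search(node.left, key)   # key, if present, must be on the left
    return bst_search(node.right, key)      # key, if present, must be on the right
```

So on a balanced tree built from 100 keys, this never needs more than 7 comparisons, which matches the log2 100 figure above.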
After the binary search tree, we started looking at sorting
functions themselves, which are recursive. We had already seen and had some experience
with sorting without trees (and without recursion) in CSC108. I can still
remember that we covered bubble sort, insertion sort and selection sort. In
weeks 10 and 11, we covered new sorting methods: quick sort and
merge sort, and we briefly talked about count sort and Tim sort (Python's own sorting method, a hybrid sorting
algorithm derived from merge sort and insertion sort). Yet we
focused more on the first two (both sketched in code below):
Quick sort: choose a pivot; decide where the pivot goes with respect to the rest of the list; repeat on the partitions.
(average runtime: O(n log n); worst runtime: O(n²).)
Merge sort: divide the list in half, (merge) sort the halves, then merge the sorted results.
(average runtime: O(n log n); worst runtime: O(n log n).)
(The above definitions and runtime data are from lecture notes and Wikipedia.)
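The exact versions from lecture are not in this post, so here is my own rough sketch of the two algorithms in Python (simple list-slicing versions that return new lists, rather than the in-place variants we wrote in class):

```python
def quick_sort(lst):
    """Choose a pivot, partition the rest around it, recurse on the partitions.

    Average O(n log n); worst case O(n²) when the pivot keeps splitting
    the list badly (e.g. an already-sorted list with a first-element pivot).
    """
    if len(lst) < 2:
        return lst
    pivot = lst[0]
    smaller = [x for x in lst[1:] if x < pivot]
    larger = [x for x in lst[1:] if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger)


def merge_sort(lst):
    """Divide the list in half, sort the halves, merge the sorted results.

    Always O(n log n): about log2(n) levels of halving, O(n) merging per level.
    """
    if len(lst) < 2:
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))


def merge(left, right):
    """Merge two sorted lists into one sorted list in O(n)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result + left[i:] + right[j:]
```

The recursive shapes are nicely opposite: quick sort does its real work (partitioning) before the recursive calls, while merge sort does its real work (merging) after them.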
Different sorting methods must have their own merits for
dealing with different inputs and situations; I do not think there is a perfect
sorting function that would have the shortest runtime for every input. In
Lab #10, the handout asked us to compare and graph the runtime results for
various sorting methods with different inputs (sorted, random, and sorted-but-reversed),
and the graphs look pretty colorful and interesting:
(Since the Merge 1&2, Quicksort 1&2, and list.sort results are really close, I also put them together on separate graphs for close-up screenshots.)
The x-axis is the number of items in the input, and the y-axis is the number of seconds it takes the function to run to completion. One can see that the flatter (more linear or constant) the
slope, the more efficient the sorting method. Maybe the inputs we have right
now are short and easy compared to some of the real data we are going to deal with in
the future. When sorting huge amounts of data, the more efficient the
sorting method, the more time one saves and the less chance that the computer
gets bogged down or goes wrong.
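The lab's actual timing harness is not reproduced here, but a comparison like the one in those graphs could be set up with something along these lines (time_sort and the three test cases are hypothetical names of mine, not the handout's):

```python
import random
import time


def time_sort(sort_func, lst):
    """Return roughly how many seconds sort_func takes on a copy of lst."""
    copy = lst[:]
    start = time.perf_counter()
    sort_func(copy)
    return time.perf_counter() - start


n = 2000
cases = {
    'sorted': list(range(n)),
    'random': random.sample(range(n), n),
    'reversed': list(range(n - 1, -1, -1)),
}

# sorted is Python's built-in Tim sort; the quick_sort and merge_sort
# sketches above could be passed in and timed the same way.
for name, data in cases.items():
    print(name, round(time_sort(sorted, data), 6))
```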
I had fun, and it was really interesting that we came
back to talk about sorting and efficiency again after CSC108. Recursion does
not take more time; it saves more time. :D