Friday, March 28, 2014

Week 11 - Sorting and Efficiency

Time has been going so fast!!! It is already just a week left until the courses end. I cannot say whether if that is a good thing or not; I want to go home so bad yet I do not want to study for my finals. :(
However, it is sorting and efficiency that I will write about for this week’s entry. Starting week 9, we have been started to looking into sorting. We started with the binary search trees: 

                     

           "A binary search tree (BST), sometimes also called an ordered
  or sorted binary tree, is a node-based binary tree data structure 
  which has the following properties:
      -    The left subtree of a node contains only nodes with keys 
           less than the node's key.
      -    The right subtree of a node contains only nodes with keys 
           greater thanthe node's key.
      -    The left and right subtree each must also be a binary search tree.
      -    There must be no duplicate nodes. "
                             -- (http://en.wikipedia.org/wiki/Binary_search_tree)
I think this is a good, reasonable start for learning about sorting with recursion. In CSC165, I have always had this question of when would us encountering a function with a worse runtime of O(log n) (the log base is usually 2). Now, I know, the binary search tree is a really standard example. And, yes, it would save a lot of effort for sorting; instead of go through every element leaves and parents of a tree or list etc. I could just go down with only one branch. This way is much more efficient. For example, I have 100 items in total in a tree, if I just go through each of them, it would cost me O(n) runtime (which means 100 steps), but if have the tree organized as a binary search tree, I could finish my search in log2 100 steps (about 6.643856steps). The numbers definitely showed the improved efficiency. 

                          
 (This is a graph I found online that illustrated the slopes of different runtimes.)
After the binary search tree, we start looking at just sorting function, which are recursive. We have seen and had some experience with just sorting without trees (which was not recursive) in CSC108. I could still remember that we have covered bubble sort, insertion sort and selection sort. In week 10 and week 11, we have covered three new sorting methods: quick sort, merge sort and briefly talked about: count sort, Tim sort (python’s own sorting method, a hybrid sorting algorithm derived from merge sort and insertion sort). Yet, we focus more on the first two:
Quick sort: choose a pivot; decide where the pivot goes with
respect to the rest of the list, repeat on the partitions.
(average runtime: O(n log n); worst runtime: O(n2).)
Merge sort: divide the list in half, (merge) sort the halves, then
merge the sorted results
(average runtime: O(n log n); worst runtime: O(n log n).)
(Above definitions and data are from lecture notes and Wikipedia.)

Different sorting methods must have their own merits for dealing with different inputs and situations; I do not think there is a perfect sorting function that would perform with the shortest runtime for every input. In Lab #10, the handout wanted us to compare and graph the runtime results for various sorting methods and with different input (sorted, random, sorted but reversed), and the graphs look pretty colorful and interesting:

           
(since the Merge 1&2, Quicksort 1&2, list.sort results are really close, I also put them together on separate graphs to screenshot close-ups)
The x axis is the numbers of items for the inputs and the y axis is the numbers of seconds that would take the function to run and complete. One can see, the more flat (linear or constant) the slope, the more efficient the sorting method. Maybe the inputs we have right now are short and easy compare some real data that we are going to deal with in the future. To sort a huge, massive, tons of data, the more efficient the sorting method, the more time one saved and the less chance that the computer would make mistakes and being too busy.
I had fun and it was really interesting that we have came back to talk about sorting and efficiency again after CSC108. Recursion does not take more time, but save more time. :D

Sunday, March 23, 2014

Week 10 - Assignment 2 Part 1

It is probably seems to be a bit late to write about Assignment 2 Part, since it had been like 2 weeks and Part 2 was already due. Yet, Part 1 seems to be hard to start compare to Part 2, and this is the reason why I want to write about this.
 Assignment 2, overall, introduced a very interesting concept of organizing linked lists and classes of string:
“Regular expressions (abbreviated to regex, the pronunciation of which gives rise to endless ame wars. . . ) are used in various programming languages and utilities to match entire classes of strings. This assignment will give you experience modelling a regular expression as a tree, and detecting which strings match a given regular expression.”                           
                                                                 --- (Assignment #2 Handout)
                                http://www.cdf.toronto.edu/~heap/148/W14/Assignments/A2/a2.pdf
In part 1, we simply just have “design a collection of classes to represent the various sorts of regular expression trees. Each class should implement or inherit an __eq__ method so that it can be compared to other objects; and a __repr__ method so that it can be represented in a meaningful way as a string, and so that an equivalent tree will be produced if you cut-and-paste the representation into a Python shell. You should carefully consider how to use inheritance to reduce the amount of duplicated code. You should also ensure that public attributes for a tree's symbol or children can only be set once, during initialization. After that, they should be read-only.Your class(es) should be declared in a _le called regex design.py.” (quoted from the handout).
I have say it was really hard to start this step, I think maybe because it is the first time I was not give any file and just implements functions/methods, I have to actually design the structure of this, even though this does not seem hard at all. I also got confused with part 2, so I kept doing part 2’s job (eg. translate input into a tree and etc.); I over-complicated thing a bit.
I started by just trying out different things and design the classes in different ways, yet the far I go, the far I got lost (still confused the different parts I need to do for part 1 and part 2). So, I deleted everything and decided to start fresh. I have a friend in UT Scarborough, who is taking 148, too, and I ask him about how to start this, and I also went on to Piazza and spend three hours looked through all the posts and answers about Part one. Finally, after five hours, I finally knew what I needed to do. Ironically, it only took me one hour the whole thing. :P And it was actually pretty simple. :D
I think the handout did not explain everything to well. For example, what do we suppose to return when we use __repr__(), what suppose to be the input, and small things that are similar to these kind of questions. So, I hope in the future, the handout could have been more clear about the steps and explanation of the everything.
By the way, I really LOVE Piazza!!!! Every time I got confused about something, like assignments, exercises or labs, I go on there, and ask or look over people’s post, and it is really helpful for my studying in this course.

Friday, March 14, 2014

Week 9 - Lab 8

This week is kind of busy for me…. T^T Yet, I have been quite enjoy to be busy. :D As long As there is no tests or exams. :D


Anyway, this week, nothing every important or special happened; yet, this week’s lab is kind of interesting. The handout asked us to write the function/method count_less(self, item) for three times. This function /method needs to return the number of nodes in this BST with items that are less than given item.
The first time, we need to write it as a recursive method in the file BST rec1.py within the classBST(object), and we need write a nested helper function within count_less(self, item) without changing or adding anything to the class _BSTNode.
The second time, we still need to write it as a method in the file BST rec2.py within the classBST(object), but we need to write the helper method in class _BSTNode and calling the helper within count_less(self, item) and without changing or adding anything to the class BST.
Finally, the last time, we will implement method count_less(self, item) in class BST, yet we also need to write a helper method in both class _BSTNode and _BSTNone and calling the helper within count_less(self, item) without changing or adding anything to the class BST.
My partner and I were confused at first, wondering why we need to do the same thing again and again in three difference ways. Things turned out to be really interesting: the first time is just a bit challenging for us to come up with the solution, the second way is just want us to learn the way to import a helper method within another class. And the last method is trying to help us develop another way of thinking, divide the method into two situation: node is None or node is actually another parent or a leaf, and call the different helper methods depend of the node.
By the end, I felt pretty beneficial from this lab, if I were not had this lab, I would just put everything in one class, yet I now kind of really understand why we put different things in different classes and inherit from each other. :D


Sunday, March 9, 2014

Week 8 - Exercise 3

This week was horrible for me…. :(
I got sick with cold and fever on Tuesday night and I have to miss Wednesday’s lab since I was feeling really awful.
However, I am glad that I am finally feeling much better today (Sunday). :D I think I am gonna be back to normal next week sometimes.
I looked over Exercise 3 during the weekend and finished a draft for it. This exercise is about TreeList, yet it is different than the example the professors did in class: translate a TreeList to a list of the nodes; of course, we have choices between preoder, inorder and postorder. Yet, in this exercise, we suppose to do the opposite: by providing the preorder and inorder node lists, we convert them back to a single TreeList.  Now, I am just waiting nervously for the autograding result. :P
I always love the exercises better than the assignments, because they are much more “friendly”.  :D   Assignments just always make me nervous and feel like I am going to fail the course. I sincerely hope the assignments in the future courses can be “friendlier” and less stressful for us. My group finished Assignment 2 Part 1 already, yet just by looking at Part 2, I already feel kind of stress.  :(   


Sunday, March 2, 2014

Week 7 - Recursion

After an awesome week of reading break, week 7 was totally stressful and tense. I had 4 tests in 3 days in a roll…… I would never want that to happen ever again…… :(  Luckily, that was all finished.
Since week 4 ish, the professor already started to discuss about recursion with us in lectures. Recursion was something new to all of us; to think of writing codes that would use itself as part of the codes over and over again was novelty and cool to me at first.
According to http://dictionary.reference.com, recursion is defined as:
“the process of defining a function or calculating a number by the repeated application of an algorithm.”
In python, as far as we have learnt, we have seen functions which they call other functions in the same (or imported) files. Recursions is basic to call python’s ability for a function to call itself. 


It was kind of hard to imagine what it is when the professor just introduced this topic, yet he showed us a function called rec_max: it find the maximum number in the nested list from the input. This is a pretty typical example for recursion. The function would look through all the nested lists within the input to make sure that the number it returns would be the biggest of all:    
Another thing that the prof brought up a very interesting function, “Turtle”. Sorry that I could not find a correct image for the visualization of “Turtle”. Yet I found a similar one to demonstrate what does the code do:
We first set the speed and the color of the pathway of the turtle when we initialize it from class Turtle. Then, we input how many levels (recursions) and base (length of the pathway). Basically, if level equals to 1, the turtle would just draw 3 lines in 3 directions (120o apart from each other) with 3 different colors of the base’s length. If the equals to 2, the turtle would add another half of the lengths and draw 3 more branches with 3 lines the turtle drew when the level equals to 1. I made some pictures from level 1 to level 4:

       

   

As one can see, as the input of the level gets larger, the resulting picture will slowly form a huge triangle.
          In sum, recursion is just a smart trick for the programmers to use when they have to deal with nested things (lists, tuples etc.) or situations that would involve in break it down to tons of smaller pieces.