import random

numbers = []
for i in range(10):
    numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Random List
[9, 62, 78, 54, 18, 74, 58, 84, 20, 46]

Warm Up

Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)

  • built in syntax (PANDAS)

Explore

Bubble Sort What is happening with this sort?

  • It repeatedly compares adjacent elements in a list and swaps them if they are in the wrong order.
  • It starts by comparing the first two elements and swapping them if necessary.
  • Then it compares the second and third elements, the third and fourth elements, and so on, until reaching the end of the list.

Merge

  • Advantages
    • time complexity of O(n logn) meaning it can sort though large arrays pretty quickly
    • stable search
  • Process
    • it is a recursive algorithm that continuously splits an array in half until it cannot be divided any further
    • merge sort is done each time the array is halved and then when all halves have been sorted, the merge operation is applied.
    • merge operation takes smaller sorted arrays and combines them into a larger one

Selection

  • starts with a list and finds the smallest element, and swaps it with the 1st element
  • continues to do this until it gets to the largest element

Insertion

  • go from one number to the next and swap it if the number is smaller
  • it goes from item to item in the list and checks that the number is in the right spot

Practice

[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]

How would you sort this list with...

  • Bubble Sort
    • it compares the first and second element, then the second and third, then the third and fourth until it reaches the end of the list. While it compares the numbers, it will swap them if one number is smaller than the other.
  • Selection Sort
    • it will iterate through the list and find the smallest number which it will then swap with the first number in the list. It will then find the next smallest number and put it in the second spot in the list. It will keep doing this till it reaches the largest element.

[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]

How would you sort this list with...

  • Merge Sort
    • you would halve the list again and again until the list cannot be sorted anymore. Then each time it is halved, the merge sort is done on each half so that when it can no longer be halved, everything is sorted. It then merges the halved lists back together into one list.
  • Insertion Sort
    • starts from the first number in the list and compares it to the next number. if the number is smaller, they swap. It keeps going until it goes through the entire list. If a number is not in the right place, the algorithm will check the list and place it in the right spot.

Sorting Words

Sorting strings works in the same way as integers. Using your expertise algorithm, sort the following list of random words.

import nltk
import random

nltk.download('words')  # Download the word list (only required once)

from nltk.corpus import words

english_words = words.words()
#print(len(english_words))  # Prints the number of words in the list

# You can now use the 'english_words' list in your code

word_list = []
for i in range(10):
    word_list.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(word_list)
Random List
['parliament', 'preterpluperfect', 'dragade', 'cooperage', 'spasmodic', 'praesertim', 'munj', 'loveless', 'asporogenous', 'storeship']
[nltk_data] Downloading package words to /Users/vivian/nltk_data...
[nltk_data]   Package words is already up-to-date!
def bubble_sort(word_list):
    n = len(word_list)
    for i in range(n-1):
        for j in range(n-i-1):
            if word_list[j] > word_list[j+1]:
                word_list[j], word_list[j+1] = word_list[j+1], word_list[j]

bubble_sort(word_list)

print("Sorted words:", word_list)
Sorted words: ['asporogenous', 'cooperage', 'dragade', 'loveless', 'munj', 'parliament', 'praesertim', 'preterpluperfect', 'spasmodic', 'storeship']

Discuss

Answer the following with your group.

  • When should you use each algorithm? What makes an algorithm the right choice?
    • time and space complexity
  • Given the following lists...

    • [0, 2, 6, 4, 8, 10]
    • [Elephant, Banana, Cat, Dog, Apple]
    • [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
  • Select the algorithm you believe is best for each, explain.

    • for the first one, you can use insertion which could swap the 6 and 4. Once it reaches the end of the list, the algorithm is finished so it is the most efficient
    • for the second list, selection is the best sorting algorithm because it looks for the smallest item in the list which would be apple since it is alphabetically at a minimum. It would then swap it with elephant and so on
    • for the third list, the best sorting algorithm would be merge because merge works the best on larger data sets. It can keep halving the data set, sorting, and finally merge it back into one list.

HACKS

Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.

  • Now it's time to do some coding...

  • Run code and then research and answer these questions...

    • Is a list and/or dictionary in python considered a primitive or collection type? Why?
      • In Python, both lists and dictionaries are considered collection types
      • A primitive type refers to data types such as integers, floats, booleans, and characters. These types hold a single value and are not composed of smaller elements. On the other hand, collection types are data structures that can hold multiple values or objects.
      • Lists in Python are ordered collections of elements, where each element can be of any data type. Lists are considered collection types because they can store multiple values in a single variable.
    • Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
      • In Python, the list is passed by reference when passed into a function like bubble sort
      • This means that any changes made to the list inside the function will affect the original list outside the function. The reason for this behavior is that when a list is passed to a function, its 'reference' is passed, rather than creating a new copy of the entire list.
      • In the case of bubble sort, as elements are swapped within the function, the original list is modified. The modifications are reflected in the output because the function operates directly on the referenced list
  • Implement new cell(s) and/or organize cells to do the following.

    • Create your own list and sorting algorithm
    • Test your list with my bubble sort
    • Test my list with your new sort
    • Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.

Use the code below to help guide your adventure

"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""

# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
    n = len(list) - 1  # list are indexed 0 to n-1, len is n
    
    # Traverse through list with i index
    for i in range(n):
        swapped = False  # optimize code, so it exits if now swaps on inner loop

        # Inner traversal using j index
        for j in range(n-i):  # n-i as positions on right are in order in bubble
 
            # Swap if the element KeyN is greater KeyN1
            keyN = list[j].get(key)
            keyN1 = list[j+1].get(key)
            if keyN > keyN1:
                swapped = True
                list[j], list[j + 1] = list[j + 1], list[j]  # single line swap
         
        if not swapped:  # if no swaps on inner pass, list is sorted
            return  # exit function
    

if __name__ == "__main__":``
    # list/dictionary sample
    list_of_people = [
    {"name": "Risa", "age": 18, "city": "New York"},
    {"name": "John", "age": 63, "city": "Eugene"},
    {"name": "Shekar", "age": 18, "city": "San Francisco"},
    {"name": "Ryan", "age": 21, "city": "Los Angeles"}
    ]
    
    # assuming uniform keys, pick 1st row as source of keys
    key_row = list_of_people[0]

    # print list as defined
    print("Original")
    print(list_of_people)
    
    for key in key_row:  # finds each key in the row
        print(key)
        bubbleSort(list_of_people, key)  # sort list of people
        print(list_of_people)
Original
[{'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'John', 'age': 63, 'city': 'Eugene'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}]
name
[{'name': 'John', 'age': 63, 'city': 'Eugene'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}]
age
[{'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'John', 'age': 63, 'city': 'Eugene'}]
city
[{'name': 'John', 'age': 63, 'city': 'Eugene'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}]

Creating my own list and sorting algorithm (merge)

def mergeSort(arr, sortkey):
    if len(arr) > 1: # makes sure list isn't alr sorted
        left_arr = arr[:len(arr)//2] #starts at beginning and goes to mid
        right_arr = arr[len(arr)//2:] #starts at middle and goes to end
        
        # recursion for merge sort
        mergeSort(left_arr, sortkey)
        mergeSort(right_arr, sortkey)
        
        # merge step
        i = 0 # keep track of leftmost element in left arr
        j = 0 # keep track of leftmost element in right arr
        k = 0 # merged array index
        
        while i < len(left_arr) and j < len(right_arr): # keep going until one array is at the end of the index
            if compareArrItem(left_arr[i], right_arr[j], sortkey):
                arr[k] = left_arr[i]
                i += 1
            else:
                arr[k] = right_arr[j]
                j += 1
            k += 1 # add 1 to k each time so that next item is added to next index of arr
        
        #checking for leftovers
        while i < len(left_arr): 
            arr[k] = left_arr[i]
            i += 1
            k += 1
        while j < len(right_arr):
            arr[k] = right_arr[j]
            j += 1
            k += 1
            
def compareArrItem(item1, item2, sortkey): #return true if item1 is less than item2
    return item1[sortkey] < item2[sortkey]
    
# my list is a group of users and their enjoyment rating out of 10 of DVASS(our group project) which includes uno, war, black jack, and the memory game as well as how many hours they spent playing the games
dvass_survey = [
    {"name": "Nathan", "rating": 8, "hours": 7},
    {"name": "Max", "rating": 10, "hours": 80},
    {"name": "Kalani", "rating": 7, "hours": 8},
    {"name": "Evan", "rating": 8, "hours": 5},
    {"name": "Jaden", "rating": 10, "hours": 11},
    {"name": "Ryan", "rating": 50, "hours": 50},
    {"name": "Aniket", "rating": 100, "hours": 67}
    ]

mergeSort(dvass_survey, 'rating')
print(dvass_survey)
mergeSort(dvass_survey, 'hours')
print(dvass_survey)
        
[{'name': 'Kalani', 'rating': 7, 'hours': 8}, {'name': 'Evan', 'rating': 8, 'hours': 5}, {'name': 'Nathan', 'rating': 8, 'hours': 7}, {'name': 'Jaden', 'rating': 10, 'hours': 11}, {'name': 'Max', 'rating': 10, 'hours': 80}, {'name': 'Ryan', 'rating': 50, 'hours': 50}, {'name': 'Aniket', 'rating': 100, 'hours': 67}]
[{'name': 'Evan', 'rating': 8, 'hours': 5}, {'name': 'Nathan', 'rating': 8, 'hours': 7}, {'name': 'Kalani', 'rating': 7, 'hours': 8}, {'name': 'Jaden', 'rating': 10, 'hours': 11}, {'name': 'Ryan', 'rating': 50, 'hours': 50}, {'name': 'Aniket', 'rating': 100, 'hours': 67}, {'name': 'Max', 'rating': 10, 'hours': 80}]

Test given list with my new sort

mergeSort(list_of_people, 'age')
print(list_of_people)
[{'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'John', 'age': 63, 'city': 'Eugene'}]

Research analysis on sorting: comparisons, swaps, time. Build this into your hacks. Bubble Sort:

  • Comparisons: Bubble Sort makes a maximum of n*(n-1)/2 comparisons in the worst case, where n is the number of elements.
  • Swaps: Bubble Sort makes a maximum of n*(n-1)/2 swaps in the worst case.
  • Time Complexity: Bubble Sort has an average and worst-case time complexity of O(n^2).

Insertion Sort:

  • Comparisons: Insertion Sort makes a maximum of n*(n-1)/2 comparisons in the worst case.
  • Swaps: Insertion Sort makes a maximum of n*(n-1)/2 swaps in the worst case.
  • Time Complexity: Insertion Sort has an average and worst-case time complexity of O(n^2). However, it performs well on partially sorted or small lists.

Selection Sort:

  • Comparisons: Selection Sort makes a maximum of n*(n-1)/2 comparisons in the worst case.
  • Swaps: Selection Sort makes a maximum of n swaps in the worst case.
  • Time Complexity: Selection Sort has an average and worst-case time complexity of O(n^2).

Merge Sort:

  • Comparisons: Merge Sort makes a maximum of n*log(n) comparisons in the worst case.
  • Swaps: Merge Sort requires additional memory for merging and does not involve element swaps.
  • Time Complexity: Merge Sort has an average and worst-case time complexity of O(n log n).

Find a better way to print the data, key first, then other elements in viewable form.

mergeSort(dvass_survey, 'rating')

def mergePrint(list, key):
   for i, user in enumerate(list):
        print("#" + str(i + 1) + key)
        for k, v in user.items():
            print("\t" + str(k) + ":", str(v))
        print("")

mergePrint(dvass_survey, 'rating')
#1rating
	name: Kalani
	rating: 7
	hours: 8

#2rating
	name: Nathan
	rating: 8
	hours: 7

#3rating
	name: Evan
	rating: 8
	hours: 5

#4rating
	name: Max
	rating: 10
	hours: 80

#5rating
	name: Jaden
	rating: 10
	hours: 11

#6rating
	name: Ryan
	rating: 50
	hours: 50

#7rating
	name: Aniket
	rating: 100
	hours: 67