Lambda Functions

This lesson will teach you how to create and use lambda functions, along with the map and filter functions.

For Loop Review



    While creating the Hangman project, you learned how to use for loops to go through lists of data. Here's an example of the code that was used to see if you found all the letters necessary to win the game.

    # Example values so the snippet runs on its own
    secretWord = "apple"
    correctLetters = "aple"
    foundAllLetters = True
    for letter in secretWord:
      if letter not in correctLetters:
        foundAllLetters = False
        break
                    

    For loops are very important in programming and processing information from lists of data.

    When you start learning about machine learning and data science, you will need to process large amounts of data that would be too difficult to process by hand.

    This lesson is all about how you write programs that effectively go through lists and do things with the data in those lists.

Functions as Variables


    You've already learned about many different types of variables, such as numbers, strings, lists, and other data types.

    x = 5
    name = "John Smith"
    city_list = ["Boston", "Miami", "Chicago"]
                    

    Did you know you can also store a function in a variable, then call that variable with arguments and get a result back?

    def add_one(x):
        return x + 1
    
    add_one_function = add_one
    print(add_one_function(5))
    
    # Output:
    # 6
                    

    Why would you want to store a function in a variable? It is mostly useful when you want to pass the function as an argument to another function.
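
    As a rough sketch of that idea, the example below passes add_one into another function. The helper name apply_to_each is made up for this illustration and is not part of Python.

```python
def add_one(x):
    return x + 1

def apply_to_each(function, numbers):
    # Hypothetical helper: calls the given function on every number
    results = []
    for number in numbers:
        results.append(function(number))
    return results

# add_one is passed without parentheses, so the function itself is handed over
result = apply_to_each(add_one, [1, 2, 3])
print(result)

# Output:
# [2, 3, 4]
```

    Writing add_one without parentheses is what hands over the function; writing add_one(5) would hand over the number 6 instead.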

    Say that you wanted to find all numbers in a list larger than a certain number. You could use a for loop to look through the list and add them to a new list based on that condition.

    Then, at the end, you print out your new list of numbers greater than that amount.

    number_list = [2, 5, 8, 10, 14]
    new_number_list = []
    
    for number in number_list:
        if number > 7:
            new_number_list.append(number)
    
    print(new_number_list)
    
    # Output:
    # [8, 10, 14]
                    

    If you wanted to, you could move the check in the if statement into its own function. Because the check here is simple, there's not much reason to do this, but with more complicated conditions you might consider putting the check in a separate function like this.

    In the code below, the greater_than_7 function does one thing: it checks whether a number is greater than 7.

    number_list = [2, 5, 8, 10, 14]
    				
    def greater_than_7(number):
        return number > 7
    
    new_number_list = []
    for number in number_list:
        if greater_than_7(number):
            new_number_list.append(number)
    
    print(new_number_list)
    
    # Output:
    # [8, 10, 14]
                    

    Returning only part of a list is a common problem that you will have to solve in data science. You will have large amounts of data to look at, and you will need to see only pieces of data based on a condition.

Filter: Get Items from a List


    The filter function enables you to look through a list and return only the elements that satisfy your condition.

    But there's a catch: the filter function takes two arguments. The second argument is the list you want to look through, but the first argument is a function. This is a case where you need a function stored under a name, like a variable, so that you can pass it to filter.

    After you call the filter function, you need to convert the result to a list, because filter returns a special filter object rather than a list. Python does this because building a full list can take a lot of memory and processing power for larger data sets, so the filter object produces items only as they are requested. In our case, it's easy to make a list by wrapping the filter call in list().

    number_list = [2, 5, 8, 10, 14]
    				
    def greater_than_7(number):
        return number > 7
    
    filtered_list = list(filter(greater_than_7, number_list))
    
    print(filtered_list)
    
    # Output:
    # [8, 10, 14]
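
    To make the "special filter object" a little more concrete, here is a small sketch of what happens when you do not wrap it in list() right away. The variable names filter_object, first_match, remaining, and leftover are chosen just for this example.

```python
number_list = [2, 5, 8, 10, 14]

def greater_than_7(number):
    return number > 7

filter_object = filter(greater_than_7, number_list)

# The filter object produces matching items one at a time, on request
first_match = next(filter_object)
print(first_match)

# list() collects whatever items are left
remaining = list(filter_object)
print(remaining)

# The filter object is now used up, so a second list() call is empty
leftover = list(filter_object)
print(leftover)

# Output:
# 8
# [10, 14]
# []
```

    Because a filter object can only be read once, converting it to a list is the simplest choice when you want to look at the results more than one time.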
                    

    In the example with the for loop, it took four lines to create a new list and iterate through the list to get our final result. By using filter, you can do all that in one line, so long as you have the function you want to use for your comparison.

    But it's sort of awkward to define such a simple function just so we can go through the list and determine whether the elements match a condition. This is where you can use lambda to define a temporary function. A lambda function can do the same thing as the greater_than_7 function, but it doesn't need to be defined in advance like the named function.

    Lambda functions are created by typing the word lambda, the name of the parameter variable, a colon, and then the expression you want the function to evaluate. The lambda function automatically returns the result of the expression to the right of the colon.

    You can also store lambda functions in variables, if you want. Below is an example of a lambda function that does the same thing as the greater_than_7 function above.

    lambda_greater_than_7 = lambda number: number > 7
    
    lambda_greater_than_7(8)
    
    # Output
    # True
                    

    Hopefully you can see where this is going. If you can save a lambda function into a variable, that means you can also create a lambda function as an argument for the filter function.

    In the code below, the lambda function is the first argument to filter, and the entire for loop that took multiple lines before can now be expressed in a single line of code.

    filtered_list = list(filter(lambda number: number > 7, number_list))
    
    print(filtered_list)
    
    # Output:
    # [8, 10, 14]
                    

    Why is expressing the for loop in one line useful? When you perform data science and machine learning, you will often be looking at large datasets and trying to get subsets of data based on conditions. If you had to write multi-line for loops every single time you wanted to look at the data in a different way, it would make the code a lot longer than it needed to be.

    One more thing about filter: the second argument of the filter function isn't limited to just lists. If you have any object that you can loop over with a for loop, you can use the filter function over its elements. For example, you can use filter over a string to remove a certain character from the string.

    In the below example, the filter returns a list of all the characters, except for the space character, since the lambda function condition is that the character is NOT a blank space.

    example_string = "It was a dark and stormy night."
    
    filtered_string = list(filter(lambda character: character != ' ', example_string))
    
    print(filtered_string)
    
    # Output:
    # ['I', 't', 'w', 'a', 's', 'a', 'd', 'a', 'r', 'k', 'a', 'n', 'd', 's', 't', 'o', 'r', 'm', 'y', 'n', 'i', 'g', 'h', 't', '.']
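
    If you want a string back instead of a list of characters, one possible approach is to join the filtered characters together. This is a sketch that uses the built-in str.join method; the variable names kept_characters and string_without_spaces are chosen for this example.

```python
example_string = "It was a dark and stormy night."

# filter yields the characters that are not spaces
kept_characters = filter(lambda character: character != ' ', example_string)

# "".join() glues the characters back together with nothing in between
string_without_spaces = "".join(kept_characters)

print(string_without_spaces)

# Output:
# Itwasadarkandstormynight.
```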
                    

Map: Change Items in a List



    Sometimes you want to perform actions on a list of data, not just filter out data. Say you came up with a list of nice compliment adjectives to describe someone, but wanted to change them all to be the superlative version of those adjectives. It's not enough to say someone is smart; you want to say they're the smartest! You could change each item in the list of compliments with a for loop.

    compliments_list = ["Smart", "Kind", "Fair", "Great", "Fast"]
    
    new_compliments_list = []
    for compliment in compliments_list:
        new_compliments_list.append(compliment + "est")
    
    print(new_compliments_list)
    
    # Output:
    # ['Smartest', 'Kindest', 'Fairest', 'Greatest', 'Fastest']
                    

    You can see the pattern here: create a new list, and then for each item in the original list, append a changed version of that item to the new list.

    You can use the map function here instead of the for loop. The map function, like filter, takes two arguments. The first argument is the function that you want to apply, and the second argument is the list of elements you want the function to transform. Because you can use lambda, you can express the three lines of the for loop in one line.

    compliments_list = ["Smart", "Kind", "Fair", "Great", "Fast"]
    
    mapped_list = list(map(lambda word: word + "est", compliments_list))
    
    print(mapped_list)
    
    # Output:
    # ['Smartest', 'Kindest', 'Fairest', 'Greatest', 'Fastest']
                    

    Those all sound like nice things to say to someone. But maybe you want to add some more compliments that have a different way of expressing the superlative form of the word.

    For example, you want to add some longer compliments to the list, such as Brilliant and Extraordinary. If you add these elements to the list, you end up with some strange words.

    compliments_list = ["Smart", "Kind", "Fair", "Great", "Fast", "Brilliant", "Extraordinary"]
    
    mapped_list = list(map(lambda word: word + "est", compliments_list))
    
    print(mapped_list)
    
    # Output:
    # ['Smartest', 'Kindest', 'Fairest', 'Greatest', 'Fastest', 'Brilliantest', 'Extraordinaryest']
                    

    Brilliantest definitely isn't a real word. Instead, we want to say Most Brilliant. You can add a one-line if statement to the lambda function so that it does different things based on how many characters are in the compliment word.

    The format for this if statement is a little different from the multi-line if statements you have used. It starts with the result to use when the condition is true, then the condition, then the else result to use when the condition isn't satisfied.

    The format works like this: (Result if true) if (Condition) else (Result if false)

    compliments_list = ["Smart", "Kind", "Fair", "Great", "Fast", "Brilliant", "Extraordinary"]
    
    mapped_list = list(map(lambda compliment: compliment + "est" if len(compliment) < 6 else "Most " + compliment, compliments_list))
    
    print(mapped_list)
    
    # Output:
    # ['Smartest', 'Kindest', 'Fairest', 'Greatest', 'Fastest', 'Most Brilliant', 'Most Extraordinary']
                    

    Most Brilliant is a much nicer thing to say. Using an if statement in your lambda function lets you do different things to elements of the list based on particular conditions.

    A word of warning though: once your logic gets more complicated than one if statement, it might be time to define a named function and pass it into your map function.
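
    As a sketch of what that could look like, the example below moves the same rule into a named function and passes it to map. The function name make_superlative is made up for this illustration.

```python
compliments_list = ["Smart", "Kind", "Fair", "Great", "Fast", "Brilliant", "Extraordinary"]

def make_superlative(compliment):
    # Hypothetical helper with the same rule as the lambda:
    # short words get "est", longer words get "Most " in front
    if len(compliment) < 6:
        return compliment + "est"
    return "Most " + compliment

mapped_list = list(map(make_superlative, compliments_list))

print(mapped_list)

# Output:
# ['Smartest', 'Kindest', 'Fairest', 'Greatest', 'Fastest', 'Most Brilliant', 'Most Extraordinary']
```

    A named function gives you room for more conditions and comments, while the map call itself stays short.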

Advanced Example



    Right now, you still may not be convinced about using these functions. The for loop is very versatile, and can do a lot. So in many cases, writing for loops rather than using map and filter may be the right choice.

    Let's create a more complicated example. Run the below code inside of a new file, and it will create a list of Person objects with names and ages.

    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age
    
    person_list = []
    person_list.append(Person("Hassan Sherman", 38))
    person_list.append(Person("Alexandria Correa", 57))
    person_list.append(Person("Samuel Ferguson", 16))
    person_list.append(Person("Xanthe Greaves", 12))
    person_list.append(Person("Zainab Glover", 73))
    person_list.append(Person("Kiara Foreman", 18))
    person_list.append(Person("Arthur Joseph", 13))
    person_list.append(Person("Kristin Everett", 54))
    person_list.append(Person("Selena Hook", 32))
    person_list.append(Person("Talhah Christian", 89))
    person_list.append(Person("Silas Hassan", 35))
    person_list.append(Person("Raiden Benson", 38))
    person_list.append(Person("Omari Owen", 21))
    person_list.append(Person("Patryk Wood", 19))
    person_list.append(Person("Adam Carpenter", 52))
                    

    When you start to analyze large data sets, you may need to go through lists of very complex objects.

    For example, what if you want to find the names of all the teenagers in the list? You could do this with a for loop, of course.

    new_person_names_list = []
    for person in person_list:
        if person.age > 12 and person.age < 20:
            new_person_names_list.append(person.name)
    
    print(new_person_names_list)
    
    # Output:
    # ['Samuel Ferguson', 'Kiara Foreman', 'Arthur Joseph', 'Patryk Wood']
                    

    You could perform the same action with a combination of filter, to get the people with the right age, and map, to create a list with the people's names.

    In the below example, the input to the map function's second argument is the output of a filter function.

    filter_map_names_list = list(map(lambda person: person.name, filter(lambda person: person.age > 12 and person.age < 20, person_list)))
    
    print(filter_map_names_list)
    
    # Output:
    # ['Samuel Ferguson', 'Kiara Foreman', 'Arthur Joseph', 'Patryk Wood']
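
    If the nested call is hard to follow, the same idea can be written as two separate steps. This is only a sketch: it uses a shortened copy of the person list so that it runs on its own, and the names teenagers and teenager_names are chosen for this example.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# A shortened copy of the list, to keep the sketch self-contained
person_list = [
    Person("Hassan Sherman", 38),
    Person("Samuel Ferguson", 16),
    Person("Xanthe Greaves", 12),
    Person("Kiara Foreman", 18),
]

# Step 1: keep only the teenagers
teenagers = filter(lambda person: person.age > 12 and person.age < 20, person_list)

# Step 2: turn each remaining Person into just a name
teenager_names = list(map(lambda person: person.name, teenagers))

print(teenager_names)

# Output:
# ['Samuel Ferguson', 'Kiara Foreman']
```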
                    

    At this point you might be thinking that these lines are getting awfully long. Long lines can be hard to read, but a single line of code can be pasted directly into the interactive shell and run, rather than being written inside a Python script file.

    The line of code below returns the last name of every person in the list. Instead of writing it in your code file, copy the line into your interpreter and see the results.

    print(list(map(lambda person: person.name.split(' ')[1], person_list)))
    
    # Output
    # ['Sherman', 'Correa', 'Ferguson', 'Greaves', 'Glover', 'Foreman', 'Joseph', 'Everett', 'Hook', 'Christian', 'Hassan', 'Benson', 'Owen', 'Wood', 'Carpenter']
                    

    Rather than running your filters inside of a module, which may include lots of code you don't need to run, you can write a one-line filter inside of the Python interpreter to see your results!

    This is helpful when you write long programs that perform lots of changes on the data and take a while to process, and you don't want to run the whole program every single time. You can type simple one-line filter commands inside of the interpreter to see your results.

    This is also a common pattern when looking at large datasets. One command will pull the data into your Python environment, then you can use the interpreter window to look at your data, rather than creating a file with multiple lines of code to run every time you want to see the elements of the data set.

Machine Learning with Map and Filter


    You might still be wondering, if for loops can end up with the same result as map and filter, why use map and filter?

    Imagine that you're working on a large program, as machine learning programs can often be. You work on one small part of the program, and other people are writing code that depends on the results that you come up with.

    The below code is very simple, in that it returns 3 numbers out of 5.

    number_list = [2, 5, 8, 10, 14]
    new_number_list = []
    
    for number in number_list:
        if number > 7:
            new_number_list.append(number)
    
    print(new_number_list)
    
    # Output:
    # [8, 10, 14]
                    

    However, machine learning programs often need to interpret gigabytes of data, not just five numbers. Even if each individual calculation takes a short time, with millions or billions of numbers it may take hours to come up with a result.

    What happens if the other code tries to get the new_number_list information before your code is finished calculating? You would give the other code an incomplete list. So you'd need to create ANOTHER list that was just for the other code to see the final result.

    number_list = [2, 5, 8, 10, 14]
    new_internal_number_list = []
    
    for number in number_list:
        if number > 7:
            new_internal_number_list.append(number)
    
    new_external_number_list = new_internal_number_list
    
    print(new_external_number_list)
    
    # Output:
    # [8, 10, 14]
                    

    But with map and filter, there are no intermediary steps to save data between the original data and the final result. You go straight from the original number list to the filtered list.

    number_list = [2, 5, 8, 10, 14]
    				
    filtered_list = list(filter(lambda number: number > 7, number_list))
    
    print(filtered_list)
    
    # Output:
    # [8, 10, 14]
                    

    This means that your data is always consistent. Rather than having variables that are only partially correct most of the time, you have variables that represent either all of the results, or none of the results.
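
    One way to picture this, assuming that other code gets your results by calling a function, is sketched below. The function name numbers_greater_than is hypothetical and only used for this illustration.

```python
def numbers_greater_than(limit, numbers):
    # The list is built in a single expression, so the caller
    # only ever receives the finished result
    return list(filter(lambda number: number > limit, numbers))

number_list = [2, 5, 8, 10, 14]
filtered_list = numbers_greater_than(7, number_list)

print(filtered_list)

# The original list is left untouched
print(number_list)

# Output:
# [8, 10, 14]
# [2, 5, 8, 10, 14]
```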

    When you reach the lessons in this course on performing data analysis from data sets, you will be using some modules that incorporate filter, map, and lambda concepts, but don't explicitly use those terms. The syntax will be using these concepts, even if you don't see the words map, filter, and lambda.

    With what you know now, you should be able to understand what that code is doing at a lower level.