Saturday, November 4, 2017

Machine learning

Machine learning is a powerful tool for generating mathematical algorithms tailored to specific problems.

As a computer programmer, your job is to write the rules that tell a computer exactly how to solve a specific problem. Machine learning is a different approach: the computer itself learns the rules to solve a problem without being explicitly programmed. Let's start with an example we are all familiar with, junk email. Imagine you are writing a program to filter out junk email from your inbox using traditional programming. First, you'd have to write a complicated program that contains all the rules to decide if a particular email message is junk or a real message. For example, the program might look for certain keywords that you think would only appear in junk email, or you might have the program check if the sender of the email was someone you emailed before.

Next, you try out the program by feeding in some test emails. Finally, you check the results of the program to see if it correctly separated real emails from junk emails. In this process, the hardest part is figuring out which rules help identify an email as junk or real. It will take a lot of trial and error to come up with rules that accurately identify junk email without any false positives. Even worse, when the spammers change their tactics and start sending junk emails that are designed to get around your rules, you have to go in and update your program again to catch them. It's going to take a lot of ongoing maintenance to make this work.
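
As a rough illustration of the hand-written approach described above, here is a minimal sketch in Python. The keyword list, the addresses, and the name `is_junk` are all made up for this example; a real filter would need far more rules.

```python
# Hypothetical hand-written rules; the keywords are invented for illustration.
SPAM_KEYWORDS = {"lottery", "winner", "free money"}


def is_junk(sender, body, known_senders):
    """Return True if the hand-written rules flag the message as junk."""
    # Rule 1: trust anyone we have emailed before.
    if sender in known_senders:
        return False
    # Rule 2: flag messages that contain any suspicious keyword.
    text = body.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)


known = {"alice@example.com"}
print(is_junk("alice@example.com", "Free money for lunch?", known))        # False
print(is_junk("stranger@example.com", "You are a lottery WINNER", known))  # True
print(is_junk("stranger@example.com", "Meeting moved to 3pm", known))      # False
```

Every time spammers change their wording, `SPAM_KEYWORDS` and the rules have to be edited by hand, which is the maintenance burden the text describes.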
It would be much nicer if the computer could come up with its own logic for filtering emails, and that's what we can do with machine learning. Here's how the machine learning solution would work. First, we'd gather thousands of emails and sort them into two groups. One group contains the emails that we know are real. The other group contains known spam emails. Next, we feed these emails into a machine learning algorithm. The machine learning algorithm is an off-the-shelf system. We don't have to write any custom code to make it work. The machine learning algorithm will look at the two groups of emails and create its own rules for how to tell them apart. This process is called training.
We are giving the machine learning algorithm the input data (the original emails) and the expected output (whether each email should be classified as real or spam), and it creates its own rules for how to re-create the output from the input data. The more data it sees during training, the better chance it has of learning how to do this accurately. Once the model is trained, we can use it to sort emails that it has never seen before. When we show it an unknown email, it will apply the rules it learned during training to classify the email as real or spam. With machine learning, we didn't have to do the hard part ourselves. We didn't have to write any email filtering rules.



The computer came up with those on its own based on the training data it saw. The really cool part about machine learning is that the same algorithm that we used to classify email can be used to solve lots of other kinds of problems just by changing the data we feed into it. We don't have to change a single line of code. For example, instead of feeding in emails and marking them as spam or not spam, we could just as easily feed it pictures of handwritten numbers. The algorithm could decide which number each picture represents, whether it's a zero or a one or, in this case, an eight. The same algorithm that does email filtering can be used to do handwriting recognition.
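
To make the training workflow concrete, here is a small sketch that learns to separate spam from real emails using only labeled examples. It uses naive Bayes, which is one common text classification algorithm and not necessarily the one the course has in mind. In practice you would take it from a library; it is written out here only so the example is self-contained. The four training emails and the names `train` and `classify` are invented for illustration.

```python
import math
from collections import Counter


def train(examples):
    """Learn per-label word counts from (text, label) pairs."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of training examples
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest naive Bayes score for the text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(counts.values())
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data: the input (email text) and the expected output (label).
training_emails = [
    ("win free money now", "spam"),
    ("free lottery prize claim now", "spam"),
    ("meeting agenda for monday", "real"),
    ("lunch on monday with the team", "real"),
]
model = train(training_emails)
print(classify(model, "claim your free prize"))        # spam
print(classify(model, "agenda for the team meeting"))  # real
```

Nothing in `train` or `classify` is specific to email. Feeding in different labeled examples would produce a different model without changing the code, which is the point the text makes about reusing the same algorithm.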

With traditional programming, you give the computer exact instructions on how to solve a problem. The computer can only do exactly what it has been previously programmed to do. With machine learning, it's different. The computer learns how to do new things without you having to explicitly program it. Instead, you show the computer data and the computer learns from the data how to approximate functions that you would have had to program in by hand. Machine learning is a great solution for many complex real-world problems that are hard to solve with traditional programming.
- [Instructor] Before we dive into more complicated machine learning algorithms, let's build the simplest program possible to estimate the value of a house based on its attributes. Open up simple_value_estimator.py. Here we have a function called estimate_home_value. The goal of this function is to estimate the price of a house based on its attributes. This function takes in two attributes to describe a house: the size of the house in square feet and the number of bedrooms in the house. At the end of the function, it returns the predicted value for the house. To predict the value for the house, all we have to do is decide how much the size of the house and the number of bedrooms affect the final value of the house.
Let's start by assuming that any house, no matter how small, is worth at least $50,000, so we can start with $50,000 as the initial value estimate. Next, we have to decide how much the size of the house plays into the final value. I'm going to guess that every square foot is worth $92, so we can say that the value is now the current value plus the size in square feet times 92. Next, let's look at the bedrooms. It seems reasonable to assume that houses with more bedrooms are worth more than houses with fewer bedrooms.
For every bedroom, I'm going to add, say, $10,000 in additional value. We'll say the value is now the value plus the number of bedrooms times 10,000. Finally, let's call this function by trying it on a real house we know costs $450,000. Down here we'll pass in 3,800 square feet and five bedrooms. We can run this file by right-clicking and choosing Run. It's worth pointing out there's also a keyboard shortcut for running a file, but it might be different on your system. In my case, it's Control + Shift + F10.
It's also worth pointing out that if you click Run at the top here, it runs the last file, not necessarily the current file you're looking at, so it's easier just to right-click and choose Run. Down here it will open the console and show us the output of our program. Our program predicted the house is worth $449,600 (50,000 + 3,800 × 92 + 5 × 10,000). That's really close to the $450,000 value we expected. Our estimator is working pretty well. In this example, all we did was take each input value and multiply it by a fixed weight. The weight for size in square feet was 92 and the weight for number of bedrooms was 10,000.
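
The transcript does not include the contents of simple_value_estimator.py, so the following is a reconstruction from the description above rather than the original file. The function name and the three numbers come from the transcript; the parameter names and the print format are assumptions.

```python
def estimate_home_value(size_in_sqft, number_of_bedrooms):
    # Assume any house, no matter how small, is worth at least $50,000.
    value = 50000

    # Guess that every square foot adds $92 to the value.
    value = value + (size_in_sqft * 92)

    # Guess that every bedroom adds another $10,000.
    value = value + (number_of_bedrooms * 10000)

    return value


# Try the estimator on a house known to cost $450,000.
estimated_value = estimate_home_value(3800, 5)
print("Estimated value: {}".format(estimated_value))
```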
In other words, we're saying that the real value of the house is some combination of its size and the number of bedrooms it has. The weights tell us how much each of those factors into the final calculation. The process of modeling the value of something with a set of fixed weights is called linear regression. It's one of the simplest machine learning algorithms. The difference is that with machine learning, the computer comes up with the weights by itself by looking at the training data, instead of us guessing them. In the next video, we'll learn how the computer can find the best weights on its own.
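
As a preview of what finding the weights from data can look like, here is a minimal sketch using ordinary least squares, one standard way to fit a linear regression. The course may well use a different method or a library. The function name `fit_linear_weights` and the list of houses are invented, and the prices are synthetic: they are generated from the same weights guessed earlier (50,000, 92, and 10,000) so that we can check whether the fit recovers them.

```python
def fit_linear_weights(rows, targets):
    """Ordinary least squares: solve (X^T X) w = X^T y by Gaussian elimination."""
    # Prepend a constant 1.0 so the first weight acts as the base value.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    weights = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][j] * weights[j] for j in range(i + 1, n))
        weights[i] = (A[i][n] - known) / A[i][i]
    return weights


# Synthetic training data: (size in sq ft, bedrooms) and made-up sale prices.
houses = [(3800, 5), (2000, 3), (1200, 2), (2600, 4), (1500, 3), (3000, 3)]
prices = [50000 + 92 * size + 10000 * beds for size, beds in houses]

base, per_sqft, per_bedroom = fit_linear_weights(houses, prices)
print(round(base), round(per_sqft), round(per_bedroom))
```

Because the synthetic prices follow the formula exactly, the fitted weights should come out very close to the ones guessed by hand. With real sale prices the data would be noisy, and the fit would instead return the weights that minimize the squared prediction error.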

Online tool for writing and running Python:
http://pythonfiddle.com/


