Machine learning is a powerful tool for generating mathematical algorithms that are applied to specific problems.
As a computer programmer, your job is to write the rules that tell a computer exactly how to solve a specific problem. Machine learning is a different approach: the computer itself learns the rules to solve a problem without being explicitly programmed. Let's start with an example we are all familiar with, junk email. Imagine you are writing a program to filter junk email out of your inbox using traditional programming. First, you'd have to write a complicated program that contains all the rules to decide whether a particular email message is junk or a real message. For example, the program might look for certain keywords that you think would only appear in junk email, or it might check whether the sender of the email is someone you have emailed before.
Next, you try out the program by feeding in some test emails. Finally, you check the results to see whether it correctly separated real emails from junk emails. In this process, the hardest part is figuring out which rules help identify an email as junk or real. It will take a lot of trial and error to come up with rules that accurately identify junk email without false positives. Even worse, when the spammers change their tactics and start sending junk emails designed to get around your rules, you have to go in and update your program again to catch them. It's going to take a lot of ongoing maintenance to make this work.
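To make the traditional approach concrete, here is a minimal sketch of a hand-written junk filter. The keyword list, the contact list, and the function name is_junk are all hypothetical and chosen only for illustration. The third test email shows how a small change in spelling gets around the keyword rule.

```python
# Hypothetical hand-written rules; the keywords and contact list are
# illustrative assumptions, not taken from any real spam filter.
SPAM_KEYWORDS = {"lottery", "winner", "free money"}
KNOWN_CONTACTS = {"alice@example.com", "bob@example.com"}


def is_junk(sender: str, body: str) -> bool:
    """Classify an email as junk using fixed, hand-written rules."""
    # Rule 1: trust anyone we have emailed before.
    if sender.lower() in KNOWN_CONTACTS:
        return False
    # Rule 2: flag messages containing any suspicious keyword.
    text = body.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)


test_emails = [
    ("alice@example.com", "Lunch on Friday?"),
    ("promo@example.net", "You are a lottery WINNER, claim your free money"),
    ("promo@example.net", "You are a l0ttery w1nner"),  # evades the keyword rule
]
results = [is_junk(sender, body) for sender, body in test_emails]
print(results)
```

Catching the third message would mean going back and editing the rules by hand, which is the maintenance burden described above.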
It would be much nicer if the computer could come up with its own logic for filtering emails, and that's what we can do with machine learning. Here's how the machine learning solution would work. First, we'd gather thousands of emails and sort them into two groups. One group is the emails that we know are real. The other group is known spam emails. Next, we feed these emails into a machine learning algorithm. The machine learning algorithm is an off-the-shelf system. We don't have to write any custom code to make it work. The machine learning algorithm will look at the two groups of emails and create its own rules for how to tell them apart. This process is called training.
We are giving the machine learning algorithm the input data (the original emails) and the expected output (whether each email should be classified as real or spam), and it creates its own rules for how to re-create the output from the input data. The more data it sees during training, the better chance it has of learning how to do this accurately. Once the model is trained, we can use it to sort emails that it has never seen before. When we show it an unknown email, it will apply the rules it learned during training to classify the email as real or spam. With machine learning, we didn't have to do the hard part ourselves. We didn't have to write any email filtering rules.
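The training step described above can be sketched in a few lines of plain Python. The transcript does not name a specific algorithm, so this sketch assumes a small word-count classifier in the style of naive Bayes. The function names train and classify and the four training emails are made up for illustration, and a real system would use an off-the-shelf library and thousands of emails.

```python
import math
from collections import Counter


def train(emails, labels):
    """Count how often each word appears in each group of emails."""
    word_counts = {"real": Counter(), "spam": Counter()}
    email_counts = Counter(labels)
    for text, label in zip(emails, labels):
        word_counts[label].update(text.lower().split())
    return word_counts, email_counts


def classify(model, text):
    """Pick the label whose training emails best match the words in text."""
    word_counts, email_counts = model
    vocabulary = set(word_counts["real"]) | set(word_counts["spam"])
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from how common the label is, then add evidence per word.
        score = math.log(email_counts[label] / sum(email_counts.values()))
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Tiny made-up training set; a real one would contain thousands of emails.
emails = [
    "meeting moved to friday",
    "please review the attached report",
    "win free money now",
    "free prize claim your money",
]
labels = ["real", "real", "spam", "spam"]

model = train(emails, labels)
prediction = classify(model, "claim your free prize")
print(prediction)
```

Notice that no filtering rule appears anywhere in the code. The only things that separate real email from spam are the word counts collected from the labeled examples.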
The computer came up with the rules on its own, based on the training data it saw. The really cool part about machine learning is that the same algorithm we used to classify email can be used to solve lots of other kinds of problems just by changing the data we feed into it. We don't have to change a single line of code. For example, instead of feeding in emails and marking them as spam or not spam, we could just as easily feed it pictures of handwritten numbers. The algorithm could decide which number each picture represents, whether it's a zero or a one or, in this case, an eight. The same algorithm that does email filtering can be used to do handwriting recognition.
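As a rough illustration of reusing one algorithm on different data, the sketch below trains the same two functions first on emails and then on tiny images. It assumes a nearest-centroid classifier, which is a stand-in chosen for brevity and not the algorithm used in the course. The email features (number of links, number of exclamation marks) and the 3x3 "digit" images are made up.

```python
def train(examples, labels):
    """Average the feature vectors of each label into one centroid per label."""
    sums, counts = {}, {}
    for features, label in zip(examples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Dataset 1 (made up): emails as [number of links, number of exclamation marks].
email_model = train([[0, 0], [1, 0], [7, 5], [9, 8]], ["real", "real", "spam", "spam"])

# Dataset 2 (made up): 3x3 black-and-white images flattened to 9 pixels.
zero = [1, 1, 1,
        1, 0, 1,
        1, 1, 1]
one = [0, 1, 0,
       0, 1, 0,
       0, 1, 0]
digit_model = train([zero, one], ["0", "1"])

email_label = predict(email_model, [8, 6])
smudged_one = [0, 1, 0,
               0, 1, 0,
               0, 1, 1]
digit_label = predict(digit_model, smudged_one)
print(email_label, digit_label)
```

The train and predict functions are identical for both problems. Only the data passed in changes.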
With traditional programming, you give the computer exact instructions on how to solve a problem. The computer can only do exactly what it has been programmed to do. With machine learning, it's different. The computer learns how to do new things without you having to explicitly program it. Instead, you show the computer data, and the computer learns from the data how to approximate functions that you would otherwise have had to program by hand. Machine learning is a great solution for many complex real-world problems that are hard to solve with traditional programming.
- [Instructor] Before we dive into more complicated machine learning algorithms, let's build the simplest program possible to estimate the value of a house based on its attributes. Open up simple_value_estimator.py. Here we have a function called estimate_home_value. The goal of this function is to estimate the price of a house based on its attributes. This function takes in two attributes to describe a house: the size of the house in square feet and the number of bedrooms in the house. At the end of the function, it returns the predicted value for the house. To predict the value for the house, all we have to do is decide how much the size of the house and the number of bedrooms affect the final value of the house.
Let's start by assuming that any house, no matter how small, is worth at least $50,000, so we can start with $50,000 as the initial value estimate. Next, we have to decide how much the size of the house plays into the final value. I'm going to guess that every square foot is worth $92, so we can say that the value is now the current value plus the size in square feet times 92. Next, let's look at the bedrooms. It seems reasonable to assume that houses with more bedrooms are worth more than houses with fewer bedrooms.
For every bedroom, I'm going to add, say, $10,000 in additional value. We'll say the value is now the value plus the number of bedrooms times 10,000. Finally, let's call this function by trying it on a real house we know costs $450,000. Down here we'll pass in 3800 square feet and five bedrooms. We can run this file by right-clicking and choosing Run. It's worth pointing out there's also a keyboard shortcut for running a file, but it might be different on your system. In my case, it's Control + Shift + F10.
It's also worth pointing out that if you click Run at the top here, it runs the last file, not necessarily the current file you're looking at, so it's easier just to right-click and choose Run. Down here it will open the console and show us the output of our program. Our program predicted the house is worth $449,600. That's really close to the $450,000 value we expected. Our estimator is working pretty well. In this example, all we did was take each input value and multiply it by a fixed weight. The weight for size in square feet was 92 and the weight for number of bedrooms was 10,000.
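The function walked through above would look roughly like this. The contents of simple_value_estimator.py are not shown in the transcript, so the parameter names size_in_sqft and number_of_bedrooms and the print line are assumptions. The base value and the two weights are the ones chosen in the video.

```python
def estimate_home_value(size_in_sqft, number_of_bedrooms):
    # Assume every house is worth at least this base amount.
    value = 50000
    # Add $92 for every square foot.
    value = value + size_in_sqft * 92
    # Add $10,000 for every bedroom.
    value = value + number_of_bedrooms * 10000
    return value


# Try the estimator on the house that is known to cost $450,000.
estimated_value = estimate_home_value(size_in_sqft=3800, number_of_bedrooms=5)
print("Estimated value:", estimated_value)
```

For 3800 square feet and five bedrooms, the arithmetic is 50,000 + 3800 × 92 + 5 × 10,000 = 50,000 + 349,600 + 50,000 = 449,600.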
In other words, we're saying that the real value of the house is some combination of its size and the number of bedrooms it has. The weights tell us how much each of those factors into the final calculation. The process of modeling the value of something with a set of fixed weights is called linear regression. It's one of the simplest machine learning algorithms, but with machine learning the computer comes up with the weights by itself by looking at the training data. In the next video, we'll learn how the computer can find the best weights on its own.
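The method used in the next video is not covered in this transcript. As one possible illustration, the sketch below finds the weights with ordinary least squares, solving the normal equations in plain Python. The helper names solve_linear_system and fit_weights are hypothetical, and the training data is synthetic: the prices are generated from the hand-picked weights (50,000 base, 92 per square foot, 10,000 per bedroom) so we can check that the fit recovers them.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    rows = [list(row) + [b] for row, b in zip(matrix, vector)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Back-substitute from the last row upward.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x


def fit_weights(houses, prices):
    """Least-squares fit of price = base + w_size * size + w_bedrooms * bedrooms."""
    # Prepend a constant 1 so the base value is learned like any other weight.
    features = [[1.0, size, bedrooms] for size, bedrooms in houses]
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(row[i] * row[j] for row in features) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * price for row, price in zip(features, prices)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic training data generated from the hand-picked weights.
houses = [(1000, 2), (1500, 3), (2400, 3), (3000, 4), (3800, 5)]
prices = [50000 + 92 * size + 10000 * bedrooms for size, bedrooms in houses]

base, per_sqft, per_bedroom = fit_weights(houses, prices)
print(round(base), round(per_sqft), round(per_bedroom))
```

With real sale prices the data would not fit a formula exactly, and the fitted weights would be the ones that minimize the squared error across all the houses.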
Online tool for writing and running Python:
http://pythonfiddle.com/