<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Burak Kanber&#039;s Blog</title>
	<atom:link href="http://burakkanber.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://burakkanber.com/blog</link>
	<description>Physics and Machine Learning in Javascript; Engineer and CTO of Tidal Labs</description>
	<lastBuildDate>Fri, 22 Mar 2013 19:36:28 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4</generator>
		<item>
		<title>Machine Learning: Naive Bayes Document Classification Algorithm in Javascript</title>
		<link>http://burakkanber.com/blog/machine-learning-naive-bayes-1/</link>
		<comments>http://burakkanber.com/blog/machine-learning-naive-bayes-1/#comments</comments>
		<pubDate>Wed, 20 Mar 2013 14:38:05 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=372</guid>
		<description><![CDATA[<p>Today we're going to solve a simple problem: language detection. Put another way: "given a piece of text, determine if it's in Spanish, English, or French".</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-naive-bayes-1/">Machine Learning: Naive Bayes Document Classification Algorithm in Javascript</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>This article is part of the <a title="Machine Learning in Javascript: Introduction" href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/" target="_blank">Machine Learning in Javascript</a> series which teaches the essential machine learning algorithms using Javascript for examples. I use Javascript because it&#8217;s well-known and universally supported, making it an excellent language to use for teaching. There’s a mailing list at the bottom of the page if you want to know about new articles; you can also <a href="https://twitter.com/bkanber" target="_blank">follow me on twitter: @bkanber</a>.</p>
<p>Are you just looking for the code example? <a href="http://burakkanber.com/blog/machine-learning-naive-bayes-1/#fiddle">Scroll down!</a></p>
<h2>Introduction and Motivation</h2>
<p>Document classification is one of my favorite tasks. &#8220;Document classification&#8221; is exactly what you think it is: given a document and a set of labels, apply the most appropriate label to that document. Labels (or &#8220;classes&#8221; or &#8220;categories&#8221;) can be things like:</p>
<ul>
<li>&#8220;spam&#8221; or &#8220;not spam&#8221; (most mail clients use some form of Bayesian spam detection)</li>
<li>&#8220;written by a male&#8221; or &#8220;written by a female&#8221; (yes, there are clues that can hint at the gender of the author)</li>
<li>&#8220;technology&#8221;, &#8220;politics&#8221;, &#8220;finance&#8221;, &#8220;sports&#8221; (if you need to automatically categorize old articles from a newspaper)</li>
</ul>
<p>Today we&#8217;re going to solve a simple problem: <strong>language detection.</strong> Put another way: &#8220;given a piece of text, determine if it&#8217;s in Spanish, English, or French&#8221;.</p>
<h2>The Very Basics of Natural Language Processing</h2>
<p>First, we&#8217;ll have to change the way we think about &#8220;documents&#8221;. At their most basic level, documents are just collections of arranged characters. Some characters may be letters, others are not. Letters make words, and words have meaning. Sometimes punctuation alters the meaning of words, other times it doesn&#8217;t matter too much! (See what I did there?) We typically study documents by studying the individual words in them. Sometimes, if the situation calls for it, we&#8217;ll look at more than one word at a time (word pairs are called &#8220;bigrams&#8221;), but since this is only &#8220;part 1&#8221; of this Bayes series we&#8217;ll only consider unigrams for now.</p>
<p>Because natural language itself is so quirky and complicated, we generally try to simplify things; we try to reduce the <em>entropy</em> of the system. If you consider both uppercase and lowercase letters, you can create 7.3 million different four-letter words. If you limit yourself to only lowercase letters, that figure drops down to only 450,000. Why is this important? You want your system to use as much data as it can, and treating &#8220;YOU&#8217;RE&#8221; and &#8220;you&#8217;re&#8221; and &#8220;You&#8217;re&#8221; separately only serves to keep what you learn about each in separate buckets. Converting all of those to a simple &#8220;you&#8217;re&#8221; is best, because it&#8217;ll allow you to study the <em>concept</em> of the word &#8220;you&#8217;re&#8221; rather than the various syntaxes of the word.</p>
<p>Of course, more advanced data scientists will recognize that sometimes you <em>do </em>want these quirks in your system. When detecting spam emails, for instance, it turns out that &#8220;offer&#8221; and &#8220;OFFER&#8221; are two very different things. Often, spam detection algorithms will <em>not </em>normalize based on case because of this. Similarly, &#8220;money back&#8221; and &#8220;money back!!!!&#8221; have two very different meanings, so spam detection algorithms will generally also leave the punctuation intact.</p>
<p>You may also decide that you want the meanings of the words &#8220;imagine&#8221;, &#8220;imagination&#8221;, and &#8220;imaginary&#8221; to be lumped together. To do so, you might chop off the ending of each, converting them to &#8220;imagin&#8221; &#8212; it&#8217;s neither verb nor noun, neither singular nor plural, it&#8217;s simply a concept. This is called &#8220;stemming&#8221;, and it serves to treat different words with the same meaning as the same entity. These entropy-reduction techniques are very important in machine learning as a whole, since training sets for learning algorithms are usually limited.</p>
<p>The process of splitting a document up into discrete chunks that you can study is called &#8220;tokenization&#8221;. For now, we&#8217;ll keep our tokenization simple: we&#8217;ll remove any punctuation, make everything lowercase, and split the document up by spaces to get our tokens.</p>
<h2>Bayes&#8217; Theorem</h2>
<p>I&#8217;m not going to talk about the math around <a href="http://en.wikipedia.org/wiki/Bayes'_theorem" target="_blank">Bayes&#8217; theorem</a>, because interested readers can easily learn the basics of probability and Bayes&#8217; theorem on their own. Instead, we&#8217;ll aim to develop an intuition about this important tool in probability.</p>
<p>We&#8217;re trying to figure out which language a previously unseen document is in. We have a stack of pre-labeled documents in English, French, and Spanish. Since we decided above that we can learn about a document by inspecting the individual words in that document, let&#8217;s start there.</p>
<p>If you look at a single word in a document, you can easily figure out how many times it appeared in your training data. Using that information, you can determine the probability that a certain language will use a given word. For example: &#8220;vous&#8221; is the first word in your document of unknown origin. &#8220;Vous&#8221; may show up in all of your French documents (100%), no Spanish documents, and a small number of English documents (5%). That&#8217;s a great hint! The document is French!</p>
<p>But no, we can&#8217;t stop there. Just because &#8220;vous&#8221; is clearly a French word doesn&#8217;t mean the document itself is French. It may be an English novel quoting a French character. So we can&#8217;t just simply look at the &#8220;probability that &#8216;vous&#8217; is French&#8221;, which is what we just tried. Instead, we need to determine the &#8220;probability that this document is French given that the word &#8216;vous&#8217; is in it&#8221;. Fortunately, Bayes&#8217; theorem does exactly that. If we apply Bayes&#8217; theorem to the fake numbers I gave above, we find that there&#8217;s a 98% chance that a document is French if &#8220;vous&#8221; appears in it. The formula to calculate that is quite simple; see <a href="http://en.wikipedia.org/wiki/Bayes'_theorem" target="_blank">Bayes&#8217; theorem</a>.</p>
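<p>To make that 98% figure concrete, here is a minimal sketch of the arithmetic. It assumes you trained on the same number of documents in each language, and the helper name <code>probabilityGivenWord</code> is made up for this illustration.</p>

```javascript
// Hypothetical helper: P(label | word), assuming the label and "everything else"
// are equally likely before we look at the word.
function probabilityGivenWord(pWordInLabel, pWordElsewhere) {
    return pWordInLabel / (pWordInLabel + pWordElsewhere);
}

// Made-up rates from the text: "vous" is in 100% of French documents,
// 0% of Spanish documents, and 5% of English documents.
var pVousInFrench = 1.0;
// With equal numbers of Spanish and English documents, the rate across
// all non-French documents is the average of the two.
var pVousElsewhere = (0.0 + 0.05) / 2;

var pFrenchGivenVous = probabilityGivenWord(pVousInFrench, pVousElsewhere);
console.log(pFrenchGivenVous.toFixed(3)); // 1 / 1.025, which prints 0.976
```

<p>Rounded to the nearest percent, that's the 98% quoted above.</p>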
<p>Of course, if the next phrase in the document is &#8220;said Jean Pierre, the French museum curator&#8221;, we know there&#8217;s a much smaller chance of the document being French. So we look at each word in turn and calculate the &#8220;probability that the document is (French|English|Spanish) given that this word is in it&#8221;. We combine those individual probabilities and end up with an overall &#8220;probability that this document is French [given that all these words are in it]&#8221;. If that probability is high enough, you can act upon it.</p>
<h2>Why it Works</h2>
<p>The Naive Bayes Classifier is so named because it assumes that each word in the document has nothing to do with the next word. That&#8217;s a naive assumption. But it turns out that, while naive, it&#8217;s actually a great simplifying assumption; studying words separately like this actually yields very good results. You could also study bigrams or trigrams (sets of two or three words at a time), at which point the classifier is no longer &#8220;naive&#8221; but it&#8217;ll require a much larger amount of training data and storage space.</p>
<p>The reason the NB classifier works well for document classification is that it de-correlates the number of times a word is seen in a given language from its statistical importance. The word &#8220;a&#8221; is found in many languages. Perhaps it even appears in 100% of your English training set. But that doesn&#8217;t mean that documents that have it are English. We use Bayes to convert the &#8220;probability that &#8216;a&#8217; appears in an English document&#8221; (which is 100%) to the &#8220;probability that this document is English because it has &#8216;a&#8217; in it&#8221; (maybe 50%).</p>
<p>Therefore, the common stuff that&#8217;s found everywhere is given a very weak significance and the stuff that&#8217;s found more uniquely across a category is given a much stronger weight. The end result is a very smart, simple algorithm that has low error rates (&#8220;low&#8221; being a relative term). It&#8217;s not magic, there&#8217;s no neural network, there&#8217;s no &#8220;intelligence&#8221;, it&#8217;s just math and probability.</p>
<h2>The Code: Training</h2>
<p>As usual, let&#8217;s just dive in. The first thing our document classifier needs to be able to do is train itself given a piece of text and a label for that text. Since I decided to build this classifier as generalized as possible, we won&#8217;t hard code the labels or even put a cap on the number of labels allowed. (Though naive Bayes classifiers with only two possible labels, like spam/ham, <em>are</em> a little bit easier to build.)</p>
<pre class="code">Bayes.train = function (text, label) {
    registerLabel(label);
    var words = tokenize(text);
    var length = words.length;
    for (var i = 0; i < length; i++)
        incrementStem(words[i], label);
    incrementDocCount(label);
};</pre>
<p>The <code>registerLabel</code> function simply adds the label to the "database" (in this case, localStorage) so that we can retrieve a list of labels later. </p>
<p>The <code>tokenize</code> function in this case is very simple. We'll look at more interesting tokenization techniques in another article (tokenization can be an important part of these classifiers), but this one is straight-forward:</p>
<pre class="code">var tokenize = function (text) {
    text = text.toLowerCase().replace(/\W/g, ' ').replace(/\s+/g, ' ').trim().split(' ').unique();
    return text;
};</pre>
<p>In this case, we use <code>unique()</code> because we're only interested in <em>whether</em> a word shows up in a document, and not the number of times it shows up. In certain situations, you may get better results by considering the number of times a word appears in a document. </p>
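<p>Note that <code>unique()</code> isn't a built-in array method, so it has to be defined somewhere before <code>tokenize</code> runs. If you'd rather not depend on it, here's a rough stand-alone sketch. The name <code>tokenizeUnique</code> is hypothetical, and the Unicode-aware regular expression assumes a reasonably modern (ES2018 or later) JavaScript engine.</p>

```javascript
// Hypothetical stand-alone tokenizer: lowercase, strip punctuation,
// split on spaces, and keep each token only once.
function tokenizeUnique(text) {
    var cleaned = text
        .toLowerCase()
        // Replace every run of characters that are not letters or digits
        // (in any script) with a single space.
        .replace(/[^\p{L}\p{N}]+/gu, ' ')
        .trim();
    if (cleaned === '') {
        return [];
    }
    // A Set drops duplicates while preserving first-seen order.
    return Array.from(new Set(cleaned.split(' ')));
}

console.log(tokenizeUnique("Où est la tortue? La tortue est ici!"));
// [ 'où', 'est', 'la', 'tortue', 'ici' ]
```

<p>One design difference worth knowing about: <code>\W</code> only treats A-Z, a-z, 0-9 and the underscore as word characters, so accented letters get replaced with spaces. The sketch above keeps them, which may matter for French and Spanish text.</p>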
<p>We then loop through each word (or "token"), and call <code>incrementStem</code> on it. This function is also very simple: it just records the number of times a word was seen for a given label. </p>
<p>Finally, we call <code>incrementDocCount</code>, which records how many documents we've seen for a given label.</p>
<p>The end result of training is that we have a database that stores each label we've ever seen (in our example, it'll hold "english", "spanish", and "french"), stores the number of times a word has been seen for a label (eg, "le" was seen in French documents 30 times), and stores the total number of documents for each label (eg, we saw 40 French documents).</p>
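<p>If it helps to picture that database, here's a minimal in-memory stand-in. This is not the localStorage code from the fiddle; <code>createMemoryStore</code> and its internals are assumptions made for illustration, though the method names mirror the helpers used in <code>Bayes.train</code>.</p>

```javascript
// Hypothetical in-memory stand-in for the localStorage "database".
function createMemoryStore() {
    // Object.create(null) avoids collisions with inherited keys like "constructor".
    var labels = Object.create(null);     // label -> true
    var docCounts = Object.create(null);  // label -> number of documents trained
    var stemCounts = Object.create(null); // label -> { word -> documents containing it }

    return {
        registerLabel: function (label) {
            labels[label] = true;
            docCounts[label] = docCounts[label] || 0;
            stemCounts[label] = stemCounts[label] || Object.create(null);
        },
        incrementStem: function (word, label) {
            stemCounts[label][word] = (stemCounts[label][word] || 0) + 1;
        },
        incrementDocCount: function (label) {
            docCounts[label] += 1;
        },
        getLabels: function () {
            return Object.keys(labels);
        },
        docCount: function (label) {
            return docCounts[label] || 0;
        },
        stemLabelCount: function (word, label) {
            return (stemCounts[label] && stemCounts[label][word]) || 0;
        }
    };
}

// Train on one tiny French "document".
var store = createMemoryStore();
store.registerLabel('french');
['le', 'chat'].forEach(function (word) {
    store.incrementStem(word, 'french');
});
store.incrementDocCount('french');
console.log(store.stemLabelCount('le', 'french'), store.docCount('french')); // 1 1
```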
<h2>The Code: Guessing</h2>
<p>Training a naive Bayes classifier is dead simple and really fast, as demonstrated above. Guessing a label given a document is a little tougher, but writing the algorithm is easy for those who understand probability. If you don't understand probability, that's ok; you can spend some time reading up on naive Bayes classifiers and you'll always have this example to come back to and study.</p>
<p>The first thing we'll do in our guessing function (other than initializing variables; see the JSFiddle for the minutiae) is a little bit of bookkeeping:</p>
<pre class="code">for (var j = 0; j < labels.length; j++) {
    var label = labels[j];
    docCounts[label] = docCount(label);
    docInverseCounts[label] = docInverseCount(label);
    totalDocCount += parseInt(docCounts[label], 10);
}</pre>
<p>Our goal here is to set ourselves up for calculating certain probabilities later. To do this, we need to know the number of documents we've seen for a given label (docCounts), but we also need to know the number of documents <em>not</em> in that label (docInverseCounts). Finally, we need to know the total number of documents we've ever seen. </p>
<p>You could flip this function upside down and get docInverseCount simply by subtracting a label's docCount from the totalDocCount -- in fact, that approach is better and faster, but I did it with an explicit docInverseCount function because it reads a little easier. </p>
<p>Given the above information we can determine, for example, the probability that any arbitrary document would be French. We also know the probability that any document is NOT French. </p>
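<p>Here's a small sketch of the subtraction approach mentioned above, along with the "probability that any arbitrary document is French". The document counts are made up for illustration, and <code>priors</code> is a name I'm introducing here.</p>

```javascript
// Hypothetical document counts per label, as training might leave them.
var docCounts = { english: 40, french: 40, spanish: 20 };

var labels = Object.keys(docCounts);
var totalDocCount = 0;
labels.forEach(function (label) {
    totalDocCount += docCounts[label];
});

var docInverseCounts = {};
var priors = {};
labels.forEach(function (label) {
    // Documents NOT in this label: everything minus this label's documents.
    docInverseCounts[label] = totalDocCount - docCounts[label];
    // Chance that an arbitrary training document carries this label.
    priors[label] = docCounts[label] / totalDocCount;
});

console.log(docInverseCounts.french, priors.french); // 60 0.4
```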
<pre class="code">for (var j = 0; j < labels.length; j++) {
    var label = labels[j];
    var logSum = 0;
    ...</pre>
<p>Next, we look at each label. We set up a <code>logSum</code> variable, which accumulates the log-odds contribution of each word; once every word has been counted, it is converted into the probability that the document is in this label's category. </p>
<pre class="code">for (var i = 0; i < length; i++) {
    var word = words[i];
    var _stemTotalCount = stemTotalCount(word);
    if (_stemTotalCount === 0) {
        continue;
    } else {
        var wordProbability = stemLabelCount(word, label) / docCounts[label];
        var wordInverseProbability = stemInverseLabelCount(word, label) / docInverseCounts[label];
        var wordicity = wordProbability / (wordProbability + wordInverseProbability);

        wordicity = ( (1 * 0.5) + (_stemTotalCount * wordicity) ) / ( 1 + _stemTotalCount );
        if (wordicity === 0)
            wordicity = 0.01;
        else if (wordicity === 1)
            wordicity = 0.99;
    }

    logSum += (Math.log(1 - wordicity) - Math.log(wordicity));
}
scores[label] = 1 / ( 1 + Math.exp(logSum) );</pre>
<p>The above is the meat of the algorithm. For each label we're considering, we look at each word in the document. The _stemTotalCount variable holds the number of times we've seen that word in <em>any</em> document during training. If we've never seen this word before, just skip it! We don't have any information on it, so why use it?</p>
<p><code>wordProbability</code> represents the "probability that this word shows up in a [French|English|Spanish] document". If you've seen 40 French documents, and 30 of them have the word "le" in them, this value is 0.75. <code>wordInverseProbability</code> is the probability that the word shows up in any other category than the one we're considering. </p>
<p>The funny <code>wordicity</code> variable is what happens when you apply Bayes' theorem to the two probabilities above. While the wordProbability variable represents "the probability that [le] shows up in a [French] document", the wordicity variable represents "the probability that this document is [French] given that [le] is in it". The distinction is subtle but very important. If you're having trouble understanding the distinction at this point, I strongly recommend saying those two phrases out loud and making sure you understand the difference before moving on.</p>
<p>The wordicity line above also makes the assumption that English, French, and Spanish documents are all equally common and starting off on the same footing. This assumption makes the calculation a little simpler, but you can consider the <em>a priori</em> probabilities of each language if you'd like. A note on that later.</p>
<p>The line below the wordicity definition is an optional <em>adjustment</em> for words that we've only seen in training a few times. If you've only seen a word once, for instance, you don't really have enough information about that word to determine if it's really French or Spanish. So we make a weighted adjustment: we bring the wordicity closer to 50% if we haven't seen it too many times. The "0.5" in that equation is the value we should try to adjust towards, and the "1"s in the equation are the weight -- if you increase this value, the wordicity will remain close to 0.5 longer. If you have a large training set, you can make the weight 5 or 10 or 20 or 50 (depending on how big your training set is). Since we have a very small training set, I made this value 1 but realistically I should have just omitted the line completely (my training set is only 15 paragraphs). I just wanted to show you that adjusting for rare words is something that you can do.</p>
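<p>To get a feel for how the weight behaves, here's the same adjustment pulled out into a function. The function and parameter names are my own for this sketch; <code>weight</code> and <code>assumed</code> correspond to the "1"s and the "0.5" in the original line.</p>

```javascript
// Pull a raw wordicity toward `assumed` when the word has only been seen a few times.
// Hypothetical helper; `weight` and `assumed` mirror the "1" and "0.5" in the original line.
function adjustWordicity(wordicity, timesSeen, weight, assumed) {
    return (weight * assumed + timesSeen * wordicity) / (weight + timesSeen);
}

// A word seen once, always in this label: pulled halfway back to 0.5.
console.log(adjustWordicity(1.0, 1, 1, 0.5));   // 0.75
// The same word seen 100 times: the adjustment barely matters.
console.log(adjustWordicity(1.0, 100, 1, 0.5)); // about 0.995
// A heavier weight keeps a rarely seen word closer to 0.5.
console.log(adjustWordicity(1.0, 1, 10, 0.5));  // about 0.545
```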
<p>Below that, we avoid letting wordicity be exactly 0 or 1 since we're about to use a log function on our data, and the log of 0 is negative infinity, which would swamp every other word's contribution. </p>
<p>The logSum line isn't really a part of the mathematical equations, but is rather a practical consideration. After calculating the wordicity for each word, we need to combine those probabilities somehow. The normal mathematical way to do that would be to multiply all of the probabilities together and divide by that product plus the product of all the inverses (each inverse being one minus the probability). Unfortunately, floating point math isn't perfect and multiplying many small numbers can cause "floating point underflow", where the number gets too small for floating point math to represent. So instead, we take the log of each inverse and each probability, and add up the differences. </p>
<p>Finally, after we've combined all the individual word probabilities with the logSum line, we undo the log function we just used to get the probability back in the 0 to 1 range. Note that this happens outside of the "look at each word" loop but still inside the "look at each label" loop.</p>
<h2>Important Assumptions and Caveats</h2>
<p>One thing I'd like to point out is that the above makes a big simplifying assumption: we've assumed that English, French, and Spanish documents are all <em>equally likely</em> to appear. In our example, this is a good assumption since you guys are probably going to test one of each later, but in the real world this isn't necessarily true. </p>
<p>The Bayes classification algorithm does actually let you consider the <em>a priori</em> probability of a document's language (meaning, the probability that a document is English just based on the number of English documents out there, before considering the actual contents of the document), but I've simply left this out. It's not too hard to put in; adding just a few more terms to the wordicity calculation can do this. The next Bayes article I write will use the full form of Bayes' theorem.</p>
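<p>As a rough idea of what those extra terms look like for a single word, here's a sketch of Bayes' theorem with the prior left in. The function name and its <code>prior</code> parameter are assumptions for illustration. This only covers the per-word step; how the per-word values get combined afterwards may need adjusting too.</p>

```javascript
// Hypothetical variant of the wordicity calculation that weights each side by its prior.
// prior is P(label); (1 - prior) is P(not label).
function wordicityWithPrior(wordProbability, wordInverseProbability, prior) {
    var numerator = wordProbability * prior;
    var denominator = numerator + wordInverseProbability * (1 - prior);
    return numerator / denominator;
}

// With a prior of 0.5 this reduces to the equal-footing formula.
console.log(wordicityWithPrior(0.75, 0.25, 0.5)); // 0.75
// If only 10% of documents carry the label, the same evidence is less convincing.
console.log(wordicityWithPrior(0.75, 0.25, 0.1)); // about 0.25
```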
<p>Finally, please note that I was <em>really lazy</em> while training this algorithm. You can see from the JSFiddle that I've only used 5 paragraphs from each language to train the thing on. That's not nearly enough to go by, as the words seen during training are the only words it knows. I've found that this example <em>does</em> work well if you type in sentences or paragraphs (try just copy/pasting stuff from news sites), but simple nouns and phrases probably won't work. For example, you'll get the wrong result for "la tortuga" ("the turtle" in Spanish) simply because we never showed it the word "tortuga" before. The algorithm will guess French in this case because it's seen slightly more "la"s in French than it's seen in Spanish. A larger training set would fix this issue.</p>
<h2>Results</h2>
<p>We're basically done. All we have to do now is either report all the labels' probabilities or just pluck out the highest one. Try pasting some English, French, or Spanish news text in the JSFiddle below -- you should see the guessed language and the probability that led us to guess that language. This example works better with sentences or paragraphs; the more words you give it to guess by, the better the chance that it has seen one of those words in its limited training set.</p>
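<p>Plucking out the highest one can be as small as this. The function name and the scores are made up for the sketch; the <code>scores</code> object has the same label-to-probability shape that the guessing loop fills in.</p>

```javascript
// Return the label with the highest score, or null if there are no scores.
function bestLabel(scores) {
    var best = null;
    Object.keys(scores).forEach(function (label) {
        if (best === null || scores[label] > scores[best]) {
            best = label;
        }
    });
    return best;
}

console.log(bestLabel({ english: 0.02, french: 0.97, spanish: 0.31 })); // french
```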
<p>So far, I've had 100% accuracy when copying and pasting sentences from news sites. Try it below and see for yourself!</p>
<p>If you like this article, or thought it was helpful, please consider:</p>
<ul>
<li>Sharing it with friends on Facebook and Twitter</li>
<li>Discussing it on Hacker News and Reddit</li>
<li>Discussing it in the comments section below</li>
<li><a href="//twitter.com/bkanber" target="_blank">Following me on Twitter @bkanber</a></li>
<li><a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/">Reading my other ML in JS articles</a></li>
<li>Signing up for the ML in JS mailing list below &mdash; I only send emails about new articles, never spam, won't ever sell your info</li>
</ul>
<p><iframe id="fiddle" width="100%" height="300" src="http://jsfiddle.net/bkanber/gNrdF/embedded/result,js,html,css/" allowfullscreen="allowfullscreen" frameborder="0"></iframe></p>
<p><!-- Begin MailChimp Signup Form --></p>
<link href="http://cdn-images.mailchimp.com/embedcode/classic-081711.css" rel="stylesheet" type="text/css">
<div id="mc_embed_signup">
<form action="http://burakkanber.us5.list-manage.com/subscribe/post?u=3269dc566d381f05852d92aa8&amp;id=3f5451bda5" method="post" id="mc-embedded-subscribe-form" name="mc-embedded-subscribe-form" class="validate" target="_blank" novalidate>
<h2>Email me when new ML in JS articles are posted</h2>
<div class="mc-field-group">
	<label for="mce-EMAIL">Email Address </label><input type="email" value="" name="EMAIL" class="required email" id="mce-EMAIL">
</div>
<div class="mc-field-group input-group">
    <strong>Survey: Would you buy an ML in JS e-book? </strong><br />
    <input type="checkbox" value="1" name="group[6613][1]" id="mce-group[6613]-6613-0"><label for="mce-group[6613]-6613-0">I would pay $10 for a DRM-free e-book with tons of ML lessons and JS examples.</label>
</div>
<div id="mce-responses" class="clear">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
<div class="clear"><input type="submit" value="Subscribe" name="subscribe" id="mc-embedded-subscribe" class="button"></div>
</form>
</div>
<p><!--End mc_embed_signup--></p>
]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-naive-bayes-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>I was just tricked into learning another language</title>
		<link>http://burakkanber.com/blog/i-was-just-tricked-into-learning-another-language/</link>
		<comments>http://burakkanber.com/blog/i-was-just-tricked-into-learning-another-language/#comments</comments>
		<pubDate>Tue, 29 Jan 2013 04:27:24 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=357</guid>
		<description><![CDATA[<p>This is a follow-up to my "Effective teaching is a long-con" article, except in the context of web apps rather than classrooms. In the previous article I basically made the argument that you need to trick people into learning -- and that's how the best teachers do it. Now I'm writing because I recently had an experience where I, myself, was tricked (by a website) into learning.</p><p>The post <a href="http://burakkanber.com/blog/i-was-just-tricked-into-learning-another-language/">I was just tricked into learning another language</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>This is a follow-up to my &#8220;<a title="Effective Teaching is a Long Con" href="http://burakkanber.com/blog/effective-teaching-is-a-long-con/">Effective teaching is a long-con</a>&#8221; article, except in the context of web apps rather than classrooms. In the previous article I basically made the argument that you need to trick (some) people into learning &#8212; and that&#8217;s how the best teachers do it. Now I&#8217;m writing because I recently had an experience where I, myself, was tricked (by a website) into learning.</p>
<p>I just discovered <a title="Duolingo" href="http://www.duolingo.com" target="_blank">Duolingo</a>. If you&#8217;re not familiar, Duolingo is a language learning web app. You sign up and decide to learn Spanish (or something else), and you&#8217;re taken through a series of lessons that combine flash cards, translation, speaking, and listening. It&#8217;s basically what Rosetta Stone would have evolved into if they didn&#8217;t have so much cultural inertia.</p>
<h2>Tricked into Learning</h2>
<p>Here&#8217;s what I noticed about Duolingo, and how it managed to suck me in:</p>
<ol>
<li><strong>It&#8217;s well-designed</strong> and up-to-date looking. Just think about how Rosetta Stone looks, and then look at Duolingo.</li>
<li><strong>It&#8217;s so very easy to use</strong>. They spent an extra few hours building in keyboard shortcuts. I love that because I&#8217;m a keyboard-shortcut kinda guy. I hate moving my hands over to the mouse or trackpad.</li>
<li><strong>It uses a gamification structure</strong>. <em>But not just badges</em>. It uses a points system and a skill tree, just like every RPG I&#8217;ve ever played and loved (Diablo 2, Oblivion, etc). You don&#8217;t just add dumb badges to your profile; instead you progress through a skill tree <em>as you develop actual skills</em>. This is seriously awesome when you think about it; this web app&#8217;s skill tree is actually representative of your real life skills!</li>
<li><strong>It&#8217;s free</strong>. Obviously learning another language is valuable. But I wouldn&#8217;t go spend money to do it because, well, I never really felt like spending money to learn a language. But Duolingo found another (clever) way to monetize &#8212; you help them translate real world text as you learn! What a great idea, and it takes away a huge barrier to entry. I tried it for free, liked it, and then I stuck around. Low barrier to entry, high stickiness factor.</li>
<li><strong>It&#8217;s well-executed all-around. </strong>Everything about it is smooth and easy, and it&#8217;s actually fun. It just works. It does exactly what you expect it to.</li>
</ol>
<p>Here&#8217;s my point: I had no idea I wanted to learn Spanish (and German) until I played with this. It&#8217;s fun and easy and the app&#8217;s value made itself immediately apparent.</p>
<p>I was literally tricked into learning Spanish.</p>
<h2>Comparison to Codecademy</h2>
<p>Hopefully other players in the education space will learn from Duolingo, but it&#8217;s also hard to tell if this will work for other fields of study. Codecademy is pretty good at this kind of thing too, but there&#8217;s something holding it back. Everybody already knows at least one spoken language, so everybody that uses Duolingo is familiar with the core concepts of communication. Codecademy&#8217;s userbase is programming novices, however, and many of those core concepts (logic, program flow, functions) are foreign to new students. It&#8217;s harder for someone to just dive in and stick around because the learning curve is steeper. Codecademy does a good job of shallowing the curve, but it&#8217;s still steep.</p>
<p>This isn&#8217;t Codecademy&#8217;s fault. They&#8217;re excellent. But it&#8217;s obvious that there are fundamental differences in the course material that make &#8220;gamified learning&#8221; a little tougher with programming than with language.</p>
<h2>Comparison to Coursera</h2>
<p>I&#8217;d also like to compare Duolingo to Coursera. I&#8217;m sure we can all agree that Coursera is amazing and wonderful all around, but their model has a limitation of its own: Coursera is great at teaching people who <em>want to learn</em> (and have the discipline to do so), but that doesn&#8217;t describe the general public. Coursera makes smart, driven people smarter. Duolingo, on the other hand, can take <em>anyone </em>and teach them a second language.</p>
<p>The obvious best-case scenario would be to somehow combine Coursera&#8217;s depth and breadth with the ease of use and stickiness of Duolingo. That doesn&#8217;t seem possible. It probably isn&#8217;t. But why the heck not? What is it about education that forces us to use so many different teaching and learning methods at so many different points in our learning journey?</p>
<p>I&#8217;m waiting for some player to come around and flip education on its head. Coursera et al. are all amazing, but that head-flipping aspect isn&#8217;t there. Duolingo is, in my mind, the best jump we&#8217;ve seen in language learning since Rosetta Stone came out, but their techniques aren&#8217;t applicable globally. I wonder what education innovation will come next. I wonder if it&#8217;ll ever come.</p>
<p>And the moral of this story, if you&#8217;re a teacher: make sure you trick your students into learning.</p>
<p>&nbsp;</p>
<p>PS: I have no affiliation with Duolingo. I just really love what they&#8217;re doing and appreciate their approach. If you want to learn a language, please try them. I&#8217;d like to see them stick around for a while.</p>
<p>PPS: I have nothing against Rosetta Stone.</p>
]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/i-was-just-tricked-into-learning-another-language/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Big data doesn&#8217;t need to be so big.</title>
		<link>http://burakkanber.com/blog/big-data-doesnt-need-to-be-so-big/</link>
		<comments>http://burakkanber.com/blog/big-data-doesnt-need-to-be-so-big/#comments</comments>
		<pubDate>Wed, 07 Nov 2012 00:46:33 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=344</guid>
		<description><![CDATA[<p>There are lots of people talking about big data these days. There's a lot of discussion about how to build apps for "web scale", and there's an emphasis on real time apps that collect comprehensive data.</p><p>The post <a href="http://burakkanber.com/blog/big-data-doesnt-need-to-be-so-big/">Big data doesn&#8217;t need to be so big.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>There are lots of people talking about big data these days. There&#8217;s a lot of discussion about how to build apps for &#8220;web scale&#8221;, and there&#8217;s an emphasis on real time apps that collect comprehensive data.</p>
<p>This article is me playing devil&#8217;s advocate.</p>
<p>I think the big data hype is hurting our ability to think critically about our problems. Obviously this isn&#8217;t the case everywhere. Many engineers are excellent and brilliant and do the right thing. But many others simply brute-force their problems into submission.</p>
<p>These days it&#8217;s possible to cheaply collect and analyze billions of data points on commodity resources. On the surface that seems like a good thing, but we should keep in mind that many of the problems we solve today were solvable yesterday too. We went to the moon on a 2MHz processor and 4KB of RAM.</p>
<p>Think about what a feat that was! Think about the problem solving skills that took!</p>
<p>If we could do that, why can&#8217;t we just get by on a million data points instead of a billion? Have we already forgotten statistics? When was the last time you actually asked yourself if you need big data? Unless you actually need to record every single data point &#8212; if that&#8217;s your core value proposition &#8212; I&#8217;m sure you can get by on a random sample.</p>
<p>I&#8217;d be willing to bet that any data source that can provide a billion data points will still be statistically significant if you were to sample only 1% or 0.1% of the source.</p>
<p>Big data doesn&#8217;t need to be so big. Even Google Analytics uses sampling for datasets larger than 100,000.</p>
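<p>To make the sampling argument a little more concrete, here&#8217;s a minimal sketch with made-up data. The function names <code>samplePoints</code> and <code>mean</code> and the uniformly random &#8220;population&#8221; are assumptions for illustration only. It keeps each value with a 1% probability and compares the average of the sample to the average of the full set.</p>

```javascript
// Keep each point with probability `rate` (simple Bernoulli sampling).
function samplePoints(points, rate) {
	var sample = [];
	for (var i = 0; i < points.length; i++) {
		if (Math.random() < rate) {
			sample.push(points[i]);
		}
	}
	return sample;
}

// Arithmetic mean of an array of numbers.
function mean(values) {
	var sum = 0;
	for (var i = 0; i < values.length; i++) {
		sum += values[i];
	}
	return sum / values.length;
}

// Hypothetical data: one million values spread evenly between 0 and 100.
var population = [];
for (var n = 0; n < 1000000; n++) {
	population.push(Math.random() * 100);
}

// Keep roughly 1% of the population.
var sample = samplePoints(population, 0.01);

console.log('full mean:   ', mean(population));
console.log('sample mean: ', mean(sample));
console.log('sample size: ', sample.length);
```

<p>On evenly spread data like this the two averages usually agree to within about a point. Heavily skewed data, or questions about rare events, need a more careful sampling plan than this.</p>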
<p>My suggestion to you, then, is to really evaluate your options before you spin up 50 new EC2 instances. If you&#8217;re a startup and funds are slim, or if your data population is very dense, then brush up on your statistics and figure out if you can get away with sampled or aggregate data. Please don&#8217;t think that you can avoid doing smarter work just because servers are getting cheap.</p>
<p>Like I said, this is a devil&#8217;s advocate post. It&#8217;s meant as food for thought. If you really do need all that data, then ignore this. I won&#8217;t defend this article to the death. But if you&#8217;ve never thought about this sort of thing, it&#8217;s time to start. Don&#8217;t get caught up in the hype of big data, and just do what&#8217;s best for the project!</p>
<p>The post <a href="http://burakkanber.com/blog/big-data-doesnt-need-to-be-so-big/">Big data doesn&#8217;t need to be so big.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/big-data-doesnt-need-to-be-so-big/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do we have a moral obligation to society?</title>
		<link>http://burakkanber.com/blog/do-we-have-a-moral-obligation-to-society/</link>
		<comments>http://burakkanber.com/blog/do-we-have-a-moral-obligation-to-society/#comments</comments>
		<pubDate>Sat, 03 Nov 2012 04:04:35 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[updateme]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=329</guid>
		<description><![CDATA[<p>I just spent 48 hours without electricity. I'm not complaining; I could have had it much worse (many people in Staten Island did) and very fortunately nobody I know was hurt or lost their home. But my experience without power got me thinking.</p><p>The post <a href="http://burakkanber.com/blog/do-we-have-a-moral-obligation-to-society/">Do we have a moral obligation to society?</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>I just spent 48 hours without electricity. I&#8217;m not complaining; I could have had it much worse (many people in Staten Island did) and very fortunately nobody I know was hurt or lost their home. But my experience without power got me thinking.</p>
<p>I was sitting around in the dark thinking about whether ConEd would give us a pro-rated discount for the 48 hours we went without power. (I don&#8217;t care if they do or not, I was just curious about the mundane.) Then I realized that 48 hours without power amounts to about $6 on my bill.</p>
<p>Six dollars? That&#8217;s it? If I could have paid a premium to keep my power on during Hurricane Sandy I gladly would have paid maybe $50 or even $100 just for that $6 worth of electricity. Isn&#8217;t that amazing? Electricity is so valuable to us and yet so cheap. And also so inaccessible to the undeveloped world.</p>
<p>Of course, I wouldn&#8217;t be able to pay a sustained $1,500 a month for electricity, but the fact that I&#8217;d shell out $100 to keep my power on during a storm made me realize just how valuable it is. And that&#8217;s not an easy feat. I used to be a hybrid car engineer, and I worked on several renewable energy projects in my grad school years. I always felt I had a grounded, solid understanding of how valuable electricity is.</p>
<p>But actually living without it gave me a new perspective. It made me think about places where power isn&#8217;t a given, like my grandparents&#8217; house in the mountains in Turkey. Power comes and goes there and it&#8217;s just a way of life. But if that were to happen here it wouldn&#8217;t be acceptable.</p>
<p>Most people like to think that they&#8217;re empathetic to poorer nations&#8217; way of life. But are we really? I always thought I maintained the appropriate level of gratefulness and empathy, but after just 48 hours with no power I started to go crazy. It was demoralizing, sitting in the dark, cold and helpless and cut off.</p>
<p>And now I have an existential crisis. I used to be a mechanical engineer. I have my master&#8217;s degree in hybrid car design and I&#8217;ve also done quite a bit of work in renewable energy. And I left that behind to do what? Software engineering. I make websites and write code for the first world. I turned my back on my altruistic goals of helping the environment and designing clean tech because &#8220;it wasn&#8217;t for me&#8221; and &#8220;I wanted to be my own boss&#8221;.</p>
<p>Is that fair? Is that a decision I&#8217;m morally allowed to make? I have the skills to help other people out but instead I&#8217;m running a startup and writing on my blog. Should I feel guilty? Do I have a moral obligation to use my engineering skills to give back to the world in a bigger way? I don&#8217;t know. I enjoy software more&#8211;which is why I switched. I have more experience in software than mechanical engineering, but that may be a moot point: having a couple of degrees and being published in hybrid vehicles engineering may have made me more valuable to the sustainable engineering world than I am to the web startup world.</p>
<p>And if I&#8217;m more valuable to the sustainable world &#8212; if they <em>need</em> me more &#8212; than the startup world, am I <em>supposed</em> to be doing sustainable engineering with my life? Do I get to just recklessly choose which I pursue? Or am I supposed to go the way that needs me more?</p>
<p>Do what you love, or do what you&#8217;re good at?</p>
<p>I think I&#8217;ll stick to what I love for now (software and startups), but if anyone wants to buy my company for a few hundred million (it&#8217;s worth it!) then I promise I&#8217;ll follow in Elon Musk&#8217;s footsteps and focus on the rest of the world.</p>
<p>The post <a href="http://burakkanber.com/blog/do-we-have-a-moral-obligation-to-society/">Do we have a moral obligation to society?</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/do-we-have-a-moral-obligation-to-society/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>SiteChat: a postmortem. Or, the rise and fall of a society.</title>
		<link>http://burakkanber.com/blog/sitechat-a-postmortem-or-the-rise-and-fall-of-a-society/</link>
		<comments>http://burakkanber.com/blog/sitechat-a-postmortem-or-the-rise-and-fall-of-a-society/#comments</comments>
		<pubDate>Sat, 20 Oct 2012 21:19:20 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=321</guid>
		<description><![CDATA[<p>About a year ago I built a Chrome extension called SiteChat -- the premise was simple: turn every website into a chatroom. The app was an instant success, and over the following year I watched entire societies emerge and die off inside the bizarre ecosystem that I had created.</p><p>The post <a href="http://burakkanber.com/blog/sitechat-a-postmortem-or-the-rise-and-fall-of-a-society/">SiteChat: a postmortem. Or, the rise and fall of a society.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>About a year ago I built a Chrome extension called <a target="_blank" href="https://chrome.google.com/webstore/detail/sitechat/affmlbhoebcjponkmlmoeinojjcggbnk">SiteChat</a> &#8212; the premise was simple: turn every website into a chatroom. The app was an instant success, and over the following year I watched entire societies emerge and die off inside the bizarre ecosystem that I had created.</p>
<h2 dir="ltr">Genesis and Technical Stuff</h2>
<p>I got bored one day and decided to learn how to build Chrome extensions. I also wanted to play with the nascent WebSockets protocol, so I figured I’d combine those two pieces into some kind of experiment.</p>
<p>As context, I’ll mention that my tech friends often make fun of me because I write software entirely from scratch. Something about using a black box third party library makes me uneasy. And what’s the point of “learning” WebSockets if you’re not actually going to implement the protocol yourself? Otherwise, you’re just learning somebody else’s WebSockets library. My friends call it “overkill”; I call it learning!</p>
<p>So I wrote a WebSockets server daemon in PHP and threw together a Chrome extension called SiteChat. Pure laziness dictated that the app wouldn’t require registration; all messages were passed through the WS server without being stored. That turned out to be very helpful; SiteChat was now essentially anonymous and registration-free. Chatters could enter a nickname, but since there was no registration that nickname could be used by anybody else in the system&#8230; which eventually led to some amusing trolling in the system.</p>
<p>Version 1 of SiteChat took a day and a half to develop and release in the Chrome store. I posted a link to it on Hacker News and Reddit, took a nap or ate dinner or something, and woke up to thousands of new users.</p>
<p>Over the course of the next few days something like 100,000 people tried SiteChat. The night I released it, I had to scramble and write a new protocol on top of WebSockets to handle server bridging. I was committed to the anonymous, push-only aspect of SiteChat by this point, so bringing up a DB server as a message store was not possible&#8211;even though that would have easily solved my problem of quickly needing to bring up multiple WebSocket servers. Regardless, hacking together a bridging protocol wasn’t too bad and I was able to keep SiteChat up in the face of tens of thousands of concurrent users all with permanently open TCP connections to my servers.</p>
<p>I’m actually very happy about the way my rushed, experimental, hacked-together-in-hours PHP WebSockets stack worked out. While I only have the one WS server running today, the system proved to be reliable and maintenance-free and has been going strong for over a year now. I think there was a period of downtime for a few days at some point, but that was because I switched DNS providers and forgot to update the SiteChat DNS entry. Oops! But the WS stuff never had any issues.</p>
<h2 dir="ltr">Critical Reception</h2>
<p>When I posted SiteChat on HN, I got a whole lot of “this will never stick, somebody tries it every year and it always fails.” I expected that to be true, but I didn’t really care either way. SiteChat was a side project that I had no emotional or financial investment in, so if it failed, whatever! I learned Chrome extensions and WebSockets, and it was fun to work on.</p>
<p>However, I became really excited when I saw the user stats. People were using this thing! The reddit.com, ycombinator.com, facebook.com, and gmail.com chatrooms were the most popular, with hundreds of people in each room and dozens of active chatters on at all hours of the day.</p>
<p>Over the next few weeks I remained very involved with SiteChat. I was online all day, chatting away. I also worked on some nice new features, my favorite being the <a href="http://sitechatapp.com/chat/" target="_blank">Pop-Out/Mobile version of the app</a>. Lots of Firefox users complained that they couldn’t use SiteChat, and lots of people wanted to chat in a room while browsing other sites, so I made a standalone interface to SiteChat. It also (still) works on iOS and Android browsers, which is really cool.</p>
<p>(I’m getting excited just talking about SiteChat again. I really loved this app!)</p>
<p>But far more interesting than the technology was the community that formed. Each of the major chatrooms(/domains) on SiteChat had its own unique ecosystem. Reddit was for casual chatting. Hacker News was for techies. Facebook had people looking for cybersex. Long, deep conversations at a (generally) high level of discourse were held, and that lasted for a little over two months.</p>
<h2 dir="ltr">Real Life and Mythologies</h2>
<p>After a few weeks of coddling SiteChat, real life struck again and I no longer had hours a day free to chat online. So I stopped engaging with the app for a month(?&#8230; the timeline is fuzzy), but the little chat notification numbers in my browser told me that SiteChat was still alive and kicking.</p>
<p>A month or two went by and I’d all but forgotten about the app. One day I went back on, and it felt a little sad. While not quite a ghost town, SiteChat did resemble Detroit a few years back. It clearly had once been a great, booming society, but was also clearly on its way out. The regulars I used to know were no longer there&#8211;though there was a group of new regulars.</p>
<p>I hopped onto the reddit.com chatroom and said hi. Immediately, people asked me why my name was green! (I had the only reserved nickname, “bkanber” in green lettering). Even that was a bit of a shock to me. Everybody had known me in the past. Nobody knew me now, except for one guy who piped up “he’s the creator of SiteChat”. And then what happened was really strange.</p>
<p>It became clear to me that there was a lot of mythology floating around the SiteChat community. It seems that in the two-month hiatus I took, roughly three generations of users had gone by; just enough for stories to be handed down and distorted a bit. Some people thought I was some kind of rebel in an oppressed country, and I built SiteChat in order to exercise free speech (I’m from New York). Some people thought I was a basement-dwelling version of the comic book guy from The Simpsons (I’m social, and have decent genetics). Some people outright revered me. It was clear that SiteChat had alleviated social pain-points for a lot of people that lived in rural areas, and they thanked me profusely.</p>
<p>I realized at that point that online chatrooms weren’t really popular anymore like they used to be in the past with AOL and IRC. But there was still a small market for it, and SiteChat filled that niche temporarily.</p>
<p>I then learned something very intriguing. SiteChat didn’t have moderators of any kind, except for a simple rate-limiting system that prevented excessive abuse. While I never saw trolling become a huge problem, it was there. I then learned that, in my absence, a group of users formed a SiteChat vigilante group called something like “the Templar” or “the Agents” or something (I forget), with the sole purpose of bullying trolls into leaving SiteChat rooms. Now that’s weird.</p>
<p>Given an open and free ecosystem, a group of users spontaneously decided to come together and protect their habitat. They defended the weak and innocent against bullies. Now, I think we all know that given an open ecosystem, bullies always appear. But other than some isolated “white-knighting”, I never expected the ecosystem to regulate itself in such a manner. That really impressed me, and left me speechless for some time.</p>
<h2 dir="ltr">Extinction of a Society</h2>
<p>I did end up disengaging myself from SiteChat again. I knew that SiteChat would eventually disappear, but in the interim the community seemed like it could take care of itself without any intervention from me.</p>
<p>A month or two later, SiteChat usage was essentially down to zero and remained there for several weeks. Then the second wind came. About 10,000 new users joined up again, stabilized to 5,000 active users, and there was some good activity for a few months. I discovered that these users were primarily young people (13- and 14-year-olds) whose parents didn’t let them IM or use Facebook yet. SiteChat, freely available from the Chrome store, was their loophole. The level of discourse on SiteChat was at an all-time low. I quickly disengaged again, and haven’t been back since.</p>
<p>Today the Chrome store shows 5,000 SiteChat users, but with only a hundred messages sent each day the once booming society is essentially extinct.</p>
<p>Could I revive it with a marketing campaign? Maybe. Will I? Probably not. Like Hacker News said from the very beginning, somebody tries this every year or so and fails. In fact, the reason I’m writing this is because I recently met with a young startup that wrote a very similar app. Will they have their moment in the sun? Maybe. I don’t know.</p>
<p>I think what made SiteChat compelling for a while was its low barrier to entry. Thousands of people clicked on the Chrome store link when I first posted it, and a very high percentage of people converted. Installing a Chrome extension is really easy, and most users installed it and immediately saw people chatting. They were able to start chatting as “Anon 5032” or something as soon as they installed it. If they were interested in chatting more, it took just a second to type in a nickname. You didn’t have to put in an email address, or click a confirmation link, or create a password.</p>
<p>SiteChat relies on the network effect, and so it falls victim to the chicken-and-egg problem. You wouldn’t use SiteChat if no one else is using it, and if no one else is using it, it would attract no new users. The low barrier to entry was critical. Having, say, 1000 users online in the first hour created a society that lasted (in significant form) for roughly six months. Had only 250 users been online in the first hour, I can guarantee that SiteChat would have been dead in six days. So in this instance, that high conversion rate in the first hour was of the utmost importance. If you were to launch SiteChat in parallel universes, the “number of users in the first hour vs longevity” chart would certainly be exponential.</p>
<p>I had another major advantage: I didn’t care about monetizing SiteChat. I can’t tell you how many people asked me to turn it into a Live Support application (that supports file uploads and webcam!). I also received a weak offer to purchase SiteChat, which I didn’t pursue. Had I tried to monetize SiteChat, I would have killed it. Using advertisements, requiring registration, or bloating it with features all could have made it monetizable, but would have immediately destroyed the userbase.</p>
<p>Fortunately, SiteChat only cost me about $200 over the past 14 months. And that price tag was well worth the experience of watching an app hit its stride, watching an entire society emerge, and subsequently experiencing the extinction of an entire species of SiteChat users.</p>
<p>The post <a href="http://burakkanber.com/blog/sitechat-a-postmortem-or-the-rise-and-fall-of-a-society/">SiteChat: a postmortem. Or, the rise and fall of a society.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/sitechat-a-postmortem-or-the-rise-and-fall-of-a-society/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Machine Learning: k-Means Clustering Algorithm in Javascript</title>
		<link>http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/</link>
		<comments>http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/#comments</comments>
		<pubDate>Mon, 15 Oct 2012 03:42:04 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=294</guid>
		<description><![CDATA[<p>Today we're going to figure out how to find clusters of data points. Let's say you work at a medical imaging devices company. Imagine you already have a way to identify malignant cells from an image scan, but it would be great to automatically identify the centers of clusters of cells as well. Then a robot could go in with surgical precision and remove the problem!</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/">Machine Learning: k-Means Clustering Algorithm in Javascript</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>This article is part of the <a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/" title="Machine Learning in Javascript: Introduction" target="_blank">Machine Learning in Javascript</a> series. The series covers some of the essential machine learning algorithms and assumes little background knowledge. There&#8217;s also a mailing list at the bottom of the page if you want to know about new articles; you can also <a href="https://twitter.com/bkanber" target="_blank">follow me on twitter: @bkanber</a>. </p>
<p>Are you just looking for the code example? <a href="#fiddle">Scroll down!</a></p>
<h2>Introduction and Motivation</h2>
<p>Machine learning helps us navigate and process large volumes of data. We can ask all sorts of questions about our data, and hope that ML can answer them: what is this data point most similar to? Does the data come in patterns? Can we predict what will happen in the future, given past trends? These questions are applicable to all fields of study.</p>
<p>Today we&#8217;re going to figure out how to find clusters of data points. Let&#8217;s say you work at a medical imaging devices company. Imagine you already have a way to identify malignant cells from an image scan, but it would be great to automatically identify the centers of clusters of cells as well. Then a robot could go in with surgical precision and remove the problem!</p>
<p>What we&#8217;re looking for is a clustering algorithm; today we&#8217;re going to talk specifically about the k-means algorithm.</p>
<h2>Clustering</h2>
<p>Clustering algorithms, in general, find groups of similar pieces of data. If you run an online store you might use a clustering algorithm to identify different shopper types. You may find that you have one type of visitor that just window shops through 3-5 pages of products and leaves. Another group might make meticulous purchasing decisions by looking through 15 pages of products and reviews and end up making only one, high-value purchase. And you may also identify the impulse buyer, who makes numerous small purchases without browsing too deeply. Once you&#8217;ve identified your e-shopper demographics, you&#8217;re better able to optimize your site to increase sales. You can release features that appeal to your impulse buyers, because now you know that you <em>have </em>impulse buyers!</p>
<p>And while that&#8217;s just one practical example of k-means, you&#8217;ll find this algorithm used in multiple fields. Sometimes it&#8217;s just image processing in 2 dimensions, other times it&#8217;s processing huge data across dozens of dimensions and parameters. Like our k-nearest-neighbor algorithm, k-means is versatile, simple to understand and implement, and sneakily powerful.</p>
<h2>k-means</h2>
<p>Like k-nearest-neighbor, the &#8220;k&#8221; in k-means gives away that there&#8217;s going to be some number that we&#8217;re going to have to feed to our algorithm. Specifically, &#8220;k&#8221; is the number of clusters we&#8217;re going to find in our data. Unfortunately, it&#8217;s rarely possible to know the number of clusters before you solve the problem, so k-means is usually supplemented by another algorithm that first helps you find the best value of <em>k</em>.</p>
<p>The issue is this: the k-means algorithm will partition your data into &#8220;k&#8221; distinct clusters, but it does not tell you if that&#8217;s the <em>correct </em>number of clusters. Your data might naturally have 5 different clusters in it, but if you feed k-means the number 3 you&#8217;ll get 3 clusters back. Those clusters will be bigger, looser and more awkwardly shaped than if you had told it to find 5 clusters.</p>
<p>The long and short of it is this: in order to use k-means you either need to know how many clusters you&#8217;re looking for at the outset, <em>or </em>you have to use a second algorithm to also guess the number of clusters. K-means just organizes your points into clusters; you need to do something else to figure out the right number of clusters.</p>
<p>For today we&#8217;ll contrive a situation and use three clusters from the outset. Next time (in k-means part 2) we&#8217;ll look at a technique you can use to automatically guess the value of &#8220;k&#8221;. Most often, these algorithms rely on some kind of error analysis and multiple passes of the k-means algorithm in order to optimize for the solution with the smallest error value.</p>
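<p>As a rough idea of what that &#8220;error value&#8221; could look like, here&#8217;s a minimal sketch that adds up the squared distance from every point to the centroid it was assigned to. The function name <code>totalSquaredError</code>, the shape of the <code>assignments</code> array and the tiny dataset are all assumptions for illustration, not the technique from part 2.</p>

```javascript
// Sum of squared distances between each point and its assigned centroid.
// assignments[i] is the index (into centroids) of the centroid for points[i].
function totalSquaredError(points, centroids, assignments) {
	var total = 0;

	for (var i = 0; i < points.length; i++) {
		var centroid = centroids[assignments[i]];

		for (var d = 0; d < points[i].length; d++) {
			var difference = points[i][d] - centroid[d];
			total += difference * difference;
		}
	}

	return total;
}

// Hypothetical data: two points close together and one far away.
var points = [[1, 1], [3, 1], [11, 10]];

// k = 1: a single centroid at the average of all three points.
var errorOne = totalSquaredError(points, [[5, 4]], [0, 0, 0]);

// k = 2: one centroid for the close pair, one for the far point.
var errorTwo = totalSquaredError(points, [[2, 1], [11, 10]], [0, 0, 1]);

console.log(errorOne, errorTwo); // 110 2
```

<p>One caveat: this raw error generally keeps shrinking as you add clusters, so in practice you look for the value of <em>k</em> where the improvement levels off rather than the absolute smallest number.</p>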
<h2>The Procedure</h2>
<p>The k-means algorithm is simple but can become very powerful if you&#8217;re using it on a dataset with many dimensions. Today we&#8217;re going to work in 2 dimensions, but next time we&#8217;ll do something more complicated. Here&#8217;s what the algorithm looks like:</p>
<ol>
<li>Plot your data points</li>
<li>Create &#8220;k&#8221; <em>additional</em> points, placing them randomly on your graph. These points are the &#8220;cluster centroids&#8221; &#8212; or the candidates for the centers of your clusters.</li>
<li>Repeat the following:
<ol>
<li>&#8220;Assign&#8221; each data point to the cluster centroid closest to it</li>
<li>Move the centroid to the average position of all the data points that belong to it</li>
<li>If any of the centroids moved in the last step, repeat. If nothing moved, exit.</li>
</ol>
</li>
</ol>
<div>It&#8217;s that simple! As you can see, this is an iterative process. It may take 2 or 3 or dozens of iterations, but eventually your cluster centroids should converge to their solutions and stop moving. You then take the final tally of the assignments and then you have your clusters.</div>
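<p>To watch the loop converge on something tiny, here&#8217;s a minimal one-dimensional sketch of the steps above. The name <code>kMeans1D</code> is hypothetical, the points are plain numbers, and the starting centroids are fixed rather than random so the run is repeatable.</p>

```javascript
// One-dimensional k-means: points and centroids are plain numbers.
function kMeans1D(points, startingCentroids) {
	var centroids = startingCentroids.slice(); // leave the caller's array alone
	var moved = true;
	var iterations = 0;

	while (moved) {
		moved = false;
		iterations++;

		// Running totals used to compute each centroid's new average.
		var sums = [];
		var counts = [];
		for (var c = 0; c < centroids.length; c++) {
			sums[c] = 0;
			counts[c] = 0;
		}

		// Step 1: assign each point to the centroid closest to it.
		for (var i = 0; i < points.length; i++) {
			var best = 0;
			for (var j = 1; j < centroids.length; j++) {
				if (Math.abs(points[i] - centroids[j]) < Math.abs(points[i] - centroids[best])) {
					best = j;
				}
			}
			sums[best] += points[i];
			counts[best]++;
		}

		// Step 2: move each centroid to the average of its points.
		for (var k = 0; k < centroids.length; k++) {
			if (counts[k] === 0) {
				continue; // no points assigned: leave this centroid where it is
			}
			var average = sums[k] / counts[k];
			if (average !== centroids[k]) {
				centroids[k] = average;
				moved = true; // Step 3: something moved, so go around again
			}
		}
	}

	return { centroids: centroids, iterations: iterations };
}

var result = kMeans1D([1, 2, 3, 10, 11, 12], [0, 5]);
console.log(result.centroids, result.iterations); // [ 2, 11 ] 3
```

<p>The centroids go from 0 and 5, to 1.5 and 9, to 2 and 11. The third pass changes nothing, so the loop exits.</p>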
<h2>Committee of Machines</h2>
<p>This algorithm, like many of the ones we&#8217;ll play with in this series, is susceptible to local optima. If you run the example below a few times you&#8217;ll see that the clusters can end up in one of a few different configurations. These are various local optima that the solution gets stuck in. Algorithms that start with some sort of random seed state (like GAs or k-means) are particularly susceptible to local optima, because you never really know how the algorithm will start off and which path the solution will end up following. Will this seed state lead to a local or global optimum? There&#8217;s no way of knowing!</p>
<p>Like in genetic algorithms, one way to shake out of local optima is to give the solution a little bit of mutation. In our k-means example, we could add a rule that gives a centroid a nudge in a random direction if it doesn&#8217;t move after an iteration. It might settle back into its last resting place, or it may find a new solution. The nudge shouldn&#8217;t be big enough that it restarts the solution from the beginning, but just enough to kick a centroid out of a local valley, if it&#8217;s in one.</p>
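<p>A nudge like that could be sketched as follows. This is only an illustration: the name <code>nudgeCentroid</code> and the idea of sizing the nudge as a fraction of each dimension&#8217;s range are my assumptions, and the 5% used below is an arbitrary choice.</p>

```javascript
// Return a copy of the centroid, shifted by a small random amount in
// each dimension. `ranges` holds the size of each dimension, and
// `fraction` caps the nudge at that fraction of the range.
function nudgeCentroid(centroid, ranges, fraction) {
	var nudged = [];

	for (var d = 0; d < centroid.length; d++) {
		// Random offset between -fraction * range and +fraction * range.
		var offset = (Math.random() * 2 - 1) * fraction * ranges[d];
		nudged[d] = centroid[d] + offset;
	}

	return nudged;
}

// Hypothetical centroid in a dataset that spans 9 units in X and 10 in Y.
var before = [4, 6];
var after = nudgeCentroid(before, [9, 10], 0.05);

console.log(before, after);
```

<p>Keeping the fraction small is what stops the nudge from effectively restarting the solution.</p>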
<p>Another technique we could use is called the &#8220;committee of machines&#8221;, which works well if you&#8217;re running an algorithm that finishes pretty quickly, or if you have parallel computing capabilities. It&#8217;s simple: we run the k-means algorithm 3 or 5 or 51 or 10,000 times, and choose the solution that it returned the most often. The term &#8220;committee of machines&#8221; alludes to the fact that some people choose to actually run parallel algorithms on different pieces of hardware, and a literal committee of machines votes on the solution.</p>
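<p>Here&#8217;s a minimal sketch of the voting part. Everything in it is hypothetical: <code>committee</code> takes any function that runs k-means once and returns a list of centroids, and <code>solutionKey</code> rounds and sorts those centroids so that the same solution listed in a different order still counts as the same vote. The canned results stand in for real k-means runs so the example is repeatable.</p>

```javascript
// Build a string key for a solution that ignores the order the
// centroids are listed in and rounds away tiny differences.
function solutionKey(centroids, decimals) {
	var parts = [];

	for (var i = 0; i < centroids.length; i++) {
		var coords = [];
		for (var d = 0; d < centroids[i].length; d++) {
			coords.push(centroids[i][d].toFixed(decimals));
		}
		parts.push(coords.join(','));
	}

	return parts.sort().join('|');
}

// Call runOnce() `runs` times and return the most common solution.
function committee(runOnce, runs) {
	var votes = {};
	var solutions = {};
	var winner = null;

	for (var r = 0; r < runs; r++) {
		var solution = runOnce();
		var key = solutionKey(solution, 2);

		votes[key] = (votes[key] || 0) + 1;
		solutions[key] = solution;

		if (winner === null || votes[key] > votes[winner]) {
			winner = key;
		}
	}

	return { centroids: solutions[winner], votes: votes[winner] };
}

// Canned results standing in for real k-means runs.
var canned = [
	[[2, 2], [8, 3], [8, 9]],
	[[8, 9], [2, 2], [8, 3]],    // the same solution in a different order
	[[1.5, 2], [5, 6], [8.5, 9]] // pretend this one is a local optimum
];
var calls = 0;
var fakeRun = function () {
	return canned[calls++ % canned.length];
};

var result = committee(fakeRun, 6);
console.log(result.votes, result.centroids); // 4 votes for the first solution
```

<p>In real use you&#8217;d pass in a function that seeds random centroids and runs the full k-means loop each time it&#8217;s called.</p>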
<h2>The Code</h2>
<p>Let&#8217;s dive in. Unlike my other examples thus far, I&#8217;m going to forgo an object-oriented implementation and just go straight procedural. There are many ways to skin a cat. I love OOP but it&#8217;s important not to get too comfortable in habits!</p>
<p>Additionally, while we&#8217;re only working with 2-dimensional data in this example, I&#8217;d like to write this algorithm out to handle any number of dimensions (except for the canvas drawing functions).</p>
<p>Let&#8217;s take a look at the data we&#8217;re using &#8212; a simple array of &#8220;points&#8221;, which are just represented by 2-element arrays (for X and Y values):</p>
<pre class="code">
var data = [
	[1, 2],
	[2, 1],
	[2, 4], 
	[1, 3],
	[2, 2],
	[3, 1],
	[1, 1],

	[7, 3],
	[8, 2],
	[6, 4],
	[7, 4],
	[8, 1],
	[9, 2],

	[10, 8],
	[9, 10],
	[7, 8],
	[7, 9],
	[8, 11],
	[9, 9],
];
</pre>
<p>Next, we define two functions that are helpful to us, but not essential. Given a list of points, I&#8217;d like to know what the max and min values for each dimension are, and what the range of each dimension is. I want to know &#8220;X ranges from 1 to 11, and Y ranges from 3 to 7&#8221;. Knowing these figures helps us draw the graph on the canvas, and also helps when we initialize our random cluster centers (we&#8217;d like them to be within the range of the data points when we start them out).</p>
<p>Keeping in mind that we&#8217;re writing these functions to be generic with regard to the number of dimensions they can handle: </p>
<pre class="code">
function getDataRanges(extremes) {
	var ranges = [];

	for (var dimension in extremes)
	{
		ranges[dimension] = extremes[dimension].max - extremes[dimension].min;
	}

	return ranges;

}

function getDataExtremes(points) {

	var extremes = [];

	// Loop over the points that were passed in, not the global data array
	for (var i in points)
	{
		var point = points[i];

		for (var dimension in point)
		{
			// First time we see this dimension: seed its min and max
			if ( ! extremes[dimension] )
			{
				extremes[dimension] = {min: 1000, max: 0};
			}

			if (point[dimension] < extremes[dimension].min)
			{
				extremes[dimension].min = point[dimension];
			}

			if (point[dimension] > extremes[dimension].max)
			{
				extremes[dimension].max = point[dimension];
			}
		}
	}

	return extremes;
}
</pre>
<p>The <code>getDataExtremes()</code> function loops through all the points and each dimension in each point and finds the min and max values. Note the hard-coded starting values in there: the minimum starts at &#8220;1000&#8221; and the maximum at &#8220;0&#8221;, so you should change them if your data contains values above 1000 or below 0. The <code>getDataRanges()</code> function is just a helper that takes that output and returns the range of each dimension (the maximum value minus the minimum value).</p>
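<p>If you&#8217;d rather not pick a magic number at all, one option is to seed the search with <code>Infinity</code> and <code>-Infinity</code>. The sketch below is a variation on the function above, not the code used in the demo; the name <code>getDataExtremesUnbounded()</code> and the <code>sample</code> array are made up for illustration:</p>
<pre class="code">
// Hypothetical variant of getDataExtremes() with no hard-coded limits.
function getDataExtremesUnbounded(points) {
	var extremes = [];

	points.forEach(function (point) {
		point.forEach(function (value, dimension) {
			if ( ! extremes[dimension] )
			{
				// Any real value is smaller than Infinity and larger than -Infinity
				extremes[dimension] = {min: Infinity, max: -Infinity};
			}

			extremes[dimension].min = Math.min(extremes[dimension].min, value);
			extremes[dimension].max = Math.max(extremes[dimension].max, value);
		});
	});

	return extremes;
}

// Works for negative and very large values alike
var sample = [[-5, 2000], [10, 8], [8, 11]];
var sampleExtremes = getDataExtremesUnbounded(sample);
// sampleExtremes[0] is {min: -5, max: 10}; sampleExtremes[1] is {min: 8, max: 2000}
</pre>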
<p>Next up, we define a function that initializes <em>k</em> random cluster centroids:</p>
<pre class="code">
function initMeans(k) {

	if ( ! k )
	{
		k = 3;
	}

	while (k--)
	{
		var mean = [];

		for (var dimension in dataExtremes)
		{
			mean[dimension] = dataExtremes[dimension].min + ( Math.random() * dataRange[dimension] );
		}

		means.push(mean);
	}

	return means;

};
</pre>
<p>We&#8217;re just creating new points with random coordinates within the range and dimensions of our dataset. </p>
<p>Once we have our randomly seeded centroids, we need to enter our k-means loop. As a reminder, the loop consists of first assigning each data point to the centroid closest to it, then moving each centroid to the average position of all the data points assigned to it. We repeat that until the centroids stop moving. </p>
<pre class="code">
function makeAssignments() {

	for (var i in data)
	{
		var point = data[i];
		var distances = [];

		for (var j in means)
		{
			var mean = means[j];
			var sum = 0;

			for (var dimension in point)
			{
				var difference = point[dimension] - mean[dimension];
				difference *= difference;
				sum += difference;
			}

			distances[j] = Math.sqrt(sum);
		}

		assignments[i] = distances.indexOf( Math.min.apply(null, distances) );
	}

}
</pre>
<p>The above function is called by our &#8220;loop&#8221; function. It calculates the <a href="http://en.wikipedia.org/wiki/Euclidean_distance" target="_blank">Euclidean distance</a> between each point and every cluster center, and assigns the point to the nearest one. </p>
<p>Note that the above algorithm loops through each point and then loops through each cluster centroid, making this an O(k*n) algorithm. It&#8217;s not terrible, but it might be computationally intensive if you have a large number of data points or a large number of clusters or both. There are ways you can optimize this, which we&#8217;ll perhaps discuss in a future article. For one, we can try to eliminate the expensive <code>Math.sqrt()</code> call; we could also try not to iterate through every point. </p>
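<p>As a rough sketch of the first idea: the square root is monotonic, so comparing squared distances picks out the same nearest centroid as comparing true distances, and the <code>Math.sqrt()</code> call can simply be dropped. The helper names below (<code>squaredDistance()</code> and <code>nearestMeanIndex()</code>) are hypothetical and aren&#8217;t part of the demo code:</p>
<pre class="code">
// Sum of squared differences; no square root needed for comparisons
function squaredDistance(a, b) {
	var sum = 0;

	for (var d = 0; d < a.length; d++)
	{
		var difference = a[d] - b[d];
		sum += difference * difference;
	}

	return sum;
}

// Index of the mean closest to the given point
function nearestMeanIndex(point, means) {
	var best = 0;
	var bestDistance = Infinity;

	for (var j = 0; j < means.length; j++)
	{
		var distance = squaredDistance(point, means[j]);

		if (distance < bestDistance)
		{
			bestDistance = distance;
			best = j;
		}
	}

	return best;
}

nearestMeanIndex([1, 1], [[5, 5], [0, 0], [9, 9]]); // 1
</pre>
<p>This also avoids building a <code>distances</code> array for every point, since the best candidate is tracked as we go.</p>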
<p>Once we have our list of assignments &#8212; in this case, just an associative array of <code>point index => center index</code> &#8212; we can go ahead and update the positions of the means (the cluster centers).</p>
<pre class="code">
function moveMeans() {

	makeAssignments();

	var sums = Array( means.length );
	var counts = Array( means.length );
	var moved = false;

	for (var j in means)
	{
		counts[j] = 0;
		sums[j] = Array( means[j].length );
		for (var dimension in means[j])
		{
			sums[j][dimension] = 0;
		}
	}

	for (var point_index in assignments)
	{
		var mean_index = assignments[point_index];
		var point = data[point_index];
		var mean = means[mean_index];

		counts[mean_index]++;

		for (var dimension in mean)
		{
			sums[mean_index][dimension] += point[dimension];
		}
	}

	for (var mean_index in sums)
	{
		if ( 0 === counts[mean_index] ) 
		{
			// No points were assigned to this mean, so give it a fresh random
			// position. Use a new array here: reusing means[mean_index] would
			// change the old mean too and hide the move from the check below.
			sums[mean_index] = [];

			for (var dimension in dataExtremes)
			{
				sums[mean_index][dimension] = dataExtremes[dimension].min + ( Math.random() * dataRange[dimension] );
			}
			continue;
		}

		for (var dimension in sums[mean_index])
		{
			sums[mean_index][dimension] /= counts[mean_index];
		}
	}

	if (means.toString() !== sums.toString())
	{
		moved = true;
	}

	means = sums;

	return moved;

}
</pre>
<p>The <code>moveMeans()</code> function starts by calling the <code>makeAssignments()</code> function. Once we have our assignments, we initialize two arrays: one called &#8220;sums&#8221; and the other called &#8220;counts&#8221;. Since we&#8217;re calculating the arithmetic mean (or average), we&#8217;ll need to know the sum of points&#8217; dimensions as well as the number of points whose dimensions we&#8217;re averaging. </p>
<p>We then hit three loops: </p>
<p>First we loop through our means and prepare our <em>sums</em> and <em>counts</em> arrays. Our <em>sums</em> array will actually be multidimensional, because we&#8217;re storing each dimension&#8217;s sum of each point of each mean in this structure &#8212; so we have to zero-out the second-depth level of this two-dimensional array. </p>
<p>Then we loop through our assignments and increment the <em>counts</em> counter for each cluster center we have points assigned to, and additionally loop through the point&#8217;s dimensions to fill in the <em>sums</em> array. At this point we have all the data we need to calculate the new positions of the cluster centers.</p>
<p>The final loop loops through our results, calculates the mean position for each cluster center, and moves it. The final loop also checks to see if a cluster center had <em>no</em> points assigned to it. If it didn&#8217;t have any points assigned to it, we give it a new random position. This is just us trying to kick that cluster center back into the solution.</p>
<p>Finally, we wrap up this function by checking to see if any one of our cluster centers has moved &#8212; and we return either true or false.</p>
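<p>The string comparison above treats even the tiniest floating-point change as movement. An alternative, sketched here under the assumption that a small tolerance is acceptable for your data, is to compare the coordinates numerically. <code>meansMoved()</code> and its <code>tolerance</code> parameter are hypothetical and not part of the demo:</p>
<pre class="code">
// Returns true if any coordinate of any mean changed by more than `tolerance`
function meansMoved(oldMeans, newMeans, tolerance) {

	for (var j = 0; j < oldMeans.length; j++)
	{
		for (var d = 0; d < oldMeans[j].length; d++)
		{
			if (Math.abs(oldMeans[j][d] - newMeans[j][d]) > tolerance)
			{
				return true;
			}
		}
	}

	return false;
}

var before = [[1, 1], [5, 5]];

meansMoved(before, [[1, 1.0000001], [5, 5]], 0.001); // false: change is below the tolerance
meansMoved(before, [[1, 2], [5, 5]], 0.001);         // true
</pre>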
<p>To get this algorithm started, we run the following setup functions:</p>
<pre class="code">
function setup() {

	canvas = document.getElementById('canvas');
	ctx = canvas.getContext('2d');

	dataExtremes = getDataExtremes(data);
	dataRange = getDataRanges(dataExtremes);
	means = initMeans(3);

	makeAssignments();
	draw();

	setTimeout(run, drawDelay);
}

function run() {

	var moved = moveMeans();
	draw();

	if (moved)
	{
		setTimeout(run, drawDelay);
	}

}
</pre>
<p><code>setup()</code> initializes everything we need, and then our <code>run()</code> function checks to see if the algorithm has stopped, and loops based on a timer so that we can watch the algorithm do its work in a reasonable timeframe. </p>
<h2>k-medians</h2>
<p>One major issue with the k-means algorithm isn&#8217;t a fault of the algorithm&#8217;s, but rather of the concept of the arithmetic mean, or average. The average is a pretty bad metric when you have outlying data. </p>
<p>If you work at a company where 5 people make $50,000 a year but one person makes $1,000,000, the <em>median</em> salary is $50,000 (very representative of salary at that company), but the <em>mean</em> salary is about $208,000 (not at all representative of salary at that company)!</p>
<p>This happens with all sorts of data, and can and will happen in the k-means algorithm too. If you have a dataset prone to outliers, you&#8217;ll find that k-means gets &#8220;stuck&#8221; on the outlier and ends up yielding poor results. In that case, switch to k-medians! The algorithm is nearly the same; instead of calculating the mean for your cluster centers, use the median of each dimension instead. Don&#8217;t expect a speed-up, though: finding a median generally means sorting the values (or running a selection algorithm), while the mean only needs a single pass to add them up. </p>
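<p>To make the difference concrete, here is a small sketch using the salary example. <code>mean()</code>, <code>median()</code> and <code>medianPoint()</code> are hypothetical helpers rather than code from the demo, and taking the median of each dimension separately is just one common way to define a cluster&#8217;s median center:</p>
<pre class="code">
function mean(values) {
	var sum = 0;

	for (var i = 0; i < values.length; i++)
	{
		sum += values[i];
	}

	return sum / values.length;
}

function median(values) {
	// Sort a copy numerically (the default sort compares values as strings)
	var sorted = values.slice().sort(function (a, b) { return a - b; });
	var middle = Math.floor(sorted.length / 2);

	if (sorted.length % 2)
	{
		return sorted[middle];
	}

	// Even number of values: average the two middle ones
	return (sorted[middle - 1] + sorted[middle]) / 2;
}

// Median center of a cluster: the median of each dimension, taken separately
function medianPoint(points) {
	return points[0].map(function (unused, dimension) {
		return median(points.map(function (point) { return point[dimension]; }));
	});
}

var salaries = [50000, 50000, 50000, 50000, 50000, 1000000];

mean(salaries);   // roughly 208333
median(salaries); // 50000

// The outlier at [100, 100] barely affects the median center
medianPoint([[1, 2], [2, 1], [2, 2], [100, 100]]); // [2, 2]
</pre>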
<h2>Results</h2>
<p>As you can see from the example below, k-means works very well for our nice, neat data. Obviously it&#8217;ll have more difficulty with messy data, like any other algorithm.</p>
<p>If you run the example below a number of times (click the play button on the JSFiddle) you&#8217;ll eventually see it fall into a local optimum. This should also demonstrate the usefulness of the &#8220;committee of machines&#8221; solving method: while a bad solution does pop up from time to time, it should be clear that a committee of machines will produce the correct solution reliably.</p>
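<p>A committee can be sketched in a few lines. The version below assumes each run hands back an assignments array (point index to cluster index) and votes on exact matches; <code>canonicalize()</code>, <code>committee()</code> and the stand-in <code>fakeSolve()</code> are hypothetical names, and a real solver would take the place of the canned results:</p>
<pre class="code">
// Relabel clusters in order of first appearance, so that two runs which found
// the same grouping but numbered the clusters differently compare as equal.
function canonicalize(assignments) {
	var relabel = {};
	var next = 0;

	return assignments.map(function (label) {
		if ( ! (label in relabel) )
		{
			relabel[label] = next++;
		}
		return relabel[label];
	});
}

// Call `solve` a number of times and return the grouping seen most often.
// On a tie, the grouping that reached the winning count first is kept.
function committee(solve, runs) {
	var votes = {};
	var best = null;
	var bestCount = 0;

	for (var i = 0; i < runs; i++)
	{
		var result = canonicalize(solve());
		var key = result.join(',');

		votes[key] = (votes[key] || 0) + 1;

		if (votes[key] > bestCount)
		{
			bestCount = votes[key];
			best = result;
		}
	}

	return best;
}

// Stand-in solver that replays three canned results
var canned = [[1, 1, 0], [0, 1, 1], [0, 0, 1]];
var call = 0;
function fakeSolve() {
	return canned[call++ % canned.length];
}

var winner = committee(fakeSolve, 3);
// winner is [0, 0, 1]: the first and third runs found the same grouping
</pre>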
<p>Finally, if you like this series, please sign up for the mailing list below and tell your friends! I also appreciate discussion, so feel free to use the commenting tool below. And make sure to check out the other <a href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/" title="Modeling Physics in Javascript: Introduction" target="_blank">ML in JS articles</a>!</p>
<p><iframe id="fiddle" style="width: 100%; height: 500px; border:1px solid #ccc;" src="http://jsfiddle.net/bkanber/8NpNp/embedded/result,js,html,css/" allowfullscreen="allowfullscreen" frameborder="0"></iframe></p>
<p><!-- Begin MailChimp Signup Form --></p>
<link href="http://cdn-images.mailchimp.com/embedcode/classic-081711.css" rel="stylesheet" type="text/css">
<div id="mc_embed_signup">
<form action="http://burakkanber.us5.list-manage.com/subscribe/post?u=3269dc566d381f05852d92aa8&amp;id=3f5451bda5" method="post" id="mc-embedded-subscribe-form" name="mc-embedded-subscribe-form" class="validate" target="_blank" novalidate>
<h2>Email me when new ML in JS articles are posted</h2>
<div class="mc-field-group">
	<label for="mce-EMAIL">Email Address </label><input type="email" value="" name="EMAIL" class="required email" id="mce-EMAIL">
</div>
<div class="mc-field-group input-group">
    <strong>Survey: Would you buy an ML in JS e-book? </strong><br />
    <input type="checkbox" value="1" name="group[6613][1]" id="mce-group[6613]-6613-0"><label for="mce-group[6613]-6613-0">I would pay $10 for a DRM-free e-book with tons of ML lessons and JS examples.</label>
</div>
<div id="mce-responses" class="clear">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
<div class="clear"><input type="submit" value="Subscribe" name="subscribe" id="mc-embedded-subscribe" class="button"></div>
</form>
</div>
<p><!--End mc_embed_signup--></p>
<p>The post <a href="http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/">Machine Learning: k-Means Clustering Algorithm in Javascript</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Physics in Javascript: Rigid Bodies &#8212; Part 1 (Pendulum Clock)</title>
		<link>http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/</link>
		<comments>http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/#comments</comments>
		<pubDate>Wed, 03 Oct 2012 05:00:32 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Modeling Physics]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=261</guid>
		<description><![CDATA[<p>So far, we’ve modeled everything in this series as particles. Particles are the simplest way to model something physically, because they only have a location in space and don't have a size or shape. Particles can only translate (move), and can’t rotate.

And while you can do some amazing things just by modeling particles, particles aren't realistic in the real world. Real objects aren't just infinitely small dots. Real objects have a size and shape and most importantly, an orientation. Real objects can rotate...</p><p>The post <a href="http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/">Physics in Javascript: Rigid Bodies &#8212; Part 1 (Pendulum Clock)</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em>This article is part of the <a title="Modeling Physics in Javascript: Introduction" href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/" target="_blank">Physics in Javascript</a> series. If you feel lost reading this article, you may want to read the <a title="Modeling Physics in Javascript: Gravity and Drag" href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/" target="_blank">Gravity and Drag</a> and <a title="Physics in Javascript: Car Suspension – Part 1 (Spring, Mass, Damper)" href="http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/" target="_blank">Spring-Mass-Damper</a> articles. </em></p>
<p>Just looking for the <a href="#jsfiddle">cool JSFiddle code example</a>?</p>
<p>So far, we’ve modeled everything in this series as particles. Particles are the simplest way to model something physically, because they only have a location in space and don&#8217;t have a size or shape. Particles can only translate (move), and can’t rotate.</p>
<p>And while you can do some amazing things just by modeling particles, particles aren&#8217;t realistic in the real world. Real objects aren&#8217;t just infinitely small dots. Real objects have a size and shape and most importantly, an orientation. Real objects can rotate. And since most of you are probably interested in game development: you can&#8217;t get by on particles alone.</p>
<p>Today we&#8217;re going to begin our discussion of rigid bodies. Rigid bodies have all of the properties of particles: they have a location, they have a linear velocity, and they have a linear acceleration. But because rigid bodies have a shape and size, they also have an orientation (a rotation angle), and are subject not just to forces but also to torques which cause them to rotate. Torques can either be applied to an object directly (in the case of a wheel on a motor, like in a car), or they can appear because a force is applied to an object off-center (like when a football is kicked and tumbles end-over-end).</p>
<p>While rigid bodies are more complicated to model than particles, we’re still able to use familiar concepts and Newton’s laws. We just need to extend them a little bit to deal with rotation and torque.</p>
<h2 dir="ltr">Translation and Rotation</h2>
<p>Let&#8217;s start by comparing linear motion (translation) to angular motion (rotation). Translation has properties that we’re pretty familiar with by now: position, velocity, acceleration, mass, and force. Rotation is similar, but the variables have been modified slightly to: angle, angular velocity, angular acceleration, moment of inertia, and torque. Let’s introduce these concepts very briefly.</p>
<h3 dir="ltr">Angle</h3>
<p>Analogous to “position” for translation, the angle describes the current orientation of a rigid body. If you’re working in two dimensions &#8212; as we are here &#8212; you only need one angle to fully describe an object’s rotation. Typically, angle is denoted by the Greek letter theta: θ. In three dimensions, three angles are required to describe the object’s orientation, and in that case you should use <a href="http://en.wikipedia.org/wiki/Euler_angles" target="_blank">Euler angles</a>. Angle is a dimensionless quantity typically measured in radians or degrees. This is in contrast to the “position” in linear motion, whose dimension is “length”.</p>
<p>Concretely: the angle θ is the number of radians or degrees something is rotated by.</p>
<h3 dir="ltr">Angular Velocity and Angular Acceleration</h3>
<p>Like linear velocity, angular velocity describes how much the object’s angle changes in a given time-frame. While “degrees per hour” is an acceptable unit, angular velocity is almost always given in “radians per second”. Since radians are dimensionless, angular velocity has a dimension of inverse time. Typically, the Greek letter omega is used for angular velocity: ω.</p>
<p>Similarly, angular acceleration is given in “radians per second squared” or “radians per second, per second”, and is typically denoted by the Greek letter alpha: α.</p>
<h3 dir="ltr">Torque</h3>
<p>Force, in the linear world, is directly responsible for an object’s acceleration. Rotation is no different; in the angular world, torque is responsible for an object&#8217;s angular acceleration.</p>
<p>What’s more, a torque is just a force applied in a special manner: torque is what you get when you apply a force slightly offset from an object’s center of mass. If you were to lay a block on a table and push on it dead-center with your finger, you’d push the block forward without rotating it. If, however, your finger were off to the side of the block, you’d end up both pushing and rotating the block. It’s still the same block, it’s still the same finger. The only thing that has changed is where you applied the force. Since you applied it in such a way that it didn’t go “through” the block’s center of mass, you created a torque and that’s what caused the block to rotate.</p>
<p>A torque, therefore, is just a force multiplied by its (perpendicular) distance from the object’s center of mass (or point of rotation, depending on your physical set-up). Think about using a wrench: the longer the wrench is, the easier it is to turn. That’s because you’re multiplying the force by a larger distance and getting a larger torque as a result (of course, you have to move your hand farther to get the same angle of motion with a longer wrench &#8212; so you end up doing the same amount of work!).</p>
<p>Torque is measured in units of force times distance: newton-meters or pound-feet, and is typically denoted by the Greek letter tau: τ.</p>
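<p>In two dimensions, &#8220;force times perpendicular distance&#8221; can be computed as the z-component of the cross product of <strong>r</strong> and <strong>F</strong>, where <strong>r</strong> runs from the center of mass (or pivot) to the point where the force is applied. Here is a minimal sketch; the <code>torque2D()</code> helper and the plain <code>{x, y}</code> objects are assumptions for illustration. With standard math axes a positive result is counter-clockwise; on a canvas, where y points down, it appears clockwise.</p>
<pre class="code">
// Torque about the origin from force F applied at position r (both {x, y})
function torque2D(r, F) {
	return r.x * F.y - r.y * F.x;
}

torque2D({x: 0, y: 0}, {x: 0, y: 10});   // 0: pushing straight through the center
torque2D({x: 0.5, y: 0}, {x: 0, y: 10}); // 5: same force, applied off-center
torque2D({x: 0.5, y: 0}, {x: 10, y: 0}); // 0: force points along the offset
</pre>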
<h3 dir="ltr">Moment of Inertia</h3>
<p>The toughest piece of the rotational puzzle to calculate from scratch is the moment of inertia. In linear motion, mass (inertia) is a sort of resistance to changing the object’s motion. A light object is easier to speed up or slow down than a heavy object. Similarly, moment of inertia is a resistance to a change in rotation. But moment of inertia is more complicated than mass, because both the object’s mass and its shape and size need to be considered. A 5 pound hoop is harder to get spinning than a 5 pound disc of the same size, because the hoop has all its mass near the edge.</p>
<p>The moment of inertia is given in units of kg * m<sup>2</sup> (kilogram times meters squared) and is typically denoted by the capital letter I or J. I like J.</p>
<p>Fortunately, you can just <a href="http://en.wikipedia.org/wiki/List_of_moments_of_inertia" target="_blank">look up the moments of inertia of many common shapes</a> and use those in your calculations. In school we were forced to do those calculations by hand with double or triple integrals. If you’re interested in practicing your calculus, go that route. If not, just use the lookup table.</p>
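<p>As a quick illustration of using looked-up formulas, the sketch below compares a thin hoop with a solid disc of the same mass and radius, both spinning about their central axis. The <code>momentOfInertia</code> object is a hypothetical helper, and the numbers are arbitrary:</p>
<pre class="code">
// Standard results for rotation about the central axis
var momentOfInertia = {
	pointMass: function (m, r) { return m * r * r; },       // all mass at distance r
	hoop:      function (m, r) { return m * r * r; },       // thin ring of radius r
	disc:      function (m, r) { return 0.5 * m * r * r; }  // solid disc of radius r
};

var torque = 1; // newton-meters

// Same torque, same mass (2 kg), same radius (0.5 m)
var hoopAlpha = torque / momentOfInertia.hoop(2, 0.5); // 2 rad/s^2
var discAlpha = torque / momentOfInertia.disc(2, 0.5); // 4 rad/s^2
</pre>
<p>The disc picks up angular speed twice as fast as the hoop, which is the &#8220;hoop is harder to get spinning&#8221; effect in numbers.</p>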
<h3 dir="ltr">Equation of Motion and a Pendulum Clock</h3>
<p>Newton’s 2nd Law is <strong>F<sub>net</sub></strong> = m*<strong>a</strong>. Modified for rotation, we can write it as <strong>τ<sub>net</sub></strong> = J*<strong>α</strong>. Notice that, like the linear form of the equation, both τ and α are vectors. Since our example is in 2D &#8212; which means all our rotation happens around the &#8220;z&#8221; axis &#8212; we can just assume these are scalars for now.</p>
<p>Since you’ve certainly read the <a title="Modeling Physics in Javascript: Gravity and Drag" href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/" target="_blank">Gravity and Drag</a> and <a title="Physics in Javascript: Car Suspension – Part 1 (Spring, Mass, Damper)" href="http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/" target="_blank">Spring-Mass-Damper</a> articles, you probably already know what to do with this equation. Figure out the total torque, use that to calculate the acceleration, and then integrate twice to get the angle. The trickiest part is the &#8220;figure out the net torque&#8221; step, especially if you&#8217;re uncomfortable with trigonometry. Let’s dive into an example.</p>
<blockquote><p>I should now point out that &#8212; just like with linear motion &#8212; you can introduce springs and dampers into the mix. The equations are essentially the same as their linear counterparts. If you want a homework assignment, please try adding a damper to this pendulum!</p></blockquote>
<h2>Pendulum Clock</h2>
<p>I want to build a simple and elegant animated pendulum clock. Rather than just &#8220;fudging&#8221; the motion of the pendulum by using sine easing, I want to solve for the physics directly. (Though physics students will recognize that sine actually <em>is</em> the solution to this problem!)</p>
<p>The pendulum itself will be modeled as a point mass at the end of a weightless rod. Many physics problems are made with such assumptions: point mass, weightless rod, no friction, etc. It&#8217;s a good way to make a first pass at what could become a very complicated problem.</p>
<p>Diving right into some code:</p>
<pre class="code">var pendulum = {mass: 1, length:500, theta: (Math.PI/2) - 0.05, omega: 0, alpha:0, J:0};
var setup = function() {
	pendulum.J = pendulum.mass * pendulum.length * pendulum.length;
...
}</pre>
<p>Right off the bat we set up initial conditions for our pendulum. In this case, we&#8217;ve defined θ = 0 to be when the pendulum is pointing horizontally to the right. Many people would choose to define θ = 0 as downwards. It doesn&#8217;t really matter, as long as you&#8217;re consistent.</p>
<p>Then we calculate the value of the pendulum&#8217;s moment of inertia. Our <a href="http://en.wikipedia.org/wiki/List_of_moments_of_inertia" target="_blank">Wikipedia list of moments of inertia</a> tells us that a &#8220;point mass m at distance r from the axis of rotation&#8221; has a moment of inertia of m*r<sup>2</sup>, so that&#8217;s what we use.</p>
<h2>Velocity Verlet</h2>
<p>I&#8217;ve done a few things differently in this code example. First of all, I&#8217;ve used <code>requestAnimFrame()</code> instead of <code>setInterval()</code> (see the JSFiddle for the full code; I won&#8217;t talk about it here). Secondly, instead of using Euler&#8217;s method of numerical integration, I&#8217;ve used the <a href="http://en.wikipedia.org/wiki/Verlet_integration#Velocity_Verlet" target="_blank">velocity Verlet method</a>.</p>
<p>Euler&#8217;s method is actually pretty inaccurate as far as numerical integration methods go. I&#8217;ve used it up until now because it&#8217;s very simple to understand, but it&#8217;s time to move onwards and upwards. You&#8217;re ready for it!</p>
<div class="wp-caption alignnone" style="width: 430px"><a href="http://en.wikipedia.org/w/index.php?title=File:RightRiemann2.svg&amp;page=1"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/45/RightRiemann2.svg/600px-RightRiemann2.svg.png" alt="File:RightRiemann2.svg" width="420" height="336" /></a><p class="wp-caption-text">Source: Wikipedia</p></div>
<p>The above graphic illustrates where the inaccuracy in Euler&#8217;s method comes from. Numerical integration is all about calculating the area below a function&#8217;s curve. Most basic numerical integration techniques draw rectangles underneath the curve and add up their areas (see <a href="http://en.wikipedia.org/wiki/Riemann_sum" target="_blank">Riemann sum</a>); this is what we&#8217;re doing with Euler&#8217;s method. Unfortunately, this can grossly over- or under-estimate the area, especially if the function changes quickly. Making those rectangles thinner (in our case, making the loop delay smaller/faster) can help greatly, but there&#8217;s still the fact that we&#8217;re putting the corner of the rectangle on the curve.</p>
<p>One other integration technique improves accuracy by having the function run through the midpoint of the rectangle, instead of the left or right corner; this is called the <a href="http://en.wikipedia.org/wiki/Midpoint_method" target="_blank">midpoint method</a>. There&#8217;s also the <a href="http://en.wikipedia.org/wiki/Trapezoidal_rule" target="_blank">trapezoidal method</a>, which draws trapezoids instead of rectangles. There&#8217;s also the very accurate <a href="http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods" target="_blank">Runge-Kutta method</a>, which uses a weighted average of 4 points along the function to calculate the area below the curve.</p>
<p>Here&#8217;s a nice demonstration of how much better the midpoint method (in green) works compared to Euler&#8217;s method (blue).</p>
<div class="wp-caption alignnone" style="width: 422px"><a href="http://en.wikipedia.org/wiki/File:Numerical_integration_illustration,_h%3D1.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/b5/Numerical_integration_illustration%2C_h%3D1.png/412px-Numerical_integration_illustration%2C_h%3D1.png" alt="File:Numerical integration illustration, h=1.png" width="412" height="600" /></a><p class="wp-caption-text">Source: Wikipedia</p></div>
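<p>The same comparison can be sketched numerically. The test problem here (dy/dt = y, starting at y = 1) is my choice for illustration because its exact value at t = 1 is known to be e; <code>eulerStep()</code>, <code>midpointStep()</code> and <code>integrate()</code> are hypothetical helper names:</p>
<pre class="code">
// One step of each method for dy/dt = f(t, y), with step size h
function eulerStep(f, t, y, h) {
	return y + h * f(t, y);
}

function midpointStep(f, t, y, h) {
	var halfway = y + (h / 2) * f(t, y);   // Euler estimate at the middle of the step
	return y + h * f(t + h / 2, halfway);  // take the full step with the slope found there
}

function integrate(step, f, y0, t0, t1, steps) {
	var h = (t1 - t0) / steps;
	var y = y0;

	for (var i = 0; i < steps; i++)
	{
		y = step(f, t0 + i * h, y, h);
	}

	return y;
}

// dy/dt = y with y(0) = 1; the exact solution at t = 1 is e
var growth = function (t, y) { return y; };

var eulerError = Math.abs(integrate(eulerStep, growth, 1, 0, 1, 10) - Math.E);
var midpointError = Math.abs(integrate(midpointStep, growth, 1, 0, 1, 10) - Math.E);
// eulerError is about 0.12, midpointError about 0.004
</pre>
<p>With the same ten steps, the midpoint method lands far closer to the exact answer, at the cost of evaluating the function twice per step.</p>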
<p>To make a long story short, we&#8217;ll use the velocity Verlet method today. I encourage you to look up the details and try to understand what gives it an advantage over Euler&#8217;s method!</p>
<h2>The Loop</h2>
<p>Let&#8217;s get back to code. In the loop (boring stuff omitted):</p>
<pre class="code">var loop = function() {
...
	/* Velocity Verlet */
	/* Calculate current position from last frame's position, velocity, and acceleration */
	pendulum.theta += pendulum.omega * deltaT + ( 0.5 * pendulum.alpha * deltaT * deltaT );

	/* Calculate forces from current position. */
	var T = pendulum.mass * 9.81 * Math.cos(pendulum.theta) * pendulum.length;

	/* Current acceleration */
	var alpha = T / pendulum.J;

	/* Calculate current velocity from last frame's velocity and 
		average of last frame's acceleration with this frame's acceleration. */
	pendulum.omega += 0.5 * (alpha + pendulum.alpha) * deltaT;

	/* Update acceleration */
	pendulum.alpha = alpha;

	var px = width/2 + pendulum.length*Math.cos(pendulum.theta);
	var py = 50 + pendulum.length*Math.sin(pendulum.theta);
...
};</pre>
<p>The velocity Verlet method has us do things &#8220;out of order&#8221;, so to speak. I&#8217;ll describe the steps in the order I feel is natural, however, starting with torque:</p>
<p>First we calculate the torque on the pendulum. Since gravity pulls downwards at all times, we need to use some trigonometry to get the component of the pendulum&#8217;s weight that pulls the pendulum back towards the center: the portion of the weight that points perpendicular to the pendulum itself. Because we set up the θ = 0 point on the right horizontal, we get this by calculating m * g * cos(θ), where m is the pendulum&#8217;s mass and g is 9.81 m/s<sup>2</sup>, the acceleration due to gravity on Earth.</p>
<p>Since we&#8217;re looking for a torque, we have to multiply that perpendicular force by its distance from the center.</p>
<p>Once we have the torque the next few steps are easy. We calculate the angular acceleration α, and then we implement a piece of velocity Verlet to get the angular velocity ω: we use the average of last frame&#8217;s and this frame&#8217;s acceleration in the calculation.</p>
<p>Finally, we calculate the angle θ. Notice this time I kept the 0.5 * α * Δt<sup>2</sup> in the equation. Up until now I&#8217;ve left that term out &#8212; Δt is pretty small, and squaring it makes it so small it&#8217;s almost negligible. But let&#8217;s leave it in here now just as a demonstration!</p>
<p>And that&#8217;s it for the physics. The next part is a little trigonometry: we have to convert our radius and angle to x- and y-coordinates for the canvas. This is achieved readily with sine and cosine.</p>
<p>As an added bonus, we draw a cute little clock. See the JSFiddle for details.</p>
<h2>Results</h2>
<p>If you have a formal education in physics, you&#8217;ve probably done this very same pendulum problem by hand. You should notice that our numerical solution agrees very well with the analytical solution. For one, the mass of the pendulum has no effect on the pendulum&#8217;s period. Only the length matters! Additionally &#8212; without any damping &#8212; the pendulum always returns to its highest points on either side. Feel free to fork the code below and play around.</p>
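<p>The mass claim is easy to check outside the animation. The torque is m * g * L * cos(θ) and the moment of inertia is m * L<sup>2</sup>, so the mass cancels and α = (g / L) * cos(θ). The sketch below runs the same velocity Verlet update without any drawing and times one full swing; <code>measurePeriod()</code>, the fixed time step and the use of meters for the length are assumptions for illustration, not the demo&#8217;s code:</p>
<pre class="code">
// Time one full swing of a pendulum released 0.05 rad away from straight down
function measurePeriod(mass, length, dt) {
	var g = 9.81;
	var down = Math.PI / 2;          // straight down, with theta = 0 on the right horizontal
	var J = mass * length * length;  // point mass on a weightless rod
	var theta = down - 0.05;
	var omega = 0;
	var alpha = (mass * g * Math.cos(theta) * length) / J;
	var crossings = [];

	for (var step = 1; step < 1000000; step++)
	{
		var previous = theta;

		/* Velocity Verlet, in the same order as the loop above */
		theta += omega * dt + 0.5 * alpha * dt * dt;
		var newAlpha = (mass * g * Math.cos(theta) * length) / J;
		omega += 0.5 * (alpha + newAlpha) * dt;
		alpha = newAlpha;

		// Record each pass through "down" in the direction of increasing theta
		if (previous < down)
		{
			if (theta >= down)
			{
				crossings.push(step * dt);
				if (crossings.length === 2) { break; }
			}
		}
	}

	// Two passes in the same direction are one period apart
	return crossings[1] - crossings[0];
}

var lightPeriod = measurePeriod(1, 1, 0.001); // 1 kg on a 1 m rod
var heavyPeriod = measurePeriod(5, 1, 0.001); // 5 kg on a 1 m rod
var longPeriod = measurePeriod(1, 4, 0.001);  // 1 kg on a 4 m rod
</pre>
<p>The light and heavy pendulums come out with essentially the same period, close to the small-angle result 2π * sqrt(L / g), while the rod that is four times longer takes about twice as long per swing.</p>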
<p>If you like these articles please tell your friends, please sign up for the mailing list below, and consider <a href="https://twitter.com/bkanber" target="_blank">following me on Twitter</a>. Readers like you following the blog and subscribing to the mailing list gives me the motivation to write these articles &#8212; and potentially an e-book in the future!</p>
<p>You may also like my <a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/" target="_blank">Machine Learning in Javascript</a> series.</p>
<p><iframe id="jsfiddle" style="width: 100%; height: 700px; border: 1px solid #CCC;" src="http://jsfiddle.net/bkanber/uvYw5/embedded/result,js,css,html/" frameborder="0" width="320" height="240"></iframe></p>
<p><!-- Begin MailChimp Signup Form --></p>
<link href="http://cdn-images.mailchimp.com/embedcode/classic-081711.css" rel="stylesheet" type="text/css">
<style type="text/css">
	#mc_embed_signup{background:#fff; clear:left; font:14px Helvetica,Arial,sans-serif; }
	#mc_embed_signup form { padding: 0 }
</style>
<div id="mc_embed_signup">
<form action="http://burakkanber.us5.list-manage.com/subscribe/post?u=3269dc566d381f05852d92aa8&amp;id=73795f99d0" method="post" id="mc-embedded-subscribe-form" name="mc-embedded-subscribe-form" class="validate" target="_blank" novalidate>
<h2>Email me when new Physics in JS articles are written</h2>
<div class="mc-field-group">
	<label for="mce-EMAIL">Email Address </label><input type="email" value="" name="EMAIL" class="required email" id="mce-EMAIL">
</div>
<div class="mc-field-group input-group">
    <strong>Survey: would you buy a Physics in JS e-book? </strong><br />
    <input type="checkbox" value="1" name="group[6713][1]" id="mce-group[6713]-6713-0"><label for="mce-group[6713]-6713-0">I would pay $10 for a DRM-free e-book with tons of Physics in JS.</label>
</div>
<div id="mce-responses" class="clear">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
<div class="clear"><input type="submit" value="Subscribe" name="subscribe" id="mc-embedded-subscribe" class="button"></div>
</form>
</div>
<p><!--End mc_embed_signup--></p>
<p>The post <a href="http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/">Physics in Javascript: Rigid Bodies &#8212; Part 1 (Pendulum Clock)</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Machine Learning: Genetic Algorithms in Javascript Part 2</title>
		<link>http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/</link>
		<comments>http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/#comments</comments>
		<pubDate>Sat, 15 Sep 2012 13:06:04 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=224</guid>
		<description><![CDATA[<p>Today we're going to revisit the genetic algorithm. 

You're a scientist that has recently been framed for murder by an evil company. Before you flee the lab you have an opportunity to steal 1,000 pounds (or kilograms!) of pure elements from the chemical warehouse; your plan is to later sell them and survive off of the earnings. Given the weight and value of each element, which combination should you take to maximize the total value without exceeding the weight limit?</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/">Machine Learning: Genetic Algorithms in Javascript Part 2</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>Today we&#8217;re going to revisit the genetic algorithm. If you haven&#8217;t read <a title="Machine Learning: Genetic Algorithms Part 1 (Javascript)" href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/">Genetic Algorithms Part 1</a> yet, I strongly recommend reading that now. This article will skip over the fundamental concepts covered in part 1 &#8212; so if you&#8217;re new to genetic algorithms you&#8217;ll definitely want to start there.</p>
<p>Just looking for <a href="#fiddle">the example?</a></p>
<h2>The Problem</h2>
<p>You&#8217;re a scientist that has recently been framed for murder by an evil company. Before you flee the lab you have an opportunity to steal 1,000 pounds (or kilograms!) of pure elements from the chemical warehouse; your plan is to later sell them and survive off of the earnings.</p>
<blockquote><p>Given the weight and value of each element, which combination should you take to maximize the total value without exceeding the weight limit?</p></blockquote>
<p>This is called <a href="http://en.wikipedia.org/wiki/Knapsack_problem" target="_blank">the knapsack problem</a>. The one above is a one-dimensional problem, meaning the only constraint is weight. We could complicate matters by also considering volume, but we need to start somewhere. Note that in our version of the problem only one piece of each element is available, and each piece has a fixed weight. There are some knapsack problems where you can take unlimited platinum or up to 3 pieces of gold or something like that, but here we only have one of each available to us.</p>
<p>Why is this problem tough to solve? We&#8217;ll be using 118 elements. The brute-force approach would require that we test 2<sup>118</sup> or 3.3 * 10<sup>35</sup> different combinations of elements.</p>
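<p>To get a feel for that number, here is a quick back-of-the-envelope calculation. The rate of one billion combinations checked per second is an assumption picked for illustration, not a measurement.</p>
<pre class="code">// Every element is either in the knapsack or not, so 118 elements
// give 2^118 possible combinations.
var combinations = Math.pow(2, 118);

// Assumed (hypothetical) speed: one billion combinations per second.
var checksPerSecond = 1e9;
var secondsPerYear = 60 * 60 * 24 * 365;
var years = combinations / checksPerSecond / secondsPerYear;

console.log(combinations.toExponential(1)); // 3.3e+35
console.log(years.toExponential(1));        // 1.1e+19</pre>
<p>Even at that rate, brute force would take on the order of 10<sup>19</sup> years.</p>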
<h2>Greedy Algorithm</h2>
<p>A quick benchmark we&#8217;ll use for our solution is called the &#8220;greedy&#8221; solution. The greedy algorithm grabs the most valued items and puts them into the knapsack until it can&#8217;t fit any more.</p>
<p>Sometimes this works great. Sometimes it doesn&#8217;t. Imagine that there&#8217;s a piece of gold in the warehouse that&#8217;s valued at $1,000 but weighs 600 pounds. And there&#8217;s also a piece of cadmium that has a value of $950 but only weighs 300 pounds, and there are a bunch of other elements that have a pretty high value but reasonably light weights. The greedy algorithm still tosses gold in there, and all that precious available weight is taken up by something that&#8217;s not really worth it.</p>
<p>The &#8220;naive&#8221; greedy algorithm for our dataset will give us a total value of $3,649 with a total weight of 998 pounds.</p>
<p>You may at this point be thinking &#8220;why don&#8217;t we just figure out value <em>per pound</em> for each element and use that?&#8221; Sure, that works too! It will, in fact, work way better than the above.</p>
<p>Using that approach, the &#8220;weighted&#8221; greedy algorithm gives us a total value of $4,901 and a total weight of 969 pounds.</p>
<p>So those are our numbers to beat: we should expect to handily beat $3,649, and we&#8217;ll be happy if we also beat $4,901.</p>
<p>Why does the weighted greedy algorithm work well for this type of problem? Because it solves for the &#8220;highest value per unit weight&#8221;, and that&#8217;s very close to what we want to fill our knapsack with. However, the greedy algorithm will not perform well when there&#8217;s a large range of weights and values. That is, it will perform better if the values (and/or weights) of the elements range from $1 to $100, and worse if they range from $1 to $500.</p>
<blockquote><p>Respond in the comments: why would the greedy algorithm perform worse with a larger range of weights and values?</p></blockquote>
<p>Our GA might lose to the greedy algorithm from time to time, but that&#8217;s ok. The GA will continue to perform well as complexity increases, but the greedy algorithm will not.</p>
<p>So we&#8217;re still tackling a pretty simplified and contrived problem here, but it&#8217;s certainly more complex and useful than our &#8220;Hello, World!&#8221; from the last article. Let&#8217;s get started.</p>
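<p>For reference, the two greedy baselines described above can be sketched in a few lines. This is a minimal sketch, not the code that produced the numbers quoted earlier: the &#8216;greedy&#8217; function, its &#8216;rank&#8217; callback, and the four-element sample data are all made up for illustration. It also assumes that an item which doesn&#8217;t fit is skipped and the next one is tried.</p>
<pre class="code">// Hypothetical sample data, in the same shape as the real data set.
var sample = {
	"Gold":    { weight: 600, value: 1000 },
	"Cadmium": { weight: 300, value: 950 },
	"Boron":   { weight: 300, value: 900 },
	"Tin":     { weight: 350, value: 800 }
};

// Generic greedy fill: order the names by rank (highest first), then take
// every item that still fits under maxWeight.
function greedy(elements, maxWeight, rank) {
	var names = Object.keys(elements).sort(function(a, b) {
		return rank(elements[b]) - rank(elements[a]);
	});
	var result = { taken: [], weight: 0, value: 0 };
	names.forEach(function(name) {
		var el = elements[name];
		if (result.weight + el.weight &lt;= maxWeight) {
			result.taken.push(name);
			result.weight += el.weight;
			result.value += el.value;
		}
	});
	return result;
}

// "Naive" greedy ranks by raw value; "weighted" greedy ranks by value per pound.
var naive = greedy(sample, 1000, function(el) { return el.value; });
var weighted = greedy(sample, 1000, function(el) { return el.value / el.weight; });</pre>
<p>On this toy data the naive version grabs the heavy gold first and ends at $1,950 (900 pounds), while the per-pound ranking skips the gold and reaches $2,650 (950 pounds).</p>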
<h2>Key Differences From &#8220;Hello World&#8221;</h2>
<p>There are some major differences between our problem and the previous &#8220;Hello, World!&#8221;:</p>
<ul>
<li>We can&#8217;t use elements (genes) more than once, while &#8220;Hello, World!&#8221; can (must!) use letters more than once.</li>
<li>In the &#8220;Hello, World!&#8221; example, we knew that the string needed to be 13 characters long. Here we don&#8217;t know how many elements to take.</li>
<li>We don&#8217;t know the highest possible value (&#8220;fitness&#8221; or &#8220;score&#8221;) for this problem. It could be $4,901 like the greedy algorithm guessed, or it could be $10,000, or $23,304.</li>
</ul>
<h2>Representing the Chromosome</h2>
<p>The &#8220;Hello, World!&#8221; algorithm represented the chromosome as a string. Mutating involved randomly changing a letter. Mating (or crossover, as it&#8217;s really called) involved splicing two strings together at the halfway point. We need to do things a little differently here.</p>
<p>It turns out that representing our solution requires a little more finesse than &#8220;Hello, World!&#8221;. Since we don&#8217;t know how many elements to choose, we can&#8217;t use a fixed length string. (Just the first of our problems!)</p>
<p>Instead, I propose we use a &#8220;bitmask&#8221; of sorts. We don&#8217;t have to use an actual bitmask, but my proposal is to use a representation of <em>all</em> of the available elements, and set each &#8220;present&#8221; or not.</p>
<p>Our chromosome could look like this:</p>
<pre class="code">Helium: present
Hydrogen: not present
Lithium: not present</pre>
<p>And so on for all 118 elements. Or if you want to go the bitmask route:</p>
<pre class="code">10000011000001000100000010000010010010000</pre>
<p>Where each bit represents a single element and the value of the bit indicates whether that element is in the knapsack or not.</p>
<p>Additionally, if we were to allow more than one of each element, the representation could look like so:</p>
<pre class="code">Helium: 0
Hydrogen: 4
Lithium: 2</pre>
<p>What would <em>not</em> work well is the following:</p>
<pre class="code">In Knapsack: Helium, Lithium, Lead, Tin</pre>
<p>The above makes mating more difficult. You can make it work, but it feels like you&#8217;d be jumping through hoops to pull it off. Structure feels better for this problem.</p>
<p>We have a specific difficulty with mating: we need to make sure that even after mating and mutation we still have at most one of each element. The bitmask approach guarantees that, because every element has exactly one slot; with the list approach above, duplicates are a common pitfall.</p>
<blockquote><p>The difficulty of making sure something happens only once in a chromosome is a common one. If you&#8217;re familiar with the traveling salesman problem, it&#8217;s easy to imagine a scenario where you mate two solutions and end up visiting the same city twice. That city appeared in the first half of the first parent and the second half of the second parent &#8212; therefore appearing nowhere in one child but twice in the other child.</p></blockquote>
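<p>To make the difference concrete, here is a small sketch that splices two parents at a pivot under each representation. The &#8216;spliceAt&#8217; helper, the five-element universe, and the parent values are invented for illustration.</p>
<pre class="code">// Take the first `pivot` entries from a and the rest from b.
function spliceAt(a, b, pivot) {
	return a.slice(0, pivot).concat(b.slice(pivot));
}

// List representation: each parent lists the elements it carries.
var listA = ["Helium", "Lithium", "Lead", "Tin"];
var listB = ["Lead", "Tin", "Helium", "Boron"];
var listChild = spliceAt(listA, listB, 2);
// listChild is ["Helium", "Lithium", "Helium", "Boron"]: Helium appears twice.

// Bitmask representation: one slot per element, always in the same order.
var order = ["Helium", "Lithium", "Lead", "Tin", "Boron"];
var maskA = [1, 1, 1, 1, 0]; // same contents as listA
var maskB = [1, 0, 1, 1, 1]; // same contents as listB
var maskChild = spliceAt(maskA, maskB, 2);
// maskChild is [1, 1, 1, 1, 1]: each element is present at most once.

// Decode the mask back into names.
var names = order.filter(function(name, i) { return maskChild[i] === 1; });</pre>
<p>Whatever pivot you choose, a slot in the mask can only ever hold 0 or 1, so the &#8220;at most one of each&#8221; rule survives mating for free.</p>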
<h2>Overweight Populations</h2>
<p>For this problem we&#8217;re going to keep track of three properties of each chromosome: weight, value, and <em>score</em>.</p>
<p>Score is the same thing as value, with one difference: score accounts for the chromosome being overweight.</p>
<p>You may be tempted to throw out overweight chromosomes completely. It&#8217;s a natural instinct because overweight solutions are not acceptable solutions! But there&#8217;s a good, practical reason we don&#8217;t want to throw them out: there will be some slightly overweight (1,001 pounds) chromosomes that have very high values and just need to be &#8220;tweaked&#8221; a little to bring them within the weight range.</p>
<p>There could be a lot of potential in some of the overweight chromosomes. Rather than killing them, we&#8217;ll penalize them just enough that they still get to reproduce, but are unlikely to be the #1 pick. That&#8217;s what we&#8217;ll use &#8220;score&#8221; for. If you&#8217;re at or under the weight limit, then your score is just your total value. If you&#8217;re overweight, however, we&#8217;ll penalize you 50 points for every pound over the limit. Feel free to play with this number.</p>
<p>Evolutionarily, this &#8220;encourages&#8221; promising chromosomes to drop some weight. All they need is a little tweaking. There&#8217;s no use in throwing out a potentially strong candidate!</p>
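<p>As a quick sketch of that rule in isolation: the standalone &#8216;score&#8217; function and the sample values and weights below are hypothetical, chosen only to show the penalty at work.</p>
<pre class="code">// Score equals value unless the chromosome is over the weight limit,
// in which case every pound over the limit costs `penalty` points.
function score(value, weight, maxWeight, penalty) {
	var overweight = Math.max(0, weight - maxWeight);
	return value - overweight * penalty;
}

var fits = score(5000, 992, 1000, 50);       // 5000: under the limit, no penalty
var slightlyOver = score(5200, 1001, 1000, 50); // 5150: one pound over costs 50 points
var wayOver = score(5200, 1010, 1000, 50);   // 4700: ten pounds over costs 500 points</pre>
<p>Note how the chromosome that is one pound over still outranks the legal one, so it survives long enough to be tweaked, while the one that is ten pounds over drops behind.</p>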
<p>We&#8217;ll cover the specifics of mating and mutation and the importance of death (really called &#8220;elitism&#8221;) as we&#8217;re looking at the code.</p>
<h2>The Code</h2>
<p>First, let&#8217;s take a look at the data set. I wrote a simple PHP script that generates random weights and values (ranged 1 &#8211; 500) for each element and outputs the set as JSON. It looks something like this:</p>
<pre class="code">"Hydrogen":{
	"weight":389,
	"value":400},
"Helium":{
	"weight":309,
	"value":380},
"Lithium":{
	"weight":339,
	"value":424},
"Beryllium":{
	"weight":405,
	"value":387},
"Boron":{
	"weight":12,
	"value":174},</pre>
<p>And so on.</p>
<p>We then define three quick and easy helper functions:</p>
<pre class="code">function length(obj) {
	var length = 0;
	for (var i in obj)
		length++;
	return length;
}

function clone(obj) {
	obj = JSON.parse(JSON.stringify(obj));
	return obj;
}

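// Picks a uniformly random key in one pass (reservoir sampling):
// the nth key seen replaces the current pick with probability 1/n.
function pickRandomProperty(obj) {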
    var result;
    var count = 0;
    for (var prop in obj)
        if (Math.random() &lt; 1 / ++count)
           result = prop;
    return result;
}</pre>
<p>Plain javascript objects don&#8217;t have a &#8216;length&#8217; property the way arrays do, so we create a length() function that counts an object&#8217;s properties.</p>
<p>We create a clone function that deep-copies our element objects, so that chromosomes never share (and accidentally modify) the same element data by reference.</p>
<p>Finally, we create a function that picks a random property of an object. This is an analog to PHP&#8217;s &#8216;array_rand&#8217; function, which returns a random array key.</p>
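<p>If your target browsers support ES5&#8217;s Object.keys, the same three helpers can be written a little more compactly. This is an alternative sketch, not the code used in the fiddle; note that Object.keys only sees an object&#8217;s own enumerable properties, while the for...in loops above also walk inherited ones. The two-element &#8216;elements&#8217; object is sample data.</p>
<pre class="code">function length(obj) {
	return Object.keys(obj).length;
}

function clone(obj) {
	// JSON round trip: a deep copy for plain data (numbers, strings, nested objects).
	return JSON.parse(JSON.stringify(obj));
}

function pickRandomProperty(obj) {
	var keys = Object.keys(obj);
	return keys[Math.floor(Math.random() * keys.length)];
}

var elements = {
	"Boron":  { weight: 12, value: 174 },
	"Helium": { weight: 309, value: 380 }
};
var copy = clone(elements);
copy["Boron"].weight = 999; // changes the copy only
// length(elements) is 2, and elements["Boron"].weight is still 12</pre>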
<h2>Chromosome Functions</h2>
<pre class="code">var Chromosome = function(members) {
	this.members = members;
	for (var element in this.members)
	{
		if (typeof this.members[element]['active'] == 'undefined')
		{
			this.members[element]['active'] = Math.round( Math.random() );
		}
	}
	this.mutate();
	this.calcScore();
};

Chromosome.prototype.weight = 0;
Chromosome.prototype.value = 0;
Chromosome.prototype.members = [];
Chromosome.prototype.maxWeight = 1000;
Chromosome.prototype.mutationRate = 0.7;
Chromosome.prototype.score = 0;</pre>
<p>The chromosome constructor takes an object of &#8216;members&#8217;. In this case, we&#8217;ll either be passing our original list of elements data when we&#8217;re creating a brand new chromosome, <em>or</em> we&#8217;ll be passing in the results of a mating operation.</p>
<p>The constructor randomly activates elements if the &#8216;active&#8217; property isn&#8217;t yet defined. The end result is that this will create a random chromosome if we&#8217;re creating one from scratch, and it&#8217;ll leave a pre-configured chromosome alone.</p>
<p>The prototype also specifies some defaults. The mutationRate property is the chance that a chromosome will mutate.</p>
<pre class="code">Chromosome.prototype.mutate = function() {
	if (Math.random() &gt; this.mutationRate)
		return false;
	var element = pickRandomProperty(this.members);
	this.members[element]['active'] = Number(! this.members[element]['active']);
};</pre>
<p>The mutate method is most similar to the &#8220;Hello, World!&#8221; example. If the chromosome is to mutate then we simply pick an element at random and toggle its &#8216;active&#8217; property. I cast to Number here. It would have been more semantic to cast the Math.random() in the constructor to Boolean. I&#8217;ll ignore this, as I&#8217;ve already pasted all the code into this post.</p>
<pre class="code">Chromosome.prototype.calcScore = function() {
	if (this.score)
		return this.score;

	this.value = 0;
	this.weight = 0;
	this.score = 0;

	for (var element in this.members)
	{
		if (this.members[element]['active'])
		{
			this.value += this.members[element]['value'];
			this.weight += this.members[element]['weight'];
		}
	}

	this.score = this.value;

	if (this.weight &gt; this.maxWeight)
	{
		this.score -= (this.weight - this.maxWeight) * 50;
	}

	return this.score;
};</pre>
<p>The calcScore method starts with a tiny performance optimization: if we&#8217;ve calculated the score already, just serve the cached score &#8212; it&#8217;s just a nice way to not have to worry about at which point in the chromosome life cycle to calculate the score.</p>
<p>We then look through the elements and add up the value and weights for the active ones. We then apply a penalty of 50 points per overweight pound.</p>
<pre class="code">Chromosome.prototype.mateWith = function(other) {
	var child1 = {};
	var child2 = {};
	var pivot = Math.round( Math.random() * (length(this.members) - 1) );
	var i = 0;
	for (var element in this.members)
	{
		if (i &lt; pivot)
		{
			child1[element] = clone(this.members[element]);
			child2[element] = clone(other.members[element]);
		}
		else
		{
			child2[element] = clone(this.members[element]);
			child1[element] = clone(other.members[element]);
		}
		i++;
	}

	child1 = new Chromosome(child1);
	child2 = new Chromosome(child2);

	return [child1, child2];
};</pre>
<p>In the &#8220;Hello, World!&#8221; example we picked the center point as the pivot point when mating two chromosomes; in this example we pick a random point instead. This adds a little more randomness to the system and can help avoid local optima.</p>
<p>Once we&#8217;ve picked our pivot point we create two children by splicing the parents at the pivot and combining. We then use our chromosome constructor to generate chromosome objects and return them.</p>
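<p>One detail worth checking if you adapt this: Math.round( Math.random() * (n - 1) ) does not pick every pivot equally often, because the two end values each cover only half as much of the random range as the values in between. The sketch below feeds 1,000 evenly spaced stand-ins for Math.random() through that formula and through a Math.floor alternative, for n = 5. The &#8216;tally&#8217; helper and the sample count are illustrative choices, not part of the original code.</p>
<pre class="code">// Count how often each pivot comes up when toPivot is applied to evenly
// spaced values r in [0, 1), standing in for Math.random().
function tally(n, toPivot) {
	var counts = [];
	for (var p = 0; p &lt; n; p++) counts.push(0);
	for (var k = 0; k &lt; 1000; k++) {
		var r = (k + 0.5) / 1000;
		counts[toPivot(r, n)]++;
	}
	return counts;
}

var rounded = tally(5, function(r, n) { return Math.round(r * (n - 1)); });
var floored = tally(5, function(r, n) { return Math.floor(r * n); });
// rounded is [125, 250, 250, 250, 125]
// floored is [200, 200, 200, 200, 200]</pre>
<p>Either formula works for the GA; the floor version just spreads the pivots evenly across the chromosome.</p>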
<h2>The Population</h2>
<pre class="code">var Population = function(elements, size)
{
	if ( ! size )
		size = 20;
	this.elements = elements;
	this.chromosomes = []; // own array per instance; the prototype default below is shared
	this.size = size;
	this.fill();
};

Population.prototype.elitism = 0.2;
Population.prototype.chromosomes = [];
Population.prototype.size = 100;
Population.prototype.elements = false;</pre>
<p>The population constructor is straightforward: we give it the master list of elements and the desired population size. We also define the &#8216;elitism&#8217; parameter; this is the fraction of chromosomes (0.2 means the top 20%) that will survive from one generation to the next.</p>
<pre class="code">Population.prototype.fill = function() {
	while (this.chromosomes.length &lt; this.size)
	{
		if (this.chromosomes.length &lt; this.size / 3)
		{
			this.chromosomes.push( new Chromosome( clone(this.elements) ) );
		}
		else
		{
			this.mate();
		}
	}
};</pre>
<p>We use the fill method to initialize the population; we&#8217;ll also use it to fill the population after killing the weakest chromosomes. A little bit of logic determines whether we should create random chromosomes or fill the population through mating instead. If our population size is 20, the first 7 chromosomes will be random (the random branch runs while the count is below 20 / 3, or about 6.67) and the rest will be generated by mating. If the population ever dips below a third of its target size (perhaps due to death/elitism), new random chromosomes will be created until the population is diverse enough to create babies through mating.</p>
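<p>A note on &#8216;this.chromosomes.length&#8217; in the while condition: it has to be re-read on every pass, because the loop body grows the array, so caching the length up front would break the loop. If you expect to use a large population size, or want this to be highly optimized, keep your own counter and update it as chromosomes are added.</p>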
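<p>Tracing fill() by hand is a good way to check how many chromosomes come from each branch. The sketch below models only the chromosome <em>count</em>, not real chromosomes; &#8216;simulateFill&#8217; is a hypothetical helper written for this walkthrough.</p>
<pre class="code">// Count-only model of fill(): no real chromosomes, just the bookkeeping.
function simulateFill(size, startCount) {
	var count = startCount, random = 0, matings = 0;
	while (count &lt; size) {
		if (count &lt; size / 3) {
			count += 1;  // one new random chromosome
			random++;
		} else {
			count += 2;  // mate() appends two children
			matings++;
		}
	}
	return { random: random, matings: matings, finalCount: count };
}

var fresh = simulateFill(20, 0);     // { random: 7, matings: 7, finalCount: 21 }
var afterKill = simulateFill(20, 6); // { random: 1, matings: 7, finalCount: 21 }</pre>
<p>Because 20 / 3 is about 6.67, the random branch runs for counts 0 through 6, which is seven chromosomes. Since each mating adds two children, the population ends at 21 rather than exactly 20. The second call models a later generation: 4 elite survivors plus the 2 children from the explicit mate() call.</p>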
<pre class="code">Population.prototype.sort = function() {
	this.chromosomes.sort(function(a, b) { return b.calcScore() - a.calcScore(); });
};

Population.prototype.kill = function() {
	var target = Math.floor( this.elitism * this.chromosomes.length );
	while (this.chromosomes.length &gt; target)
	{
		this.chromosomes.pop();
	}
};</pre>
<p>The sort function above is just a helper; note that we use the calcScore method instead of accessing the &#8216;score&#8217; property directly. If the score hasn&#8217;t been calculated by this point, it will be now; if the score was already calculated we just use calcScore as an accessor.</p>
<p>After sorting, the kill method removes the weakest chromosomes from the bottom of the list by popping them until we reach our elitism value.</p>
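<p>Here is the same sort-then-kill step on a bare list of numbers, to show exactly who survives. The ten scores are made up and stand in for chromosomes; slice() is used in place of the pop() loop and does the same job.</p>
<pre class="code">var elitism = 0.2;
// Hypothetical scores standing in for ten chromosomes.
var scores = [4100, -250, 5300, 3900, 4800, 1200, 5100, 300, 4400, 2000];

// Highest score first, like Population.prototype.sort.
scores.sort(function(a, b) { return b - a; });

// Keep floor(elitism * length) survivors from the top of the list.
var target = Math.floor(elitism * scores.length);
var survivors = scores.slice(0, target);
// survivors is [5300, 5100]</pre>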
<pre class="code">Population.prototype.mate = function() {
	var key1 = pickRandomProperty(this.chromosomes);
	var key2 = key1;

	while (key2 == key1)
	{
		key2 = pickRandomProperty(this.chromosomes);
	}

	var children = this.chromosomes[key1].mateWith(this.chromosomes[key2]);
	this.chromosomes = this.chromosomes.concat(children);
};</pre>
<p>The mate method is always called after the kill method, so the only chromosomes allowed to reproduce are the elite ones in the population (in our example, the best 20%). Rather than mating only the best two chromosomes (like we did in &#8220;Hello, World!&#8221;), we pick any two random chromosomes to mate &#8212; with the exception that we won&#8217;t mate a chromosome with itself.</p>
<p>Again, this serves to add a little more randomness to the system, and will avoid stasis if the top two chromosomes remain the same for many generations &#8212; we see that happen sometimes in the &#8220;Hello, World!&#8221; example and fix it here.</p>
<pre class="code">Population.prototype.generation = function(log) {
	this.sort();
	this.kill();
	this.mate();
	this.fill();
	this.sort();
};</pre>
<p>Then we define a &#8220;generation&#8221;. The generation starts by sorting the chromosomes in terms of score. We then kill the weakest members.</p>
<p>Then we do something a little intriguing: we call mate() once and then call fill(), which we know will also call mate(). The reason we call mate() explicitly is as a bit of insurance: if the elitism parameter is less than a third (the cutoff that fill() uses) we want to mate at least once before potentially &#8220;polluting&#8221; the population with random members. This really depends on the elitism value; if you keep it above a third you don&#8217;t have to call mate() explicitly, because fill() will do that for you. But if you have elitism = 0.2 like we do, then we want to run at least one mating routine that involves only the elite, and not the new random chromosomes we introduce with fill().</p>
<p>Finally, we sort once more at the end of the generation. We could just as easily leave this part out, but it&#8217;s nice to see the chromosomes in order after every generation if you&#8217;re debugging.</p>
<h2>Running and Stopping</h2>
<p>I briefly introduced this idea in the &#8220;Hello, World!&#8221; example: we don&#8217;t know the best possible score in this problem, therefore we won&#8217;t know when to stop.</p>
<p>The technique (one of many) that we&#8217;ll use is to stop when you&#8217;ve had 100 (or 1,000 or 1,000,000) generations with no improvement. We&#8217;ll just call this the &#8220;threshold&#8221; or the &#8220;stop threshold&#8221;.</p>
<pre class="code">Population.prototype.run = function(threshold, noImprovement, lastScore, i) {
	if ( ! threshold )
		threshold = 1000;
	if ( ! noImprovement )
		noImprovement = 0;
	if ( ! lastScore )
		lastScore = false;
	if ( ! i )
		i = 0;

	if (noImprovement &lt; threshold)
	{
		lastScore = this.chromosomes[0].calcScore();
		this.generation();

		if (lastScore &gt;= this.chromosomes[0].calcScore())
		{
			noImprovement++;
		}
		else
		{
			noImprovement = 0;
		}

		i++;

		if (i % 10 == 0)
			this.display(i, noImprovement);
		var scope = this;
		setTimeout(function() { scope.run(threshold, noImprovement, lastScore, i) }, 1);

		return false;

	}
	this.display(i, noImprovement);
};</pre>
<p>The run method re-schedules itself with setTimeout rather than looping. It doesn&#8217;t need to. The only reason it does so in this example is that writing to the DOM in the middle of a fast-moving loop doesn&#8217;t work: the page doesn&#8217;t visibly update until execution is done. <a href="http://stackoverflow.com/questions/8110905/javascript-a-loop-with-innerhtml-is-not-updating-during-loop-execution">It&#8217;s just a javascript thing.</a></p>
<p>To get around the DOM limitation, we use a short setTimeout and have the run method call itself iteratively until we&#8217;re done. In general, however, this function could just use a while loop instead of calling itself &#8212; but in that case you&#8217;d either need to use console.log or just wait until the loop is done to watch the results.</p>
<p>Other than that DOM messiness, the run method is straightforward. We compare last generation&#8217;s best score to this generation&#8217;s best. If there was improvement, then we reset the &#8216;noImprovement&#8217; counter. If we have noImprovement equal to our stop threshold, we stop.</p>
<p>Not shown is a simple display method used to print the results to a table on the page. We only call it every 10 generations, and then once again when we&#8217;re done.</p>
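<p>Stripped of the DOM and setTimeout plumbing, the stopping rule can be sketched as a plain loop. Instead of running a real population, this version walks a made-up list of per-generation best scores; &#8216;stopGeneration&#8217; and the sample scores are hypothetical.</p>
<pre class="code">// bestScores[0] is the starting best score; bestScores[g] is the best score
// after generation g. Returns the generation at which the run stops.
function stopGeneration(bestScores, threshold) {
	var noImprovement = 0;
	var generation = 0;
	while (noImprovement &lt; threshold &amp;&amp; generation + 1 &lt; bestScores.length) {
		var lastScore = bestScores[generation];
		generation++;
		if (lastScore &gt;= bestScores[generation]) {
			noImprovement++;      // same or worse: count it
		} else {
			noImprovement = 0;    // improvement: reset the counter
		}
	}
	return generation;
}

var stoppedAt = stopGeneration([100, 120, 120, 150, 150, 150, 150, 180], 3);
// stoppedAt is 6</pre>
<p>Here the run stops after generation 6, three generations after the last improvement, and never sees the later jump to 180. That is the trade-off of a small threshold.</p>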
<h2>In Action</h2>
<p>Our JSFiddle is below. You&#8217;re given a slider that lets you set the stop threshold and a big &#8220;Go&#8221; button. You then get to watch the population&#8217;s evolution.</p>
<p>Even with a 10-generation stop threshold, the GA consistently beats the naive greedy algorithm, though this is to be expected.</p>
<p>At a 50-generation stop threshold, the GA also consistently beats the weighted greedy algorithm. This is a happy result. There are some datasets where the GA will never beat the greedy algorithm, and other datasets where the greedy algorithm doesn&#8217;t perform well at all. This seems like an in-between. We don&#8217;t <em>crush</em> the greedy algorithm, but we do outperform it by a significant margin.</p>
<p>The best score I&#8217;ve observed so far is 5944, with a weight of 992. Please let me know in the comments below if you find a better score, and I&#8217;ll post the updates here.</p>
<p><em>Update: </em>Sylvain Zimmer below found a solution with a score of 5968 and weight 998.</p>
<p>As always, feel free to fork and play with this example. I would encourage you to experiment with different values of population size, elitism, and mutation rate and observe the results.</p>
<p><iframe id="fiddle" style="width: 100%; height: 300px; border: 1px solid #CCC;" src="http://jsfiddle.net/bkanber/YAQzr/embedded/result,js,html,css/" frameborder="0" width="320" height="240"></iframe></p>
<p>The post <a href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/">Machine Learning: Genetic Algorithms in Javascript Part 2</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Opinion: Preparing for the Future of Technology</title>
		<link>http://burakkanber.com/blog/preparing-for-the-future-of-technology/</link>
		<comments>http://burakkanber.com/blog/preparing-for-the-future-of-technology/#comments</comments>
		<pubDate>Tue, 11 Sep 2012 14:00:00 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=7</guid>
		<description><![CDATA[<p>This post is a little different from my others, in that it&#8217;s pure unadulterated speculation. We&#8217;ve made some crazy progress in technology, particularly in the last 20 years. What will the next 20 (or even 10) years bring? And most importantly, how do we prepare ourselves for it? Right now there are engineers and scientists...</p><p>The post <a href="http://burakkanber.com/blog/preparing-for-the-future-of-technology/">Opinion: Preparing for the Future of Technology</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>This post is a little different from my others, in that it&#8217;s pure unadulterated speculation.</p>
<p>We&#8217;ve made some crazy progress in technology, particularly in the last 20 years. What will the next 20 (or even 10) years bring? And most importantly, how do we prepare ourselves for it?</p>
<p>Right now there are engineers and scientists working on what we once thought were ridiculous technologies. E-textiles that have a cell phone embedded in the fabric and beam the audio straight into your ear. Google Glass and augmented reality. Siri and speech recognition. Machine learning everywhere.</p>
<p>I feel kind of left out in the face of all of this. The lowly web software engineer. The hardware engineers are doing all the cool stuff! At first I was a little disappointed that I didn&#8217;t pursue a career in hardware, but then I realized: once the hardware guys do their job, it&#8217;s on the web developers to make it <em>useful.</em> When AR glasses become popular, web designers and engineers will have to figure out how to create a real user experience. Sure, at first all the software will be proprietary, but eventually it&#8217;ll move to the web. That&#8217;s just the way it goes. And that brings up some interesting challenges: after all, how do you browse the web in a pair of glasses?</p>
<p>The web won&#8217;t be the same as it is now. Ten years from now there won&#8217;t be keyboards or mice. They&#8217;ll eventually be replaced by a Javascript Speech Recognition API bundled with HTML7 or something. New cell phones will sit in your pocket and somehow understand gestures, and beam audio to your face through your clothes. Project Glass will have brain sensors that take thought commands &#8212; maybe. (Google would love to monetize our thoughts, I&#8217;m sure.) It sounds ridiculous, but I bet all of what I just mentioned will happen in the next decade (or two).</p>
<p>Siri can already understand your speech, and she&#8217;ll only get better at it. Eventually she&#8217;ll be able to browse Facebook for you, read your stream out loud, and take instructions. (I&#8217;m already calling Siri &#8220;she&#8221;!) But it would be silly to have to build technologies from scratch to do this. It&#8217;ll all start with proprietary tech (the iOS Facebook app), but eventually the W3C will just have HTML specifications for how Siri should read your webpage out loud. There&#8217;ll be a spec for taking speech commands and gestures.</p>
<p>What does this mean for designers and engineers? It means that if you don&#8217;t want to be left behind, you need to start <em>thinking </em>about this stuff. Start by opening up your mind to the possibilities of interface-less human/computer interaction. Every once in a while, you should be doing a thought experiment: how will people interact with websites 5, 10, or 20 years from now? Maybe your webcam will have eyeball tracking built in, and all you have to do is look at a link to follow it. Maybe physical keyboards will be replaced by on-screen keyboards, and all you have to do is look at the right letters until it can auto-complete the word you want.</p>
<p>We don&#8217;t know what the web is going to look like in the future, but we all know it&#8217;s going to change. You <em>can </em>teach an old dog new tricks, but I think we need to start mentally and creatively preparing ourselves for some sort of paradigm shift in the upcoming years.</p>
<blockquote><p>
&#8220;Grandpa, how did you used to go on Facebook?&#8221;</p>
<p>&#8220;Well, Jimmy, we had this thing called a keyboard, and we had to type in &#8216;facebook.com&#8217; and then put in our email address and password.&#8221;</p>
<p>&#8220;Keyboard?&#8221;</p>
<p>&#8220;It&#8217;s something we used to type letters into a computer.&#8221;</p>
<p>&#8220;You had to type letters?!&#8221;
</p></blockquote>
<p>Prepare yourselves.</p>
<p>The post <a href="http://burakkanber.com/blog/preparing-for-the-future-of-technology/">Opinion: Preparing for the Future of Technology</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/preparing-for-the-future-of-technology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Physics in Javascript: Car Suspension &#8211; Part 1 (Spring, Mass, Damper)</title>
		<link>http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/</link>
		<comments>http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/#comments</comments>
		<pubDate>Mon, 10 Sep 2012 04:22:30 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Modeling Physics]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=191</guid>
		<description><![CDATA[<p>Our big project -- our goal -- for this mechanics/dynamics portion of <a title="Modeling Physics in Javascript: Introduction" href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/">Modeling Physics in Javascript</a> is to model a car's suspension system. It'll take us three non-consecutive articles to get there, but it's a worthy system to model. If you can model a car suspension then it's safe to say you know how to model dynamics systems.

Today, we'll be talking about arguably the most important set of building blocks there is: the spring-mass-damper.</p><p>The post <a href="http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/">Physics in Javascript: Car Suspension &#8211; Part 1 (Spring, Mass, Damper)</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>Our big project &#8212; our goal &#8212; for this mechanics/dynamics portion of <a title="Modeling Physics in Javascript: Introduction" href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/">Modeling Physics in Javascript</a> is to model a car&#8217;s suspension system. It&#8217;ll take us three non-consecutive articles to get there, but it&#8217;s a worthy system to model. If you can model a car suspension then it&#8217;s safe to say you know how to model dynamics systems.</p>
<p>Today, we&#8217;ll be talking about arguably the most important set of building blocks there is: the spring-mass-damper.</p>
<p>The spring-mass-damper is important to learn because it&#8217;s everywhere in mechanics. Superficially, the system is just a block connected to a wall by a spring and a damper (we&#8217;ll talk about what a damper is in a bit). But as you can imagine, physics is a can of worms &#8212; and there&#8217;s a lot more to this simple system than meets the eye.</p>
<h2>The Spring and Hooke&#8217;s Law</h2>
<p>The humble spring is one of the three most important mechanical components to know how to model. Why is the spring so important? How often do you actually see a spring? Sure, they&#8217;re in lots of mechanical devices, like your car and your door lock and your clicky pen, but it turns out that they&#8217;re a lot more important than their use in mechanical devices.</p>
<p>Springs are so important because nothing in the real universe is &#8220;infinitely rigid&#8221;. Every object in the universe is a little bit springy, even if you can&#8217;t tell by eye.</p>
<p>Imagine you&#8217;re holding a stick from one end and you rest the other end against a cup on a table. You want to push the cup across the table, so you start pushing the stick with your hand. As it turns out, the cup does <em>not </em>start moving the instant you start moving your hand. There&#8217;s a tiny delay &#8212; a fraction of a second &#8212; before the cup starts moving.</p>
<p>That&#8217;s because the stick isn&#8217;t &#8220;infinitely rigid&#8221; &#8212; the molecules in the stick don&#8217;t all move at once. The stick is actually a spring! If you now re-imagine that scenario with a big, floppy spring, you can immediately see why the cup doesn&#8217;t start moving instantly: the spring has to compress a little bit first, and only when it compresses enough does the cup start moving. Additionally, the force from your end of the spring takes some time to travel down to the other end of the spring &#8212; think about wiggling a slinky back and forth. You&#8217;re not moving the whole object instantaneously, but rather you&#8217;re sending a little shockwave through the spring.</p>
<p>(By the way, the speed of that shockwave is called the &#8220;speed of sound&#8221;, and every different material has a different speed of sound.)</p>
<p>Back to the stick: it&#8217;s really a spring; a very, very stiff spring (the term &#8220;stiffness&#8221; actually refers to something slightly different, but for this article I&#8217;ll use &#8220;stiffness&#8221; and &#8220;spring constant&#8221; interchangeably). A big floppy spring isn&#8217;t very stiff, and so you can compress it quite a bit before it builds up enough force to move the cup. But imagine a stiffer spring: it needs less compression before the cup moves. Even a metal rod is a spring. You can&#8217;t see it with your eyes or feel it with your hand, but the metal rod <em>does </em>compress a little bit before the other end moves. It&#8217;s just so stiff that you can&#8217;t see the compression with the naked eye.</p>
<p>So not only can actual springs be modeled as springs (mathematically) but so can sticks and rods! They&#8217;re just really stiff springs. And you can even connect individual particles to each other by springs to make a realistic fluid or fabric particle simulator. So as I said: springs are everywhere.</p>
<p>Fortunately for us, the basic method of modeling springs is really easy. It&#8217;s governed by Hooke&#8217;s Law, which says this:</p>
<p><strong>F<sub>spring</sub></strong> = -k<strong>Δx</strong></p>
<p>The force that a spring pushes or pulls with is -k (called the &#8220;spring constant&#8221;, measured in kg / s<sup>2</sup>) times the displacement. Let&#8217;s look a little closer:</p>
<p>The displacement is how much the spring is being stretched or compressed. If a spring wants to be 10 inches long (we&#8217;ll call this &#8220;equilibrium length&#8221; or &#8220;natural length&#8221;), but it&#8217;s being crushed to 8 inches, the displacement is 2 inches.</p>
<p>The spring constant is a characteristic of the spring itself. We&#8217;ll assume that it&#8217;s a constant, but that&#8217;s not true in the real world. The value of &#8220;k&#8221; changes due to things like temperature and if the spring is stretched to an extreme, but the assumption that &#8220;k&#8221; is a constant is usually a good one. There are also springs designed to have a non-constant &#8220;k&#8221;, but we&#8217;ll avoid those too.</p>
<p>Notice the negative sign in front of k: it means the spring force <em>fights</em> the direction you&#8217;re displacing the spring. (The code later in this article folds that sign into the constant itself, which is why k is a negative number there.) If you stretch a spring, it tries to compress itself. If you compress a spring, it tries to stretch itself. The spring wants to be back to its equilibrium length, and the farther it is from its natural length, the harder it&#8217;ll push back.</p>
<p>In short: a spring <em>resists displacement from its equilibrium length.</em></p>
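<p>As a rough illustration, Hooke&#8217;s Law fits in a one-line function. This is only a sketch: <code>springForce</code> and its parameter names are made up for this example, the numbers are illustrative, and it takes a positive stiffness with the minus sign written out explicitly (the code later in this article stores the sign inside <code>k</code> instead).</p>

```javascript
// Hypothetical helper, not part of the article's demo code.
// stiffness: positive spring constant
// length / naturalLength: current and equilibrium lengths, in the same unit
function springForce(stiffness, length, naturalLength) {
  const displacement = length - naturalLength; // negative when compressed
  return -stiffness * displacement;            // points back toward equilibrium
}

// The 10-inch spring from the text, crushed to 8 inches, pushes back outward.
const compressed = springForce(20, 8, 10);
// Stretched to 12 inches, it pulls back inward with the same magnitude.
const stretched = springForce(20, 12, 10);
console.log(compressed, stretched);
```

<p>The two results have equal size and opposite sign, which is the &#8220;fights the displacement&#8221; behavior described above.</p>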
<h2>Dashpots and Viscous Damping</h2>
<p>While a spring resists displacement, a damper resists velocity. A spring wants to go back to position 0, while a damper wants to go back to velocity 0. Physically, dampers can take many forms, but the one we typically talk about is called a dashpot. The dashpot is a fluid-filled cylinder with a piston inside. You can push the piston through the fluid, but you experience considerable resistance in doing so. <a href="http://en.wikipedia.org/wiki/Dashpot" target="_blank">Check out the Wikipedia article for a nice image</a>.</p>
<p>Why are dampers so important? Just like springs, <em>actual</em> dampers are not used everywhere (in cars we call them &#8220;shock absorbers&#8221;), but the mathematical model <em>is</em> seen everywhere. Dampers often represent the internal frictional losses of a system.</p>
<p>Mathematically, if you had just a perfect mass and a perfect spring and &#8220;plucked&#8221; it in a vacuum, the mass would oscillate back and forth <em>forever</em>, always returning to the same exact two extreme points. But intuition and a lifetime of experience tell us that shouldn&#8217;t happen, even in a vacuum! The spring itself is made of metal, and when the spring flexes the molecules rub up against each other &#8212; that causes friction, and friction takes away from the energy of the sprung mass. The result is that (even in a vacuum) the spring will warm up slightly and the mass will eventually stop moving.</p>
<p>But it&#8217;s really difficult to model the internal material friction of the metal in the spring. So instead, we assume the spring is perfect (&#8220;ideal&#8221;), and add a damper to the system instead. The damper ends up representing the internal losses of the spring. The damper adds a dash of reality to the system, and will make the mass eventually stop moving just like in real life.</p>
<p>The viscous damping force equation is similar to the spring force equation:</p>
<p><strong>F<sub>damping</sub></strong> = -b<strong>v</strong></p>
<p>Instead of using the displacement like we did with Hooke&#8217;s law, we&#8217;re using the velocity. Instead of the spring constant we have -b, called the &#8220;viscous damping coefficient&#8221;, measured in kg / s. Again, we have the negative sign, which signifies that the force acts <em>opposite</em> the velocity. If you&#8217;re moving one direction, damping will slow you down by pulling in the opposite direction.</p>
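<p>Here is a small sketch of the damping equation on its own. The names <code>dampingForce</code> and <code>coast</code> are hypothetical, the numbers are arbitrary, and the damping coefficient is passed in as a positive number with the minus sign written out. It lets a mass with some initial velocity coast with nothing but a damper acting on it.</p>

```javascript
// Hypothetical helpers for illustration only.
// damping: positive viscous damping coefficient, in kg / s
function dampingForce(damping, velocity) {
  return -damping * velocity; // always opposes the current velocity
}

// Step a free mass forward in time with only the damper acting on it.
function coast(initialVelocity, damping, mass, dt, steps) {
  let v = initialVelocity;
  for (let i = 0; i < steps; i++) {
    const a = dampingForce(damping, v) / mass;
    v += a * dt;
  }
  return v;
}

// 10 units/s to start, b = 0.5 kg/s, m = 0.5 kg, 500 steps of 0.01 s.
const vEnd = coast(10, 0.5, 0.5, 0.01, 500);
console.log(vEnd);
```

<p>The velocity shrinks toward zero without ever changing sign: the damper only ever removes motion, it never adds any.</p>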
<h2>Resonant Frequency</h2>
<p>Together, the spring, mass and damper elements form what&#8217;s called a harmonic oscillator, and that system has some interesting properties: most notably, a &#8220;resonant frequency&#8221; (or natural frequency). If you&#8217;ve ever whacked a knife against a table, or tapped a wine glass with a fork and listened to it ring, you&#8217;ve listened to a resonant frequency. Because of the springiness of all materials, every object in the world has some frequency that it &#8220;likes&#8221;. If you tap the object, it&#8217;ll resonate at its so-called &#8220;natural frequency&#8221;.</p>
<p>Additionally, if by some external means you wiggle the object at that very frequency, you can cause it to vibrate <em>wildly.</em> That&#8217;s what happens when an opera singer breaks a glass with her voice &#8212; she&#8217;s singing that frequency and therefore &#8220;exciting&#8221; the glass at its natural frequency. More impressive is the <a href="http://www.youtube.com/watch?v=3mclp9QmCGs" target="_blank">Tacoma Narrows Bridge Collapse Disaster</a>, popularly described as the wind exciting the bridge&#8217;s natural frequency (engineers today attribute the failure to aeroelastic flutter, a closely related self-reinforcing oscillation). Jump to 1 minute for the oscillation, 3 minutes for the collapse.</p>
<p>If you know the spring stiffness and mass of a spring-mass system, you can calculate the natural frequency. The natural frequency is</p>
<p><em>f</em><sub>n</sub> = ( 1 / (2π) ) * sqrt( k / m )</p>
<p>If there&#8217;s a damper attached to the system, the mass oscillates at a slightly lower frequency (the &#8220;damped natural frequency&#8221;).</p>
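<p>To put numbers on that, here is a minimal sketch. The function names are hypothetical, and the damped-frequency formula (using the damping ratio ζ = b / (2 * sqrt(k * m))) is the standard textbook result rather than something derived in this article. The inputs are the magnitudes of the stiffness, mass and damping used in the experiment near the end of this article.</p>

```javascript
// Hypothetical helpers; all inputs are positive magnitudes.
function naturalFrequency(stiffness, mass) {
  return Math.sqrt(stiffness / mass) / (2 * Math.PI); // in Hz
}

function dampedFrequency(stiffness, mass, damping) {
  // Damping ratio: 1 means critically damped.
  const zeta = damping / (2 * Math.sqrt(stiffness * mass));
  if (zeta >= 1) return 0; // critically damped or overdamped: no oscillation
  return naturalFrequency(stiffness, mass) * Math.sqrt(1 - zeta * zeta);
}

const undamped = naturalFrequency(15, 0.2);
const damped = dampedFrequency(15, 0.2, 0.3);
console.log(undamped.toFixed(3), damped.toFixed(3));
```

<p>Both values land a little under 1.4 Hz, with the damped one very slightly lower, which is why the experiment below finds resonance near 1.5 Hz.</p>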
<p>If you use external motion to excite a spring-mass-damper system (called forced vibration), the following effects occur:</p>
<ul>
<li>If the external frequency is lower than the natural frequency, the mass will move more or less with the external &#8220;shaker&#8221;</li>
<li>If the external frequency is near the natural frequency, the mass will move much farther from end to end than the external shaker (think of pushing somebody on a swing), but the mass will also be 90 degrees out of phase with the shaker.</li>
<li>If the external frequency is higher than the natural frequency, the mass will move less far than the shaker, and will also be 180 degrees out of phase with the shaker (ie, moving in the opposite direction).</li>
</ul>
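<p>The three cases above can be estimated ahead of time. The sketch below uses the standard steady-state result for a mass driven through a spring and damper by a moving base (the &#8220;displacement transmissibility&#8221;). That formula comes from linear vibration theory, not from this article, and <code>amplitudeRatio</code> is a hypothetical name. Inputs are positive magnitudes.</p>

```javascript
// Ratio of the mass's steady-state amplitude to the shaker's amplitude.
// Standard result for base excitation of a linear spring-mass-damper.
function amplitudeRatio(stiffness, damping, mass, frequencyHz) {
  const w = 2 * Math.PI * frequencyHz; // angular frequency, rad / s
  const springTerm = stiffness;
  const damperTerm = damping * w;
  const inertiaTerm = mass * w * w;
  return Math.hypot(springTerm, damperTerm) /
         Math.hypot(springTerm - inertiaTerm, damperTerm);
}

// Stiffness 15, damping 0.3, mass 0.2: the values used in the experiment below.
const slow = amplitudeRatio(15, 0.3, 0.2, 0.5);
const nearResonance = amplitudeRatio(15, 0.3, 0.2, 1.5);
const fast = amplitudeRatio(15, 0.3, 0.2, 3.0);
console.log(slow.toFixed(2), nearResonance.toFixed(2), fast.toFixed(2));
```

<p>Under these assumptions the ratio is a bit above 1 for the slow shaker, several times larger near resonance, and well below 1 for the fast shaker, matching the three bullets.</p>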
<p>Let&#8217;s build the code and play with this, so we can finally see these effects in action!</p>
<h2>The Code</h2>
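<pre class="code">/* Spring constant, in kg / s^2. Negative because the minus sign from Hooke's Law is folded in. */
var k = -20;
/* Equilibrium (natural) length of the spring, in the same units as x. */
var spring_length = 180;

/* Damping constant, in kg / s. Negative for the same reason as k. */
var b = -0.5;

/* Block and wall state: x is position, v is velocity.
   The wall also tracks its previous position (lx), elapsed time (t) and shaking frequency in Hz. */
var block = {x: 100, v: 0, mass: 0.5};
var wall  = {x: 30,  lx: 30, v: 0, t: 0, frequency: 0};</pre>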
<p>Some of the usual setup above. We create a block object and a wall object. We set a spring_length (for the spring&#8217;s equilibrium length), and initialize the spring constant and the damping constant.</p>
<p>Inside our loop:</p>
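<pre class="code">var loop = function() {
	/* Move the wall. frameRate is used here as the time step per frame. */
	wall.t += frameRate;
	wall.lx = wall.x; /* remember the previous position */
	wall.x = 30 + 70 * Math.sin(2 * Math.PI * wall.frequency * wall.t);
	/* Wall velocity by finite difference. */
	wall.v = (wall.x - wall.lx) / frameRate;

	/* Loop continues .... */</pre>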
<p>In our example, the spring and damper are connected to a moveable wall on one end and a block on the other end. The code above moves the wall according to the sine function.</p>
<pre class="code">	/* ... continued from above . */

	/* Move the block. */
	if ( ! mouse.isDown )
	{
		var F_spring = k * ( (block.x - wall.x) - spring_length );
		var F_damper = b * ( block.v - wall.v );

		var a = ( F_spring + F_damper ) / block.mass;
		block.v += a * frameRate;
		block.x += block.v * frameRate;
	}</pre>
<p>There&#8217;s not too much magic here. We learned how to use forces in the <a title="Modeling Physics in Javascript: Gravity and Drag" href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/">Gravity and Drag</a> article before this one.</p>
<p>We first calculate the spring force by figuring out the spring displacement: the distance between the wall and block at this moment minus the equilibrium length of the spring. Then we multiply that by the spring constant, which is a negative number.</p>
<p>The damping force is calculated similarly, except using the velocities of the wall and block instead of position.</p>
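<p>If you want to try the update step without any drawing code, here is a stripped-down sketch that runs in any Javascript console. It reuses the constants and the force calculation from above, but keeps the wall still and picks a time step of 1/40 s. That time step, the <code>step</code> and <code>simulate</code> names, and the 60-second run length are assumptions made for this example, not values taken from the demo.</p>

```javascript
// Same constants as the article's setup (signs folded into k and b).
const k = -20;
const b = -0.5;
const springLength = 180;
const dt = 1 / 40; // assumed time step, in seconds

// One update of the block, mirroring the loop body shown above.
function step(block, wall) {
  const fSpring = k * ((block.x - wall.x) - springLength);
  const fDamper = b * (block.v - wall.v);
  const a = (fSpring + fDamper) / block.mass;
  block.v += a * dt;
  block.x += block.v * dt;
  return block;
}

// "Pluck" the block by starting it away from equilibrium, then let it ring down.
function simulate(seconds) {
  const block = {x: 100, v: 0, mass: 0.5};
  const wall = {x: 30, v: 0}; // held still in this sketch
  const steps = Math.round(seconds / dt);
  for (let i = 0; i < steps; i++) step(block, wall);
  return block;
}

const rest = simulate(60);
console.log(rest.x, rest.v);
```

<p>After a minute of simulated time the block should have settled very close to x = 210, which is the wall position (30) plus the spring&#8217;s natural length (180).</p>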
<p>Not shown is a bunch of code to draw the wall and block. There&#8217;s also some mouse handling: in this example, you can grab the block with the mouse.</p>
<p>In order to get a good feel for what goes on with spring-mass-damper systems, please try the following procedure:</p>
<ul>
<li>Grab the block with the mouse, pull it to the side, and let go. The block will oscillate and die out.</li>
<li>Increase stiffness to -60 kg / s<sup>2</sup>, decrease damping to 0 and pluck the block. The block will oscillate (almost) forever. It won&#8217;t go forever because of the mathematical solving technique we&#8217;re using &#8212; more on that next week!</li>
<li>Increase damping to -10 kg / s, pluck the block. There should be no oscillation.</li>
<li>Set the following: stiffness = -15, damping = -0.3, mass = 0.2</li>
<li>Set the frequency to 0.5 Hz &#8212; observe that the mass moves more or less with the wall</li>
<li>Set the frequency to 1.5 Hz &#8212; observe the wild resonant oscillation</li>
<li>Set the frequency to 3.0 Hz &#8212; observe that the mass moves less than the wall, and also in the opposite direction</li>
</ul>
<p>What does this mean for a car suspension? Imagine the wall as the road and wheel, and the mass as the rest of the car. If you drive slowly over big, loping bumps, the whole car moves up and down. If you drive quickly over rapid bumps, the tire will bounce up and down a lot but the car will stay more or less in place. </p>
<p>Next week we&#8217;ll be talking about different methods of solving our equations, and hopefully get something that feels a little smoother.</p>
<p>Here&#8217;s the JSFiddle:</p>
<p><iframe style="width: 100%; height: 450px; border: 1px solid #CCC;" src="http://jsfiddle.net/bkanber/pDngH/embedded/result,js,html,css/" frameborder="0"></iframe></p>
<p>The post <a href="http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/">Physics in Javascript: Car Suspension &#8211; Part 1 (Spring, Mass, Damper)</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Effective Teaching is a Long Con</title>
		<link>http://burakkanber.com/blog/effective-teaching-is-a-long-con/</link>
		<comments>http://burakkanber.com/blog/effective-teaching-is-a-long-con/#comments</comments>
		<pubDate>Sat, 08 Sep 2012 22:17:48 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=182</guid>
		<description><![CDATA[<p>If you wish to make an apple pie from scratch, you must first invent the universe.
- Carl Sagan

The above Carl Sagan quote is why most smart people are bad teachers. Smart, impassioned people know the whole story behind what they're teaching. Smart people like to think about the whole story; they run it through their heads over and over in the shower, while they're cooking, and on the subway. Smart people like to tell other people the whole story, and end up re-inventing the universe when all they set out to do was teach a lesson on a single topic. </p><p>The post <a href="http://burakkanber.com/blog/effective-teaching-is-a-long-con/">Effective Teaching is a Long Con</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<blockquote><p>If you wish to make an apple pie from scratch, you must first invent the universe.</p>
<p>- Carl Sagan</p></blockquote>
<p>The above Carl Sagan quote is why most smart people are bad teachers. Smart, impassioned people <em>know the whole story</em> behind what they&#8217;re teaching. Smart people like to think about the whole story; they run it through their heads over and over in the shower, while they&#8217;re cooking, and on the subway. Smart people like to tell other people the whole story, and end up re-inventing the universe when all they set out to do was teach a lesson on a single topic.</p>
<p>(Of course, not all smart people are bad teachers. Being smart and being a good teacher are two separate skills. While being smart certainly can help a teacher&#8217;s skills, one cannot get by as a teacher on being smart alone.)</p>
<p>Telling the whole story is frustrating to students; it&#8217;s too much to take in all at once. The smart person tends to forget that they learned the whole story over years of intense learning. It&#8217;s hard for the smart person to let go of pieces of the story &#8212; to consciously omit them &#8212; even if they&#8217;re not of immediate importance.</p>
<p>I wrote my first Physics in Javascript article on <a title="Modeling Physics in Javascript: Gravity and Drag" href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/">Gravity and Drag</a> the other day. Part of the article talks briefly about calculus and the derivative. I love physics and I love calculus so it was very tempting to tell the whole story. I wish I could have gone in-depth about what integration <em>really is</em>, I wish I could have talked about numerical ODE solvers and how the solver I implemented is a first-order ODE solver called Euler&#8217;s method&#8230; and what that all means. But it would have been the wrong thing to do.</p>
<p>If you try to tell the whole story you&#8217;ll end up writing a whole damned textbook and lose your audience in the process. You&#8217;ll alienate the people that matter the most: the students. Students are people trying to learn this stuff <em>for the first time </em>and it&#8217;s easy for a teacher to lose sight of that.</p>
<h2>Effective Teaching is a Long Con</h2>
<p>Knowing what <em>not</em> to teach is just as important as knowing what to teach. If you need to introduce a concept to complete a lesson, then do so &#8212; but don&#8217;t get pulled into the beauty of the deeper meaning that <em>you </em>understand, because your students don&#8217;t understand that stuff yet.</p>
<p>Instead, I strive to have my students <em>accidentally learn stuff</em>. Your lessons or lectures or articles should omit the bits that aren&#8217;t of immediate importance, but leave a little trail of breadcrumbs in the process. Drop little clues about those tangential topics you want your students to come to understand. Eventually your students will piece it all together.</p>
<p>&#8220;&#8230; this isn&#8217;t the only way to do this; this is called Euler&#8217;s method but there are some more sophisticated methods&#8230;&#8221;</p>
<p>&#8220;&#8230; electrons orbit the nucleus &#8212; well, almost&#8230; Heisenberg&#8217;s uncertainty principle tells us we can&#8217;t know exactly what they&#8217;re doing &#8230;&#8221;</p>
<p>&#8220;&#8230; this is called a linked list &#8212; there&#8217;s also a doubly-linked list, but we don&#8217;t need that today &#8230;&#8221;</p>
<p>It&#8217;s hard to accept that many of your students will simply ignore the breadcrumbs you drop, but it&#8217;s still the right thing to do. Not going into too much detail keeps your lessons accessible. In future lessons you&#8217;ll leave more breadcrumbs and eventually your students will discover a little puzzle you&#8217;ve left for them.</p>
<p>Why is this my favorite way to teach? The most engaged students will go and research that stuff on their own, and the least engaged students won&#8217;t get bogged down by extraneous information. Worst-case scenario, a student ends up leaving with exactly what you taught them. Best-case, a student leaves the class with <em>more </em>than you gave them. And often enough, your students will subconsciously piece together the subplot you&#8217;ve laid out &#8212; without even doing the extracurricular research themselves.</p>
<p>The breadcrumb method works because we love solving problems and decoding patterns (see Raph Koster&#8217;s &#8220;A Theory of Fun for Game Design&#8221; for some excellent discussion) &#8212; so when the student pieces it together they end up <em>delighted </em>by what they&#8217;ve learned. That&#8217;s more effective than spoon-feeding them the information, and creates a generation of excited learners in the process.</p>
<p>It&#8217;s a long con. After months of lessons, you&#8217;ve tricked your students into learning more than they bargained for. Most students won&#8217;t even know you conned them. The smart alecks will even challenge you at times: &#8220;you left out ___ !&#8221; or &#8220;you forgot about friction!&#8221; But that&#8217;s ok &#8212; you&#8217;re not supposed to impress them. You&#8217;re supposed to guide and teach the people who need you to keep <em>their </em>best interests in mind, and you need to do that by whatever means necessary.</p>
<p>It sounds strange, but the morally responsible thing to do is carefully and meticulously plan to con your students. With knowledge. Please don&#8217;t actually con them.</p>
<p>The post <a href="http://burakkanber.com/blog/effective-teaching-is-a-long-con/">Effective Teaching is a Long Con</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/effective-teaching-is-a-long-con/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Machine Learning in JS: k-nearest-neighbor Introduction</title>
		<link>http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/</link>
		<comments>http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/#comments</comments>
		<pubDate>Fri, 07 Sep 2012 18:16:53 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=152</guid>
		<description><![CDATA[<p>Today we're going to look at the k-nearest-neighbor algorithm (we'll abbreviate it kNN throughout this article). I love this algorithm because it's dead simple, but can still solve some exciting problems. Its strength doesn't lie in mathematical or theoretical sophistication, but rather in the fact that it can elegantly handle lots of different input parameters ("dimensions").</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/">Machine Learning in JS: k-nearest-neighbor Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em>This article is part of the <a title="Machine Learning in Javascript: Introduction" href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/">Machine Learning in Javascript</a> series. My goal is to teach ML from fundamental to advanced topics using a common language. Javascript is an excellent choice because it requires no special environment to run, and its lack of ML libraries forces us to learn and write the code from scratch.</em></p>
<p>Just looking for the example code? The <a href="#fiddle">JSFiddle is at the bottom of the page</a>.</p>
<p>Today we&#8217;re going to look at the k-nearest-neighbor algorithm (we&#8217;ll abbreviate it kNN throughout this article). I love this algorithm because it&#8217;s dead simple, but can still solve some exciting problems. Its strength doesn&#8217;t lie in mathematical or theoretical sophistication, but rather in the fact that it can elegantly handle lots of different input parameters (&#8220;dimensions&#8221;).</p>
<p>k-nearest-neighbor also serves an ulterior motive of mine: it&#8217;s a great way to introduce the concept of &#8220;supervised learning&#8221;. So let&#8217;s go ahead and build the k-nearest-neighbor algorithm. And we&#8217;ll graph our results, because I love visuals!</p>
<h2>Supervised Learning</h2>
<p>There are two giant umbrella categories within Machine Learning: supervised and unsupervised learning. Briefly, <em>un</em>supervised learning is like data mining &#8212; look through some data and see what you can pull out of it. Typically, you don&#8217;t have much additional information before you start the process. We&#8217;ll be looking at an unsupervised learning algorithm called k-means next week, so we can discuss that more in the next article.</p>
<p>Supervised learning, on the other hand, starts with &#8220;training data&#8221;. Supervised learning is what we do as children as we learn about the world. We&#8217;re in the kitchen with mom. Mom shows us an apple and says the word &#8220;apple&#8221;. You see an object and your mother has labeled it for you.</p>
<p>The next day, she shows you a different apple. It&#8217;s smaller, it&#8217;s less red, and it has a slightly different shape. But your mother shows it to you and says the word &#8220;apple&#8221;. This process repeats for a couple of weeks. Every day your mother shows you a slightly different apple, and tells you they&#8217;re apples. Through this process you come to understand what an apple is.</p>
<p>Not every apple is exactly the same. Had you only ever seen one apple in your life, you might assume that every apple is <em>identical</em>. But instead, your mother has trained you to recognize the overall features of an apple. You&#8217;re now able to create this category, or label, of objects in your mind. When you see an apple in the future you can recognize it as an apple because you&#8217;ve come to understand that while all apples share some features, they don&#8217;t have to be identical to still be apples.</p>
<p>This is called &#8220;generalization&#8221; and is a very important concept in supervised learning algorithms. We would be useless if we couldn&#8217;t recognize that an iPhone was an iPhone because it had a different case, or because it had a scratch across the screen.</p>
<p>When we build certain types of ML algorithms we therefore need to be aware of this idea of generalization. Our algorithms should be able to generalize but not <em>over</em>-generalize (easier said than done!). We want &#8220;this is red and kind of round and waxy, it must be an apple&#8221; and <em>not </em>&#8220;this is red and round, it must be a ball; this other thing is orange and round, it must be a ball.&#8221; That&#8217;s overgeneralization and can be a problem. Of course, under-generalization is a problem too. This is one of the main difficulties with ML: being able to find the generalization sweet spot. There are some tests you can run as you&#8217;re training an algorithm to help you find the sweet spot, but we&#8217;ll talk about those in the future when we get to more advanced algorithms.</p>
<p>Many supervised learning problems are &#8220;classification&#8221; problems. The classification problem goes like this: there&#8217;s a bucket of apples, oranges, and pears. Each piece of fruit has a sticker that tells you which fruit it is &#8212; except one! Figure out which fruit the mystery fruit is by learning from the other fruits you&#8217;re given.</p>
<p>The classification problem is usually very easy for humans, but tough for computers. kNN is one of many classification algorithms.</p>
<h2>Features</h2>
<p>This is a great time to introduce another important aspect of ML: features. Features are what you break an object down into as you&#8217;re processing it for ML. If you&#8217;re looking to determine whether an object is an apple or an orange, for instance, you may want to look at the following features: shape, size, color, waxiness, surface texture, etc. It also turns out that sometimes an individual feature ends up not being helpful. Apples and oranges are roughly the same size, so that feature can probably be ignored and save you some computational overhead. In this case, size doesn&#8217;t really add new information to the mix.</p>
<p>Knowing what features to look for is an important skill when designing for ML algorithms. Sometimes you can get by on intuition, but most of the time you&#8217;ll want to use a separate algorithm to determine which are the most important features of a data set (we&#8217;ll talk about that in a much more advanced article).</p>
<p>As you can imagine, features aren&#8217;t always as straightforward as &#8220;color, size, shape&#8221;. Processing documents is a tricky example. In some scenarios, each <em>word</em> in a document is an individual feature. Or maybe each pair of consecutive words (&#8220;bigrams&#8221;) is a feature. We&#8217;ll talk about document classification in a future article as well.</p>
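<p>To make the document example concrete, here is a rough sketch of turning a piece of text into word features and bigram features. The function names and the very simple tokenizing rule (lowercase, split on anything that isn&#8217;t a letter, digit or apostrophe) are assumptions for illustration. Real document classifiers usually do more careful tokenizing.</p>

```javascript
// Hypothetical helpers for illustration only.
// Split text into lowercase words; each word can serve as one feature.
function words(text) {
  return text.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);
}

// Pair each word with the word that follows it ("bigrams").
function bigrams(text) {
  const w = words(text);
  const pairs = [];
  for (let i = 0; i + 1 < w.length; i++) {
    pairs.push(w[i] + " " + w[i + 1]);
  }
  return pairs;
}

console.log(words("The quick brown fox"));
console.log(bigrams("The quick brown fox"));
```

<p>A four-word sentence yields four word features but only three bigrams, since each bigram needs a following word.</p>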
<h2>The Problem</h2>
<p>Given the number of rooms and area (in square feet) of a type of dwelling, figure out if it&#8217;s an apartment, house, or flat.</p>
<p>As always, we&#8217;re starting with the most contrived possible problem in order to learn the basics. The description of this problem has <em>given </em>us the features we need to look at: number of rooms, and square feet. We can also assume that, since this is a supervised learning problem, we&#8217;ll be given a handful of example apartments, houses, and flats.</p>
<h2>What &#8220;k-nearest-neighbor&#8221; Means</h2>
<p>I think the best way to teach the kNN algorithm is to simply define what the phrase &#8220;k-nearest-neighbor&#8221; actually means.</p>
<p>Here&#8217;s a table of the example data we&#8217;re given for this problem:</p>
<table style="width:100%">
<thead>
<tr><th>Rooms</th><th>Area (sq ft)</th><th>Type</th></tr>
</thead>
<tbody>
<tr><td>1</td><td>350</td><td>apartment</td></tr>
<tr><td>2</td><td>300</td><td>apartment</td></tr>
<tr><td>3</td><td>300</td><td>apartment</td></tr>
<tr><td>4</td><td>250</td><td>apartment</td></tr>
<tr><td>4</td><td>500</td><td>apartment</td></tr>
<tr><td>4</td><td>400</td><td>apartment</td></tr>
<tr><td>5</td><td>450</td><td>apartment</td></tr>
<tr><td>7</td><td>850</td><td>house</td></tr>
<tr><td>7</td><td>900</td><td>house</td></tr>
<tr><td>7</td><td>1200</td><td>house</td></tr>
<tr><td>8</td><td>1500</td><td>house</td></tr>
<tr><td>9</td><td>1300</td><td>house</td></tr>
<tr><td>8</td><td>1240</td><td>house</td></tr>
<tr><td>10</td><td>1700</td><td>house</td></tr>
<tr><td>9</td><td>1000</td><td>house</td></tr>
<tr><td>1</td><td>800</td><td>flat</td></tr>
<tr><td>3</td><td>900</td><td>flat</td></tr>
<tr><td>2</td><td>700</td><td>flat</td></tr>
<tr><td>1</td><td>900</td><td>flat</td></tr>
<tr><td>2</td><td>1150</td><td>flat</td></tr>
<tr><td>1</td><td>1000</td><td>flat</td></tr>
<tr><td>2</td><td>1200</td><td>flat</td></tr>
<tr><td>1</td><td>1300</td><td>flat</td></tr>
</tbody>
</table>
<p>We&#8217;re going to plot the above as points on a graph in two dimensions, using number of rooms as the x-axis and the area as the y-axis.</p>
<p>When we inevitably run into a new, unlabeled data point (&#8220;mystery point&#8221;), we&#8217;ll put that on the graph too. Then we&#8217;ll pick a number (called &#8220;k&#8221;) and just find the &#8220;k&#8221; closest points on the graph to our mystery point. If the majority of the points close to the new point are &#8220;flats&#8221;, then we&#8217;ll guess that our mystery point is a flat.</p>
<p>That&#8217;s what k-nearest-neighbor means. &#8220;If the 3 (or 5 or 10, or &#8216;k&#8217;) nearest neighbors to the mystery point are two apartments and one house, then the mystery point is an apartment.&#8221;</p>
<p>Here&#8217;s the (simplified) procedure:</p>
<ul>
<li>Put all the data you have (including the mystery point) on a graph.
<li>Measure the distances between the mystery point and every other point.
<li>Pick a number. Three is usually good for small data sets.
<li>Figure out what the three closest points to the mystery point are.
<li>The most common type among those three closest points is the answer.
</ul>
<p>If you&#8217;re having trouble visualizing this, please take a quick break to scroll down to the bottom of the page and run the JS fiddle. That should illustrate the concept. Then come back up here and continue reading!</p>
<h2>The Code</h2>
<p>Let&#8217;s start building this thing. There are a few fine points that will come out as we&#8217;re implementing the algorithm, so please read the following carefully. If you skim, you&#8217;ll miss out on another important concept!</p>
<p>I&#8217;ll build objects of two different classes for this algorithm: the &#8220;Node&#8221;, and a &#8220;NodeList&#8221;. A Node represents a single data point from the set, whether it&#8217;s a pre-labeled (training) point, or an unknown point. The NodeList manages all the nodes and also does some canvas drawing to graph them all.</p>
<p>The Node constructor has nothing to it. It just expects an object with the properties &#8220;type&#8221;, &#8220;area&#8221;, and &#8220;rooms&#8221;:</p>
<pre class="code">
var Node = function(object) {
	for (var key in object)
	{
		this[key] = object[key];
	}
};
</pre>
<p>When I build these algorithms &#8220;for realsies&#8221; I usually abstract the features a little bit more. This example requires &#8220;area&#8221; and &#8220;rooms&#8221; to be hard-coded in certain places, but I usually build a generalized kNN algorithm that can work with arbitrary features rather than our pre-defined ones here. I&#8217;ll leave that as an exercise for you!</p>
<p>Similarly, the NodeList constructor is simple:</p>
<pre class="code">
var NodeList = function(k) {
	this.nodes = [];
	this.k = k;
};
</pre>
<p>The NodeList constructor takes the &#8220;k&#8221; from k-nearest-neighbor as its sole argument. Please fork the JSFiddle code and experiment with different values of k. Don&#8217;t forget to try the number 1 as well! </p>
<p>Not shown is a simple NodeList.prototype.add(node) function &#8212; that just takes a node and pushes it onto the this.nodes array. </p>
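<p>For completeness, here&#8217;s a minimal sketch of what that add function might look like. The constructor is repeated so the snippet runs on its own, and plain objects stand in for Nodes; the version in the fiddle may differ in its details:</p>
<pre class="code">
var NodeList = function(k) {
	this.nodes = [];
	this.k = k;
};

/* Push a single node onto the list. */
NodeList.prototype.add = function(node) {
	this.nodes.push(node);
};

var list = new NodeList(3);
list.add({rooms: 1, area: 350, type: 'apartment'});
list.add({rooms: 7, area: 850, type: 'house'});
console.log(list.nodes.length); // 2
</pre>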
<p>At this point, we could dive right in to calculating distances, but I&#8217;d like to take a quick diversion.</p>
<h2>Normalizing Features</h2>
<p>Look at the data in the table above. The number of rooms varies from 1 to 10, and the area ranges from 250 to 1700. What would happen if we tried to graph this data onto a chart (without scaling anything)? For the most part, the data points would be lined up in a vertical column. That&#8217;ll look pretty ugly, and hard to read.</p>
<p>Unfortunately, that&#8217;s not just an aesthetic problem. The real problem is a large discrepancy in scale between our two features. The difference between 1 room and 10 rooms is <em>huge</em> when you consider what it means for classifying a &#8220;flat&#8221; vs a &#8220;house&#8221;! But the same difference of 9, when you&#8217;re talking about square feet, is <em>nothing</em>. If you were to measure distances between Nodes right now without adjusting for that discrepancy, you&#8217;d find that the number of rooms would have almost no effect on the results because those points are so close together on the x-axis (the &#8220;rooms&#8221; axis).</p>
<p>Consider the difference between a dwelling with 1 room and 700 square feet and 5 rooms and 700 square feet. Looking at the data table by eye, you&#8217;d recognize the first to be a flat and the second to be an apartment. But if you were to graph these and run kNN, it would consider them both to be flats. </p>
<p>So instead of looking at absolute values of number of rooms and area, we should normalize these values to be between 0 and 1. After normalization, the lowest number of rooms (1) becomes 0, and the largest number of rooms (10) becomes 1. Similarly, the smallest area (250) becomes 0 and the largest area (1700) becomes 1. That puts everything on the same playing field and will adjust for discrepancies of scale. It&#8217;s a simple thing to do that makes all the difference in the world.</p>
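<p>To see the effect with real numbers, here&#8217;s a small standalone sketch that classifies the 5-room, 700-square-foot dwelling twice: once on raw values and once with each difference divided by its feature&#8217;s range. It uses the table data from above, but the nearestTypes helper and its arguments are made up for this illustration and aren&#8217;t part of the Node/NodeList code:</p>
<pre class="code">
/* Training data from the table above: [rooms, area, type] */
var rows = [
	[1, 350, 'apartment'], [2, 300, 'apartment'], [3, 300, 'apartment'], [4, 250, 'apartment'],
	[4, 500, 'apartment'], [4, 400, 'apartment'], [5, 450, 'apartment'],
	[7, 850, 'house'], [7, 900, 'house'], [7, 1200, 'house'], [8, 1500, 'house'],
	[9, 1300, 'house'], [8, 1240, 'house'], [10, 1700, 'house'], [9, 1000, 'house'],
	[1, 800, 'flat'], [3, 900, 'flat'], [2, 700, 'flat'], [1, 900, 'flat'],
	[2, 1150, 'flat'], [1, 1000, 'flat'], [2, 1200, 'flat'], [1, 1300, 'flat']
];

/* Return the types of the k nearest rows, closest first.
 * roomsScale and areaScale are what each difference gets divided by. */
var nearestTypes = function(rooms, area, k, roomsScale, areaScale) {
	var measured = rows.map(function(row) {
		var dr = (row[0] - rooms) / roomsScale;
		var da = (row[1] - area) / areaScale;
		return {type: row[2], distance: Math.sqrt(dr*dr + da*da)};
	});
	measured.sort(function(a, b) { return a.distance - b.distance; });
	return measured.slice(0, k).map(function(m) { return m.type; });
};

/* Raw values: dividing by 1 leaves rooms and square feet untouched. */
var raw = nearestTypes(5, 700, 3, 1, 1);

/* Normalized: divide by each range (10 - 1 = 9 rooms, 1700 - 250 = 1450 sq ft). */
var normalized = nearestTypes(5, 700, 3, 9, 1450);

console.log(raw);        // [ 'flat', 'flat', 'house' ]
console.log(normalized); // [ 'apartment', 'apartment', 'apartment' ]
</pre>
<p>On raw values the area swamps everything, so the nearest neighbors are simply the dwellings with the most similar square footage. Once both features are on the same scale, the room count gets a say and the guess flips to apartment.</p>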
<blockquote><p>Pro-tip: you don&#8217;t need to scale things evenly (into a square) like I described above. If area is more important to the problem than the number of rooms, you can scale those two features differently &#8212; this is called &#8220;weighting&#8221;, and gives more importance to one feature or another. There are also algorithms that will determine the ideal feature weights for you. All in due time&#8230;</p></blockquote>
<p>To start normalizing our data we should give NodeList a way of finding the minimum and maximum values of each feature:</p>
<pre class="code">
NodeList.prototype.calculateRanges = function() {
	this.areas = {min: 1000000, max: 0};
	this.rooms = {min: 1000000, max: 0};
	for (var i in this.nodes)
	{
		if (this.nodes[i].rooms < this.rooms.min)
		{
			this.rooms.min = this.nodes[i].rooms;
		}

		if (this.nodes[i].rooms > this.rooms.max)
		{
			this.rooms.max = this.nodes[i].rooms;
		}

		if (this.nodes[i].area < this.areas.min)
		{
			this.areas.min = this.nodes[i].area;
		}

		if (this.nodes[i].area > this.areas.max)
		{
			this.areas.max = this.nodes[i].area;
		}
	}

};
</pre>
<p>As I mentioned earlier, the best approach would be to abstract the features and not have areas or rooms hard-coded. But doing it this way reads a little more clearly to me. </p>
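<p>If you do want to try the more abstract version, here&#8217;s one way it could look. This is only a sketch: the calculateFeatureRanges name and the idea of passing in an array of feature names are assumptions made for illustration, not something taken from the fiddle:</p>
<pre class="code">
/* Find the min and max of every named feature across a list of plain objects. */
var calculateFeatureRanges = function(nodes, features) {
	var ranges = {};
	features.forEach(function(feature) {
		ranges[feature] = {min: Infinity, max: -Infinity};
	});
	nodes.forEach(function(node) {
		features.forEach(function(feature) {
			ranges[feature].min = Math.min(ranges[feature].min, node[feature]);
			ranges[feature].max = Math.max(ranges[feature].max, node[feature]);
		});
	});
	return ranges;
};

var sample = [
	{rooms: 1, area: 350},
	{rooms: 10, area: 1700},
	{rooms: 4, area: 250}
];
var ranges = calculateFeatureRanges(sample, ['rooms', 'area']);
console.log(ranges.rooms); // { min: 1, max: 10 }
console.log(ranges.area);  // { min: 250, max: 1700 }
</pre>
<p>Starting from Infinity and -Infinity also avoids having to guess a &#8220;big enough&#8221; starting minimum like 1000000.</p>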
<p>Now that we have our minimum and maximum values, we can move along to the meat and berries of the algorithm. After we&#8217;ve added all our Nodes to the NodeList:</p>
<pre class="code">
NodeList.prototype.determineUnknown = function() {

	this.calculateRanges();

	/*
	 * Loop through our nodes and look for unknown types.
	 */
	for (var i in this.nodes)
	{

		if ( ! this.nodes[i].type)
		{
			/*
			 * If the node is an unknown type, clone the nodes list and then measure distances.
			 */
			
			/* Clone nodes */
			this.nodes[i].neighbors = [];
			for (var j in this.nodes)
			{
				if ( ! this.nodes[j].type)
					continue;
				this.nodes[i].neighbors.push( new Node(this.nodes[j]) );
			}

			/* Measure distances */
			this.nodes[i].measureDistances(this.areas, this.rooms);

			/* Sort by distance */
			this.nodes[i].sortByDistance();

			/* Guess type */
			console.log(this.nodes[i].guessType(this.k));

		}
	}
};
</pre>
<p>That&#8217;s a mouthful. First off, we calculate the min and max ranges so that the NodeList is aware of them.</p>
<p>We then loop through the Nodes and look for any unknown nodes (yes, this can do more than one mystery node at a time). </p>
<p>When we find an unknown Node, we clone the known Nodes in the NodeList and hand those clones to the unknown Node as its neighbors. We do this because each unknown Node has to calculate its own distance to every other Node, and that distance is stored on the neighbor itself, so two unknown Nodes can&#8217;t share the same neighbor objects.</p>
<p>Finally, we call three Node methods in succession on the unknown Node: measureDistances, sortByDistance, and guessType.</p>
<pre class="code">
Node.prototype.measureDistances = function(area_range_obj, rooms_range_obj) {
	var rooms_range = rooms_range_obj.max - rooms_range_obj.min;
	var area_range  = area_range_obj.max  - area_range_obj.min;

	for (var i in this.neighbors)
	{
		/* Just shortcut syntax */
		var neighbor = this.neighbors[i];

		var delta_rooms = neighbor.rooms - this.rooms;
		delta_rooms = (delta_rooms ) / rooms_range;

		var delta_area  = neighbor.area  - this.area;
		delta_area = (delta_area ) / area_range;

		neighbor.distance = Math.sqrt( delta_rooms*delta_rooms + delta_area*delta_area );
	}
};
</pre>
<p>The measureDistances function takes the NodeList&#8217;s set of ranges for min and max rooms and areas. If you&#8217;ve abstracted away from hard-coding features, the argument to this method would just be an array of ranges for features, but here we&#8217;ve hardcoded it. </p>
<p>We quickly calculate the rooms_range (here, it&#8217;ll be 9) and area_range (here, it&#8217;ll be 1450).</p>
<p>Then we loop through our Node&#8217;s neighbors (we gave this Node the neighbors property above when we cloned the Nodes from NodeList). For each one of the neighbors we calculate the difference in number of rooms and area. After each of those calculations, we normalize by dividing each feature by its range.</p>
<p>If the difference between the number of rooms turns out to be 3 and the total range is 9, we end up with 0.333 for the value of that delta. This number should always be between -1 and +1, since we&#8217;ve normalized to a square for this problem.</p>
<p>Finally, we calculate the distance using the Pythagorean theorem. Note that if you have more than 2 features (dimensions), you still keep the Math.sqrt and just add up all the squared differences, like so:</p>
<pre class="code">
Math.sqrt( a*a + b*b + c*c + d*d + ... + z*z );
</pre>
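<p>Written as a reusable function, that might look something like the sketch below. The euclideanDistance name is hypothetical, and it assumes both points arrive as equal-length arrays of feature values that have already been normalized:</p>
<pre class="code">
/* Euclidean distance between two points given as equal-length arrays. */
var euclideanDistance = function(a, b) {
	var sum = 0;
	for (var i = 0; i < a.length; i++)
	{
		var delta = a[i] - b[i];
		sum += delta * delta;
	}
	return Math.sqrt(sum);
};

console.log(euclideanDistance([0, 0], [3, 4]));             // 5
console.log(euclideanDistance([1, 2, 3, 4], [1, 2, 3, 4])); // 0
console.log(euclideanDistance([0, 0, 0, 0], [1, 1, 1, 1])); // 2
</pre>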
<p>It should be pretty clear now what the true strength of this algorithm is. While our feeble minds may only be able to figure out correlations involving 5 or 10 different features, this algorithm can do that for hundreds or thousands of dimensions.</p>
<pre class="code">
Node.prototype.sortByDistance = function() {
	this.neighbors.sort(function (a, b) {
		return a.distance - b.distance;
	});
};
</pre>
<p>The sortByDistance function above is just a helper that sorts the Node&#8217;s neighbors from closest to farthest.</p>
<pre class="code">
Node.prototype.guessType = function(k) {
	var types = {};

	for (var i in this.neighbors.slice(0, k))
	{
		var neighbor = this.neighbors[i];

		if ( ! types[neighbor.type] )
		{
			types[neighbor.type] = 0;
		}

		types[neighbor.type] += 1;
	}

	var guess = {type: false, count: 0};
	for (var type in types)
	{
		if (types[type] > guess.count)
		{
			guess.type = type;
			guess.count = types[type];
		}
	}

	this.guess = guess;

	return types;
};
</pre>
<p>The final piece of the algorithm is the guessType method. This method accepts the value of &#8220;k&#8221; from NodeList and slices out the k closest neighbors. It then tallies up the types of those neighbors, picks the most common one, and stores it on the Node as this.guess. The return value is the full tally of types, which is what determineUnknown logs to the console.</p>
<p>Algorithm finished. Congratulations! Now let&#8217;s figure out how to graph this thing.</p>
<h2>Graphing it With Canvas</h2>
<p>Graphing our results is pretty straightforward, with a few exceptions:</p>
<ul>
<li>We&#8217;ll color code apartments = red, houses = green, flats = blue.</li>
<li>We have to scale the graph into a square (for the same reasons we had to normalize the features).</li>
<li>We&#8217;ll also add a little bit of padding to the edges of the graph.</li>
<li>We&#8217;ll display the result of the kNN guess by drawing the radius that encompasses the &#8220;k&#8221; nearest neighbors, and coloring that radius with the result&#8217;s color.</li>
</ul>
<pre class="code">
NodeList.prototype.draw = function(canvas_id) {
	var rooms_range = this.rooms.max - this.rooms.min;
	var areas_range = this.areas.max - this.areas.min;

	var canvas = document.getElementById(canvas_id);
	var ctx = canvas.getContext("2d");
	var width = 400;
	var height = 400;
	ctx.clearRect(0,0,width, height);

	for (var i in this.nodes)
	{
		ctx.save();

		switch (this.nodes[i].type)
		{
			case 'apartment':
				ctx.fillStyle = 'red';
				break;
			case 'house':
				ctx.fillStyle = 'green';
				break;
			case 'flat':
				ctx.fillStyle = 'blue';
				break;
			default:
				ctx.fillStyle = '#666666';
		}

		var padding = 40;
		var x_shift_pct = (width  - padding) / width;
		var y_shift_pct = (height - padding) / height;

		var x = (this.nodes[i].rooms - this.rooms.min) * (width  / rooms_range) * x_shift_pct + (padding / 2);
		var y = (this.nodes[i].area  - this.areas.min) * (height / areas_range) * y_shift_pct + (padding / 2);
		y = Math.abs(y - height);


		ctx.translate(x, y);
		ctx.beginPath();
		ctx.arc(0, 0, 5, 0, Math.PI*2, true);
		ctx.fill();
		ctx.closePath();
		

		/* 
		 * Is this an unknown node? If so, draw the radius of influence
		 */

		if ( ! this.nodes[i].type )
		{
			switch (this.nodes[i].guess.type)
			{
				case 'apartment':
					ctx.strokeStyle = 'red';
					break;
				case 'house':
					ctx.strokeStyle = 'green';
					break;
				case 'flat':
					ctx.strokeStyle = 'blue';
					break;
				default:
					ctx.strokeStyle = '#666666';
			}

			var radius = this.nodes[i].neighbors[this.k - 1].distance * width;
			radius *= x_shift_pct;
			ctx.beginPath();
			ctx.arc(0, 0, radius, 0, Math.PI*2, true);
			ctx.stroke();
			ctx.closePath();

		}

		ctx.restore();

	}

};
</pre>
<p>I won&#8217;t describe the above in detail, but rather point out two lines that I want you to look at and figure out on your own:</p>
<ul>
<li>The line starting &#8220;var x = &#8221;
<li>The line starting &#8220;var radius = &#8221;
</ul>
<p>It&#8217;s a good but potentially frustrating exercise to try and figure out what&#8217;s going on in those two lines, but please don&#8217;t give up until you understand them!</p>
<p>Finally, we&#8217;ll throw some code together to create random mystery points and do that every 5 seconds:</p>
<pre class="code">
var run = function() {
	var nodes = new NodeList(3);
	for (var i in data)
	{
		nodes.add( new Node(data[i]) );
	}
	var random_rooms = Math.round( Math.random() * 10 );
	var random_area = Math.round( Math.random() * 2000 );
	nodes.add( new Node({rooms: random_rooms, area: random_area, type: false}) );

	nodes.determineUnknown();
	nodes.draw("canvas");
};

window.onload = function() {
	setInterval(run, 5000);
	run();
};
</pre>
<p>If you want to play with the value of &#8220;k&#8221;, do it in the run() function above.</p>
<p>The kNN algorithm isn&#8217;t the most sophisticated classifier there is, but it has some excellent uses. What&#8217;s even more exciting is that you don&#8217;t <em>have</em> to use kNN as a classifier; the concepts behind kNN are very flexible and can be used for non-classification problems. Consider the following problems that kNN might be a good candidate for (list includes both classification problems and otherwise):</p>
<ul>
<li>What named color (blue, red, green, yellow, grey, purple, etc) is a given RGB value closest to? This is useful if you have an image search application and want to &#8220;search for purple images&#8221; for instance.</li>
<li>You&#8217;re building a dating site and want to rank matches for a given profile. The features could be location, age, height, and weight, for instance. Pick the 20 closest neighbors and rank them by kNN distance.</li>
<li>Quickly find 5 documents similar to a given document. Features are words in this case. Not the most sophisticated algorithm to solve this problem, but it works in a pinch.</li>
<li>Given data for previous e-shoppers, figure out if the person currently browsing your site is likely to make a purchase. Features could be time of day, number of pages browsed, location, referral source, etc.</li>
</ul>
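<p>As a taste of the first idea, here&#8217;s a rough sketch of naming a color with k = 1. The palette and its RGB values are hand-picked for illustration, and nearestColorName is a made-up helper. All three channels already share the same 0 to 255 range, so no normalization is needed here:</p>
<pre class="code">
/* A tiny, hand-picked palette. These RGB values are illustrative choices. */
var palette = [
	{name: 'red',    rgb: [255, 0, 0]},
	{name: 'green',  rgb: [0, 128, 0]},
	{name: 'blue',   rgb: [0, 0, 255]},
	{name: 'yellow', rgb: [255, 255, 0]},
	{name: 'purple', rgb: [128, 0, 128]},
	{name: 'grey',   rgb: [128, 128, 128]}
];

/* kNN with k = 1: return the name of the single closest palette color. */
var nearestColorName = function(rgb) {
	var best = null;
	var bestDistance = Infinity;
	palette.forEach(function(color) {
		var dr = color.rgb[0] - rgb[0];
		var dg = color.rgb[1] - rgb[1];
		var db = color.rgb[2] - rgb[2];
		var distance = Math.sqrt(dr*dr + dg*dg + db*db);
		if (distance < bestDistance)
		{
			bestDistance = distance;
			best = color.name;
		}
	});
	return best;
};

console.log(nearestColorName([150, 20, 140])); // purple
console.log(nearestColorName([10, 10, 200]));  // blue
</pre>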
<h2>Drawbacks, caveats</h2>
<p>There are two issues with kNN I&#8217;d like to briefly point out. First of all, it should be pretty clear that if your training data is all over the place, this algorithm won&#8217;t work well. The data needs to be &#8220;separable&#8221;, or clustered somehow. Random speckles on the graph are no help, and very few ML algorithms can discern patterns from nearly-random data.</p>
<p>Secondly, you&#8217;ll run into performance problems if you have thousands and thousands of Nodes. Calculating all those distances adds up! One way around this is to pre-filter out Nodes outside of a certain feature&#8217;s range. For example, if our mystery point has # of rooms = 3, we might not even calculate distances for points with # of rooms > 6 at all. </p>
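<p>A pre-filter along those lines could be sketched like this. The prefilterByRooms helper and the cutoff of 3 rooms are assumptions for the sake of the example; in practice you&#8217;d tune the cutoff to your data:</p>
<pre class="code">
/* Keep only candidates whose room count is within maxDelta of the mystery point. */
var prefilterByRooms = function(candidates, mystery, maxDelta) {
	return candidates.filter(function(candidate) {
		return Math.abs(candidate.rooms - mystery.rooms) <= maxDelta;
	});
};

var candidates = [
	{rooms: 1, area: 800, type: 'flat'},
	{rooms: 4, area: 500, type: 'apartment'},
	{rooms: 7, area: 850, type: 'house'},
	{rooms: 10, area: 1700, type: 'house'}
];
var mystery = {rooms: 3, area: 600};

var nearby = prefilterByRooms(candidates, mystery, 3);
console.log(nearby.length); // 2
</pre>
<p>Only the surviving candidates need their distances measured. The trade-off is that an overly tight cutoff can leave you with fewer than k neighbors, so it&#8217;s worth checking for that before guessing.</p>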
<p>Here&#8217;s the JSFiddle. The known points are colored, the mystery point is grey, and the result of the kNN guess is signified by the colored radius around the mystery point. The radius encompasses the 3 nearest neighbors. A new random mystery point is classified every 5 seconds:</p>
<p><iframe id="fiddle" style="width: 100%; height: 510px; border:1px solid #CCC;" src="http://jsfiddle.net/bkanber/hevFK/embedded/result,js,html,css/" allowfullscreen="allowfullscreen" frameborder="0"></iframe></p>
<p>The post <a href="http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/">Machine Learning in JS: k-nearest-neighbor Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Modeling Physics in Javascript: Gravity and Drag</title>
		<link>http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/</link>
		<comments>http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/#comments</comments>
		<pubDate>Thu, 06 Sep 2012 02:20:56 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Modeling Physics]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=109</guid>
		<description><![CDATA[<p>Yes, there&#8217;s a JSFiddle to play with at the bottom of the page! This is the first post in the Modeling Physics in Javascript series. As such, we&#8217;ll need to cover some introductory material. There are some of you who have real training in physics and calculus. You may be tempted to scream &#8220;that&#8217;s not...</p><p>The post <a href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/">Modeling Physics in Javascript: Gravity and Drag</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em>Yes, there&#8217;s a JSFiddle to play with at the bottom of the page!</em></p>
<p>This is the first post in the <a title="Modeling Physics in Javascript: Introduction" href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/">Modeling Physics in Javascript</a> series. As such, we&#8217;ll need to cover some introductory material. There are some of you who have real training in physics and calculus. You may be tempted to scream &#8220;that&#8217;s not the whole story&#8221; at me when I explain some concepts, and my response to you is this: I wish I could cover all the amazing parts of physics and calculus in one blog post, but I can&#8217;t. I&#8217;ll therefore cover only the bits I need. Hopefully, the people reading this will accidentally learn physics while trying to pick up some cool JS tricks.</p>
<h2>The Problem</h2>
<p>Model a bouncy ball, under the influence of gravity, which can bounce off of horizontal or vertical walls. Oh, and it experiences air drag.</p>
<p>Not too shabby for day 1! We&#8217;ll first talk about Newton&#8217;s 2nd Law, forces, gravity, and a touch of calculus before we can start coding.</p>
<h2>Newton&#8217;s 2nd Law</h2>
<p>Newton&#8217;s 2nd Law of Motion basically <em>is</em> physics. It describes how forces interact with macroscopic objects. If you&#8217;ve ever seen a ball fly through the air, or a car moving, or something rolling down a hill, or if you&#8217;ve ever <em>felt</em> anything, you know (indirectly) about Newton&#8217;s 2nd Law.</p>
<p>Newton&#8217;s 2nd Law looks like this: F = ma.</p>
<p>Well, not quite. It looks more like this: <strong>F<sub>net</sub></strong> = m<strong>a</strong>.</p>
<p>As you can imagine, physics has a manner of starting simple and then getting very complicated very quickly. The subscript &#8220;net&#8221; means that the &#8220;F&#8221; doesn&#8217;t just refer to a single force, but the sum total of all forces on the object. The bold &#8220;F&#8221; and &#8220;a&#8221; mean that those variables (force and acceleration) are vectors &#8212; meaning that they exist separately in 3 different directions (X, Y, and Z).</p>
<p>But that&#8217;s not too bad. If you don&#8217;t want to do vector math, you can simply rewrite <strong>F<sub>net</sub></strong> = m<strong>a</strong> as the following three separate equations:</p>
<ul>
<li>F<sub>x, net</sub> = ma<sub>x</sub></li>
<li>F<sub>y, net</sub> = ma<sub>y</sub></li>
<li>F<sub>z, net</sub> = ma<sub>z</sub></li>
</ul>
<p>You may think that having three equations to worry about makes things tougher, but trust me, it doesn&#8217;t. Just remember that you have to do the F = ma equation for each direction. We&#8217;ll very rarely use the Z direction, so all we really have to worry about is what happens in the X and Y directions (left/right and up/down).</p>
<p>Let&#8217;s just jump into a simple example. You have a ball with a mass of 2 kg, and there are three forces acting on it:</p>
<ul>
<li>2 Newtons pulling right</li>
<li>4 N pulling left</li>
<li>1 N pulling down</li>
</ul>
<p>We combine the left and right forces to get our overall F<sub>x, net</sub> = -2 N. (4 to the left and 2 to the right leaves us with 2 left over pulling to the left. The number is negative because in general, we&#8217;ll consider &#8220;right&#8221; to be positive and &#8220;left&#8221; to be negative.)</p>
<p>And since there&#8217;s only one force acting in the Y direction, the F<sub>y, net</sub> is just -1 N. (Let&#8217;s say &#8220;up&#8221; is positive and &#8220;down&#8221; is negative &#8212; but <strong>only for now</strong>. We&#8217;ll switch that up later. In general, you use positive and negative signs in a way that makes sense for your specific problem.)</p>
<p>Finally, since our ball has a mass of 2 kg, we can substitute our numbers into F = ma and get the following two equations:</p>
<ul>
<li>-2 N = 2 kg * a<sub>x</sub></li>
<li>-1 N = 2 kg * a<sub>y</sub></li>
</ul>
<p>So far, so good. Truth be told, we&#8217;re really interested in the accelerations here, so we&#8217;ll just rearrange:</p>
<ul>
<li>a<sub>x</sub> = -2 N / 2 kg</li>
<li>a<sub>y</sub> = -1 N / 2 kg</li>
</ul>
<p>And then I&#8217;ll tell you that when you divide &#8220;Newtons&#8221; by &#8220;kilograms&#8221; you get &#8220;meters per second-squared&#8221;:</p>
<ul>
<li>a<sub>x</sub> = -1 m / s<sup>2</sup></li>
<li>a<sub>y</sub> = -0.5 m / s<sup>2</sup></li>
</ul>
<p>So we had a situation where we know the mass of an object and the forces acting on it. We added up the forces (assigning negative signs to &#8220;left&#8221; and &#8220;down&#8221;, causing those forces to subtract instead), and rearranged to solve for acceleration &#8212; and we did this for both the X and the Y directions separately. In this example, the ball is accelerating down and to the left, but it&#8217;s accelerating more to the left than it is downwards.</p>
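<p>The same bookkeeping can be sketched in a few lines of Javascript. The variable names here are just for illustration; right and up are positive, as in the example:</p>
<pre class="code">
/* Forces as x/y components in Newtons; right and up are positive. */
var forces = [
	{x:  2, y:  0}, // 2 N pulling right
	{x: -4, y:  0}, // 4 N pulling left
	{x:  0, y: -1}  // 1 N pulling down
];
var mass = 2; // kg

/* Sum the components to get the net force in each direction. */
var net = {x: 0, y: 0};
forces.forEach(function(force) {
	net.x += force.x;
	net.y += force.y;
});

/* F = ma, rearranged to a = F / m, applied to each direction separately. */
var acceleration = {x: net.x / mass, y: net.y / mass};

console.log(net);          // { x: -2, y: -1 }
console.log(acceleration); // { x: -1, y: -0.5 }
</pre>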
<p>(More advanced: sometimes you don&#8217;t know the X and Y components of a force. It&#8217;s possible that you only know about <em>one</em> force that&#8217;s angled at, say, 30 degrees up from right. In that case, you can use cosine and sine to figure out the X and Y pieces of that force respectively. We&#8217;ll do that in a future article.)</p>
<h2>Gravity</h2>
<p>Gravity is one of the universal fundamental interactions, along with the electromagnetic, strong, and weak forces. Gravity causes a force to act on every single object from every other single object, but fortunately everything works out to simplify nicely if you&#8217;re on the surface of a planet. On Earth, for instance, we don&#8217;t even need to worry about the <em>force </em>that gravity causes, we can just look at the end result: every single object on Earth experiences an additional downward acceleration of 9.8 m / s<sup>2</sup>. That simple. </p>
<p>If our ball from above (that we already solved for) is on Earth and experiencing gravity, then we just need to modify the a<sub>y</sub> equation slightly. We&#8217;ll take the downward acceleration that the force is causing and combine it with the downward acceleration that gravity causes:</p>
<p>a<sub>y</sub> = -0.5 m / s<sup>2</sup> &#8211; 9.8 m / s<sup>2</sup></p>
<p>That leaves us with an acceleration of</p>
<p>a<sub>y</sub> = -10.3 m / s<sup>2</sup></p>
<p>As you can see, the downward force combined with the acceleration from gravity shoots the ball downward even faster than either gravity or the force alone could.</p>
<p>So, gravity is simple (if you&#8217;re on a planet). All you do is modify the a<sub>y</sub> acceleration to factor in the downward pull of gravity. </p>
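<p>In code, that adjustment is a single subtraction. A tiny sketch, with variable names invented for the example and &#8220;down&#8221; still treated as negative:</p>
<pre class="code">
var g = 9.8;            // m / s^2, downward acceleration due to gravity on Earth
var forceAccelY = -0.5; // m / s^2, from the 1 N downward force on the 2 kg ball

/* "Down" is negative here, so gravity is subtracted. */
var ay = forceAccelY - g;

console.log(ay.toFixed(1)); // -10.3
</pre>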
<p>In a future article we&#8217;ll look at how to model gravity if you&#8217;re <em>not</em> on a planet (perhaps you <em>are</em> a planet), but that doesn&#8217;t come until later&#8230;</p>
<h2>Aerodynamic Drag</h2>
<p>Our ball experiences gravity, but as I mentioned, we don&#8217;t need to model that as a force; we just plug it directly into the acceleration result. Drag, however, <em>is</em> a force, and we&#8217;ll have to model it.</p>
<p>The equation (one of the equations) for aerodynamic drag looks like this:</p>
<p><strong>F<sub>D</sub></strong> = -0.5 * C<sub>D</sub> * A * &#961; * <strong>v<sup>2</sup></strong></p>
<p>It looks complicated, but when we break it down, it&#8217;s pretty simple. First off, notice the bold characters. Like F = ma above, a bold variable means it&#8217;s actually a vector &#8212; so right off the bat we can split this into two equations:</p>
<ul>
<li>F<sub>D, x</sub> = -0.5 * C<sub>D</sub> * A * &#961; * v<sub>x</sub><sup>2</sup></li>
<li>F<sub>D, y</sub> = -0.5 * C<sub>D</sub> * A * &#961; * v<sub>y</sub><sup>2</sup></li>
</ul>
<p>And then we look at each one of those terms above:</p>
<ul>
<li>C<sub>D</sub> is the &#8220;coefficient of drag&#8221;, which is influenced by the shape of the object (and a little bit by its material). For a ball, this is 0.47, and is a dimensionless quantity.</li>
<li>A is the frontal area or frontal projection of the object. If you look at a silhouette of the object from the front, this is the area of that shape. For a ball, the frontal area is just the area of a circle, or &#960; r<sup>2</sup>.</li>
<li>&#961; (Greek letter rho) is the density of the fluid the ball is in. If our ball&#8217;s in air, this value is 1.22 (kg / m<sup>3</sup>)</li>
<li>Velocity squared &#8212; since we&#8217;re looking at this in two directions separately, we use the X velocity and the Y velocity respectively.</li>
<li>Note the -0.5 at the beginning. The negative sign, with the fact that the equation uses velocity, indicates that this force pushes <em>in the opposite direction the ball is moving</em> at all times. One catch: the squared velocity is always positive, so taken literally the per-direction equations above would always give a negative force. To keep the force opposite the motion when the velocity itself is negative, treat v<sup>2</sup> as v * |v|, which has the same size but keeps the sign of the velocity.</li>
</ul>
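<p>Here&#8217;s a rough sketch of the drag equation for a single direction. One detail to watch: squaring the velocity throws away its sign, so this version multiplies v by its absolute value instead, which keeps the force pointed against the motion. The dragForce name and the 15 cm radius are assumptions for the example:</p>
<pre class="code">
var Cd = 0.47;     // coefficient of drag for a ball (dimensionless)
var rho = 1.22;    // density of air, kg / m^3
var radius = 0.15; // ball radius in meters (an assumed value for this example)
var A = Math.PI * radius * radius; // frontal area, m^2

/* Drag along one axis. Using v * |v| instead of v * v keeps the sign of the
 * velocity, so the leading minus always points the force against the motion. */
var dragForce = function(v) {
	return -0.5 * Cd * A * rho * v * Math.abs(v);
};

console.log(dragForce(10) < 0);  // true: moving in +x, drag pushes toward -x
console.log(dragForce(-10) > 0); // true: moving in -x, drag pushes toward +x
</pre>
<p>Doubling the speed quadruples the force, which is why drag barely matters for a slow ball and dominates for a fast one.</p>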
<h2>A Touch of Calculus</h2>
<p>Calculus, like physics, is amazing and has a wonderful depth that I can&#8217;t do justice in a blog post. </p>
<p>The &#8220;derivative&#8221; in calculus describes how something changes as something else is changing; often this will be called the &#8220;rate of change&#8221;. When you drive a car at 30 MPH, your position is changing by 30 miles every hour. Your position changes as time changes. It can then be said that velocity is the &#8220;derivative of position with respect to time&#8221; or simply the &#8220;time derivative of position&#8221;. </p>
<p>Then we can think about what happens when you speed up or slow down (accelerate). You might change your <em>velocity</em> by 5 MPH per hour (MPHPH?). In that sense, your velocity is changing with time, and you can say that acceleration is the time derivative of velocity.</p>
<p>So it starts with position. The derivative of position is velocity. And the derivative of velocity is acceleration. (The derivative of acceleration is called &#8220;jerk&#8221;, and the derivative of jerk is called &#8220;jounce&#8221;.)</p>
<p>Why is this relevant? Because if you know the acceleration of something (5 meters per second per second), and if you know how fast it&#8217;s going when you start looking at it (let&#8217;s say it&#8217;s not moving at all), you can figure out <em>where</em> it will be at every moment in the future.</p>
<p>An example: your ball is accelerating at 2 m / s<sup>2</sup> (meters per second per second). Let&#8217;s say it&#8217;s not moving at all when you start looking at it, and that the starting point is called x = 0:</p>
<table style="width:100%;">
<thead>
<tr>
<th>Time</th>
<th>Accel</th>
<th>Velocity</th>
<th>Position</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 s</td>
<td>2 m / s<sup>2</sup></td>
<td>0 m / s</td>
<td>0 m</td>
</tr>
<tr>
<td>1 s</td>
<td>2 m / s<sup>2</sup></td>
<td>2 m / s</td>
<td>0 m</td>
</tr>
<tr>
<td>2 s</td>
<td>2 m / s<sup>2</sup></td>
<td>4 m / s</td>
<td>2 m</td>
</tr>
<tr>
<td>3 s</td>
<td>2 m / s<sup>2</sup></td>
<td>6 m / s</td>
<td>6 m</td>
</tr>
<tr>
<td>4 s</td>
<td>2 m / s<sup>2</sup></td>
<td>8 m / s</td>
<td>12 m</td>
</tr>
<tr>
<td>5 s</td>
<td>2 m / s<sup>2</sup></td>
<td>10 m / s</td>
<td>20 m</td>
</tr>
<tr>
<td>6 s</td>
<td>2 m / s<sup>2</sup></td>
<td>12 m / s</td>
<td>30 m</td>
</tr>
</tbody>
</table>
<p>And so on. This is the approach we&#8217;ll use when solving for the motion of our ball, except we&#8217;ll do it not once per second but 40 times per second. All we&#8217;re doing is using our knowledge of the forces to figure out the acceleration at every frame. Then we use the acceleration and current velocity to figure out the new velocity. And then we use the velocity and last position to find the current position.</p>
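<p>To make the table above concrete, here is a minimal sketch that reproduces it row by row. The names eulerTable and dt are made up for illustration and do not appear in the fiddle. Like the table, it moves the position using the velocity from the start of each step; the simulation code further down updates the velocity first and then uses the new velocity, which is a small variation on the same idea.</p>
<pre class="code">
// Step a constant acceleration forward in time, recording one row per step.
// Assumed for illustration: the ball starts at rest at x = 0.
function eulerTable(a, dt, steps) {
	var rows = [];
	var v = 0; // m / s
	var x = 0; // m
	for (var i = 0; i <= steps; i++) {
		rows.push({time: i * dt, accel: a, velocity: v, position: x});
		x += v * dt; // position uses the velocity from the start of the step
		v += a * dt; // velocity uses the acceleration
	}
	return rows;
}

var rows = eulerTable(2, 1, 6);
console.log(rows[6]); // the 6 s row of the table: 12 m / s and 30 m
</pre>
<p>The exact answer for this example is x = t<sup>2</sup>, or 36 m after 6 seconds, so one-second steps fall a bit short at 30 m. Calling eulerTable(2, 1/40, 240) covers the same 6 seconds in 1/40 s steps and lands at about 35.85 m, which is why we run the loop many times per second.</p>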
<h2>Writing the Code</h2>
<p>Time to dive in. I won&#8217;t reproduce all the code in snippets, because there&#8217;s some stuff that has nothing to do with physics. You&#8217;ll be able to see the full script at the bottom of the page in the JSFiddle.</p>
<pre class="code">
var frameRate = 1/40; // Seconds
var frameDelay = frameRate * 1000; // ms
var loopTimer = false;

var ball = {
	position: {x: width/2, y: 0},
	velocity: {x: 10, y: 0},
	mass: 0.1, //kg
	radius: 15, // 1px = 1cm
	restitution: -0.7
};

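var Cd = 0.47; // Drag coefficient of a sphere (dimensionless)
var rho = 1.22; // Density of air, kg / m^3
var A = Math.PI * ball.radius * ball.radius / (10000); // Frontal area, cm^2 converted to m^2
var ag = 9.81; // Acceleration due to gravity, m / s^2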
</pre>
<p>We set up the frame rate and plug in some physics values. We also create a ball object that stores the ball&#8217;s position, velocity, mass, radius, and a number called &#8220;restitution&#8221;. You&#8217;ll see later that this value determines how bouncy the ball is.</p>
<p>Notice here that we&#8217;ve set the ball to be moving at the start of the simulation.</p>
<p>The best part of programming real physics is the fact that you can look up the density of water and replace the value for rho, and the ball will actually behave as if it&#8217;s in water! If you program the physics correctly, then all you have to do is change the constants and the rest follows. Want the ball to be on the moon? Just change the acceleration due to gravity. Want the ball to swim through water? Just change the density rho. Want a light, floaty beach ball? Lower the mass and increase the radius.</p>
<p><em>Please play with these values in the JSFiddle below. Change rho and the radius and the mass, and see how physics affects the simulation!</em></p>
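<p>One quick way to preview what a constant does, before running anything, is to work out the terminal velocity: the speed at which drag exactly balances gravity. The sketch below is an illustration only. The function name terminalVelocity is my own, the constants are copied from the snippet above, and 1000 kg / m^3 is used as the density of water. Like the simulation, it ignores buoyancy.</p>
<pre class="code">
// Terminal velocity: the speed where drag balances gravity,
// mass * ag = 0.5 * Cd * A * rho * v * v, solved for v.
function terminalVelocity(mass, ag, Cd, A, rho) {
	return Math.sqrt((2 * mass * ag) / (Cd * A * rho));
}

var radius = 15; // cm, same as the ball above
var A = Math.PI * radius * radius / 10000; // cm^2 converted to m^2

var inAir = terminalVelocity(0.1, 9.81, 0.47, A, 1.22);   // rho of air
var inWater = terminalVelocity(0.1, 9.81, 0.47, A, 1000); // rho of water
console.log(inAir.toFixed(2) + " m/s in air, " + inWater.toFixed(2) + " m/s in water");
</pre>
<p>With these numbers the ball tops out at roughly 7 m / s in air and about a quarter of a meter per second in water, so you should expect the water version to sink very slowly.</p>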
<pre class="code">
var setup = function() {
	canvas = document.getElementById("canvas");
	ctx = canvas.getContext("2d");
	
	canvas.onmousemove = getMousePosition;
	canvas.onmousedown = mouseDown;
	canvas.onmouseup = mouseUp;
	
	ctx.fillStyle = 'red';
	ctx.strokeStyle = '#000000';
	loopTimer = setInterval(loop, frameDelay);
}
</pre>
<p>The setup function initializes the canvas and sets up a loop that executes every frameDelay milliseconds. We&#8217;ll do all the physics and animation in the loop function.</p>
<p>In the loop:</p>
<pre class="code">
// Do physics
// Drag force: Fd = -1/2 * Cd * A * rho * v * v
var Fx = -0.5 * Cd * A * rho * ball.velocity.x * ball.velocity.x * ball.velocity.x / Math.abs(ball.velocity.x);
var Fy = -0.5 * Cd * A * rho * ball.velocity.y * ball.velocity.y * ball.velocity.y / Math.abs(ball.velocity.y);
		
Fx = (isNaN(Fx) ? 0 : Fx);
Fy = (isNaN(Fy) ? 0 : Fy);
		
// Calculate acceleration ( F = ma )
var ax = Fx / ball.mass;
var ay = ag + (Fy / ball.mass);

// Integrate to get velocity
ball.velocity.x += ax*frameRate;
ball.velocity.y += ay*frameRate;
		
// Integrate to get position
ball.position.x += ball.velocity.x*frameRate*100;
ball.position.y += ball.velocity.y*frameRate*100;
</pre>
<p>First off, we calculate the drag forces on the ball. There&#8217;s a little trick I used to get the direction of the velocity. Instead of using &#8220;if&#8221; statements to see if the velocity is positive or negative, I just do:</p>
<pre class="code">
ball.velocity.y / Math.abs(ball.velocity.y)
</pre>
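<p>at the end of the drag force statements. Dividing a number by its absolute value just leaves the sign. The one catch is a velocity of exactly zero: 0 / 0 is NaN in Javascript, which is why the two isNaN lines reset the force to 0 in that case. Other than that, the drag force lines are pretty straightforward. We&#8217;re just calculating the forces.</p>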
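<p>If you would rather not divide at all, multiplying the velocity by its own absolute value gives the same signed square. This is an alternative sketch, not the code used in the fiddle, and the helper name dragForce is made up for illustration:</p>
<pre class="code">
// Quadratic drag along one axis, always pointing against the motion.
// v * Math.abs(v) has the magnitude of v * v and the sign of v,
// and it is simply 0 when v is 0, so there is no NaN to clean up.
function dragForce(Cd, A, rho, v) {
	return -0.5 * Cd * A * rho * v * Math.abs(v);
}

console.log(dragForce(0.47, 0.0707, 1.22, 10));  // negative: pushes against rightward motion
console.log(dragForce(0.47, 0.0707, 1.22, -10)); // positive: pushes against leftward motion
</pre>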
<p>After that, we calculate acceleration. Notice that the statement for &#8220;ay&#8221; is different from &#8220;ax&#8221;. Gravity only works in the Y direction, so we add that in here. Also notice that in this problem, &#8220;downwards&#8221; is positive, unlike the example at the top of the page.</p>
<p>After that, we update the ball velocities with the acceleration times the frame rate. (Despite its name, the frameRate variable holds the length of one frame in seconds, 1/40.) We multiply by it because the acceleration is given in &#8220;meters per <em>second</em>-squared&#8221;, but we&#8217;re calling this loop 40 times a second (not once a second), so each pass through the loop should only add 1/40 of a second&#8217;s worth of velocity change.</p>
<p>Finally, update the ball positions in a similar fashion. In this case we&#8217;re also multiplying by 100. If you look at the ball object definition way above you&#8217;ll see I commented that &#8220;1px = 1cm&#8221;. The velocity is in meters per second but the position is in pixels (centimeters), so this *100 just converts the distance moved from meters to centimeters.</p>
<p>Then we handle collisions with the walls:</p>
<pre class="code">
// Handle collisions
if (ball.position.y > height - ball.radius) {
	ball.velocity.y *= ball.restitution;
	ball.position.y = height - ball.radius;
}
if (ball.position.x > width - ball.radius) {
	ball.velocity.x *= ball.restitution;
	ball.position.x = width - ball.radius;
}
if (ball.position.x < ball.radius) {
	ball.velocity.x *= ball.restitution;
	ball.position.x = ball.radius;
}
</pre>
<p>We're just checking to see if the ball has ended up past the wall in this frame. If it has, then we multiply the velocity in that direction by the restitution coefficient from above. Since that number is always negative, it'll make the ball reverse direction. If you set the restitution to -1, it'll be perfectly bouncy, meaning it'll bounce up as high as it started falling from. If you set the restitution to 0, it'll flop dead on the ground with no bounce whatsoever. And if you set it to something like -2, it'll bounce even higher than it started. Play with it!</p>
<p>We also modify the position of the ball to just kiss the wall -- this way the ball won't get stuck "in" the wall. Keep in mind that the ball is moving in discrete motions, and so when it collides with the wall it's actually overlapping slightly.</p>
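<p>To put a number on "how bouncy", here is a small sketch of the rebound height for a ball dropped from rest. It ignores drag, which is an assumption: the simulation above does include drag, so its bounces come out a little lower. The function name reboundHeight is made up for illustration.</p>
<pre class="code">
// Rebound height for a drop from rest, ignoring drag.
function reboundHeight(dropHeight, restitution, ag) {
	var impactSpeed = Math.sqrt(2 * ag * dropHeight);       // speed just before the bounce
	var reboundSpeed = Math.abs(restitution) * impactSpeed; // speed just after the bounce
	return (reboundSpeed * reboundSpeed) / (2 * ag);        // height that speed can reach
}

console.log(reboundHeight(1, -0.7, 9.81)); // about 0.49 of the drop height
console.log(reboundHeight(1, -1, 9.81));   // about 1: as high as it started
console.log(reboundHeight(1, -2, 9.81));   // about 4: higher than it started
</pre>
<p>The height scales with the square of the restitution, so -0.7 gives back a little under half of the height on each bounce.</p>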
<p>Finally, since we want to be able to control the ball with the mouse, we'll add some handlers (not all code shown here):</p>
<pre class="code">
var mouseDown = function(e) {
	if (e.which == 1) {
		getMousePosition(e);
		mouse.isDown = true;
		ball.position.x = mouse.x;
		ball.position.y = mouse.y;
	}
}
var mouseUp = function(e) { 
	if (e.which == 1) {
		mouse.isDown = false;
		ball.velocity.y = (ball.position.y - mouse.y) / 10;
		ball.velocity.x = (ball.position.x - mouse.x) / 10;
	}
}
</pre>
<p>If you click and drag the mouse, and let go, you'll create a kind of slingshot effect. This is achieved not by applying a force to the ball (which you could do), but rather by overriding the velocity of the ball based on how far you pulled the mouse. I like this approach better because it's easier to apply than a force. If you were to use a force to move the ball with the mouse, you'd have to apply the force over a period of time. The "initial velocity" technique above just lets you unnaturally override the velocity in an instant and let physics figure everything out again.</p>
<p>I hope you learned something! Please fork and play with the code in the fiddle below. Click and drag the mouse to slingshot the ball.</p>
<p>Note that certain configurations of variables will cause the simulation to become unstable. Try setting rho = 1000 but leaving the mass at 0.1. The ball should go haywire and blink around the screen. This isn't a problem with the physics; we're just not running at a high enough frame rate for the very large drag forces. To solve that problem we would have to increase the frame rate to make the simulation stable again. Or we could use a different solver (we're using Euler's method here, a first-order ODE solver) -- but we won't talk about solvers for a few weeks.</p>
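<p>Here is a rough sketch of why that happens. It only looks at drag along one axis and leaves gravity out, and the names dragStep and unstableSpeed are made up for illustration. In one Euler step the drag changes the velocity by (k / mass) * v * |v| * dt, where k = 0.5 * Cd * A * rho. If that change is more than twice the current speed, the ball comes out of the step moving faster than it went in, just in the opposite direction, and it gets worse on every frame.</p>
<pre class="code">
// One Euler step of drag-only motion along one axis (gravity left out for clarity).
function dragStep(v, k, mass, dt) {
	return v - (k / mass) * v * Math.abs(v) * dt;
}

// Speed above which a single step overshoots zero and comes out faster than it went in.
function unstableSpeed(k, mass, dt) {
	return (2 * mass) / (k * dt);
}

var A = Math.PI * 15 * 15 / 10000; // same frontal area as the ball above, in m^2
var dt = 1 / 40;
var kAir = 0.5 * 0.47 * A * 1.22;
var kWater = 0.5 * 0.47 * A * 1000;

console.log(unstableSpeed(kAir, 0.1, dt));   // hundreds of m / s: never reached here
console.log(unstableSpeed(kWater, 0.1, dt)); // under 1 m / s: easily reached
console.log(dragStep(2, kWater, 0.1, dt));   // negative and much larger in size than 2
</pre>
<p>Since the threshold is proportional to 1 / dt, running ten times as many frames per second raises it tenfold, which is the "increase the frame rate" fix mentioned above.</p>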
<p><iframe style="width: 100%; height: 500px; border:1px solid #CCC;" src="http://jsfiddle.net/bkanber/39jrM/embedded/" allowfullscreen="allowfullscreen" frameborder="0"></iframe></p>
<p>The post <a href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/">Modeling Physics in Javascript: Gravity and Drag</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Modeling Physics in Javascript: Introduction</title>
		<link>http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/</link>
		<comments>http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/#comments</comments>
		<pubDate>Wed, 05 Sep 2012 22:02:40 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Modeling Physics]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=84</guid>
		<description><![CDATA[<p>Physics in Javascript Table of Contents: Gravity and Drag Spring-Mass-Damper (Car Suspension, Part 1) Rigid Bodies, Part 1 (Pendulum Clock) Another post that starts with &#8220;I love&#8221;: I love physics. I really do. Once upon a time I was a real engineer (Master in Mechanical Engineering, hybrid car design, control systems, sustainable energy, etc etc...</p><p>The post <a href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/">Modeling Physics in Javascript: Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>Physics in Javascript Table of Contents:</p>
<ul>
<li><a title="Modeling Physics in Javascript: Gravity and Drag" href="http://burakkanber.com/blog/modeling-physics-javascript-gravity-and-drag/" target="_blank">Gravity and Drag</a></li>
<li><a title="Physics in Javascript: Car Suspension – Part 1 (Spring, Mass, Damper)" href="http://burakkanber.com/blog/physics-in-javascript-car-suspension-part-1-spring-mass-damper/" target="_blank">Spring-Mass-Damper (Car Suspension, Part 1)</a></li>
<li><a href="http://burakkanber.com/blog/physics-in-javascript-rigid-bodies-part-1-pendulum-clock/" title="Physics in Javascript: Rigid Bodies — Part 1 (Pendulum Clock)" target="_blank">Rigid Bodies, Part 1 (Pendulum Clock)</a></li>
</ul>
<p>Another post that starts with &#8220;I love&#8221;:</p>
<p>I love physics. I really do. Once upon a time I was a real engineer (Master in Mechanical Engineering, hybrid car design, control systems, sustainable energy, etc etc etc) but decided to go back to the software world. Regardless, I still can&#8217;t get physics out of my head. I read textbooks on quantum mechanics regularly. It&#8217;s a sickness.</p>
<p>The great thing about physics is that it&#8217;s not too hard. Well, maybe it is, but once you understand a handful of core concepts, you can do tons of problems. If you can figure out how a ball moves under the influence of gravity, you can also model the motion of a car&#8217;s suspension. While that sounds like a big stretch, it&#8217;s really not: it&#8217;s all tied together by a few basic concepts.</p>
<p>I also enjoy Javascript. I&#8217;m not a graphics programmer, and have no knowledge of OpenGL or WebGL, but I like Javascript especially because it controls the fancy HTML5 canvas element.</p>
<p>And, if you can combine your rudimentary knowledge of physics with your rudimentary knowledge of Javascript, you can build <strong>really cool physics models and simulations.</strong></p>
<p>With this series, I&#8217;m going to teach you how to model physics. We&#8217;ll use Javascript to do the math and draw the animations to the canvas.</p>
<p>We&#8217;ll start simple, with a bouncy ball and gravity, and we&#8217;ll eventually move on to modeling full systems. It&#8217;s going to be fun. While it&#8217;ll start slow, I want to end up modeling things like bipedal robots within a couple dozen posts. And you&#8217;ll be learning physics in the process.</p>
<p>Buckle up. Here&#8217;s an example of a grandfather clock that actually simulates <em>real physics </em>for the swinging pendulum &#8212; just to whet your appetite:</p>
<p>(PS: the physics in the below example is a little bit fudged, because it was a code-debugging example I used for a class I used to teach; please wait for the full article on the pendulum clock before trying to follow the physics!)</p>
<p><iframe style="width: 100%; height: 700px; border: 1px solid #CCC;" src="http://jsfiddle.net/bkanber/uvYw5/embedded/result,js,html,css/" frameborder="0" width="320" height="240"></iframe></p>
<p>The post <a href="http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/">Modeling Physics in Javascript: Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/modeling-physics-in-javascript-introduction/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Using Gmail and IMAP to create an incoming email system</title>
		<link>http://burakkanber.com/blog/using-gmail-and-imap/</link>
		<comments>http://burakkanber.com/blog/using-gmail-and-imap/#comments</comments>
		<pubDate>Wed, 05 Sep 2012 03:35:50 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=70</guid>
		<description><![CDATA[<p>Today I just want to show off a fun little project I&#8217;m integrating into the Tidal dashboard. This article is applicable to anyone who wants to process a bunch of incoming emails &#8212; potentially addressed to different recipients &#8212; and route them either into user dashboards or support buckets or something similar. Running your own...</p><p>The post <a href="http://burakkanber.com/blog/using-gmail-and-imap/">Using Gmail and IMAP to create an incoming email system</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>Today I just want to show off a fun little project I&#8217;m integrating into the Tidal dashboard. This article is applicable to anyone who wants to process a bunch of incoming emails &#8212; potentially addressed to different recipients &#8212; and route them either into user dashboards or support buckets or something similar.</p>
<p>Running your own email servers is pretty annoying. Outgoing email is a nightmare if you want to show up anywhere outside of the spam folder. Incoming email is a nightmare if you want a highly-available and reliable inbox.</p>
<p>Fortunately, the outbound email problem is solved by providers like SendGrid (my personal favorite). And incoming email: why not just use Gmail? I can&#8217;t tell you the number of people I&#8217;ve interviewed with this very question, and <em>none </em>of them have ever suggested using IMAP to interface with Gmail to receive emails.</p>
<p>And my specific setup goes a step further. I don&#8217;t want to just grab incoming emails; I want to grab incoming emails that are intended for an unlimited number of different people. I happen to be routing the emails to different dashboards, but that&#8217;s not necessarily what you&#8217;re doing. You may want to forward emails or do some algorithmic processing on them or something.</p>
<p>There are two aspects of Gmail you can leverage to use one account to handle multiple recipients. Choose which one works best for you.</p>
<ul>
<li>I use a Gmail catch-all account for my project. The account is &#8220;support@domain.com&#8221; but any email with a nonexistent recipient will end up there. That means that &#8220;client-name@domain.com&#8221; and &#8220;other-client@domain.com&#8221; will both end up in the support catch-all account. This is a great approach if you have clients that require something a little more branded and whitelabeled.</li>
<li>If you have thousands of casual users, you can always use the plus sign trick. If you have an account &#8220;incoming@domain.com&#8221;, you can just tack on some kind of user identifier to the mailbox name: &#8220;incoming+2912390@domain.com&#8221;, &#8220;incoming+bobjones@domain.com&#8221;, or whatever floats your boat.</li>
</ul>
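<p>Whichever option you pick, your application eventually has to turn a recipient address into &#8220;who is this for?&#8221;. Here is a minimal sketch of that step, written in Javascript to match the rest of this blog (the same few string operations work in PHP). The helper name parseRecipient is hypothetical, and the addresses are the made-up ones from the list above.</p>
<pre class="code">
// Split a recipient address into the mailbox, the optional "+tag", and the domain.
function parseRecipient(address) {
	var at = address.lastIndexOf("@");
	if (at < 1) {
		return null; // not something we can route
	}
	var local = address.slice(0, at).toLowerCase();
	var domain = address.slice(at + 1).toLowerCase();
	var plus = local.indexOf("+");
	return {
		mailbox: plus === -1 ? local : local.slice(0, plus),
		tag: plus === -1 ? null : local.slice(plus + 1),
		domain: domain
	};
}

console.log(parseRecipient("incoming+bobjones@domain.com"));
// mailbox "incoming", tag "bobjones": plus sign trick, route by tag
console.log(parseRecipient("client-name@domain.com"));
// mailbox "client-name", no tag: catch-all style, route by mailbox
</pre>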
<p>Either technique will give you the desired result of being able to handle multiple destination addresses with one mailbox. Which is great, because we&#8217;ll now use the PHP IMAP extension to grab those emails!</p>
<p>I should note here that you can always just make separate Gmail inboxes to handle your various accounts. I like the approaches above because you don&#8217;t have to create a new inbox every time you get a new client, but if your client base is slow-growing or you can&#8217;t use the methods above, making a separate inbox will certainly work.</p>
<p>The IMAP extension is stupidly simple to use, and I&#8217;m surprised I had never toyed with it before. I&#8217;m happy this project came up, because it was super fun to learn.</p>
<p>The one caveat is that the IMAP extension generally doesn&#8217;t ship with PHP. On Linux systems you can use your package manager to install it pretty easily, and on Mac all you have to do is compile the IMAP module separately (no need to recompile all of PHP) and install it into your modules directory.</p>
<pre class="code">
$imap = imap_open(
	IMAP_SERVER,
	IMAP_LOGIN,
	IMAP_PASSWORD
);
</pre>
<p>Server strings are the trickiest part of the above, and they look something like this:</p>
<pre class="code">
{imap.gmail.com:993/ssl}INBOX
</pre>
<p>Check out the <a href="http://www.php.net/manual/en/function.imap-open.php">PHP imap_open documentation</a> for details on the server string. It includes the server address, port, and options (the /ssl is <em>not</em> a path). You can also specify the default mailbox to connect to.</p>
<pre class="code">
$emails = imap_search($imap, 'ALL', SE_UID);
</pre>
<p>The imap_search function is the best way to loop through emails in a mailbox. The function returns a simple array of email IDs. The third parameter above, SE_UID (part of a bitmask), specifies that I want the unique email ID &#8212; otherwise it&#8217;ll return the regular mail ID. The distinction is this: every email in the account has a unique UID, but each one also has a &#8220;mail ID&#8221;, which is just its position in the mailbox, as in &#8220;email # X of 1,200&#8221; (where X is the mail ID). That position can shift when other emails are deleted. I feel safer using the UID, since you can store the ID value in your database and rest assured that it will always reference a single email.</p>
<p>The second parameter above is amazing! Values can be things like &#8216;ALL&#8217;, &#8216;UNSEEN&#8217;, &#8216;UNANSWERED&#8217;, &#8216;UNSEEN FROM mom@website.com&#8217;, and all sorts of great, human-readable filters like that. Again, <a href="http://www.php.net/manual/en/function.imap-search.php">check out the imap_search documentation</a> for details.</p>
<p>The $emails variable now holds an array of UIDs. Keep in mind that IMAP uses the plain-ol&#8217; ID by default, so if you decide you want to use UIDs you need to specify that in every imap_* function call.</p>
<pre class="code">
foreach ($emails as $uid)
{
	$overview  = imap_fetch_overview($imap, $uid, FT_UID);
	$headers   = imap_fetchbody($imap, $uid, 0, FT_UID);
	$plaintext = imap_fetchbody($imap, $uid, 1, FT_UID);
	$html      = imap_fetchbody($imap, $uid, 2, FT_UID);
}
</pre>
<p>Some important things to note:</p>
<p>The overview function can actually take a comma-separated list of IDs for the second argument. If you give it a list, it will return the overviews for that entire list of emails. This is great if you just want to get a quick glimpse at the subjects and from fields for the whole inbox &#8212; no need to download the whole message. This is one of the core philosophies of IMAP.</p>
<p>Note that when you call imap_fetchbody, the message <em>will be marked as seen</em> (or &#8220;read&#8221; in Gmail terminology). To avoid this, use the FT_PEEK constant in the bitmask:</p>
<pre class="code">
$plaintext = imap_fetchbody($imap, $uid, 1, FT_UID | FT_PEEK);
</pre>
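<p>Finally, $headers, $plaintext, and $html all use the imap_fetchbody function with the third parameter as either 0, 1, or 2. Which part numbers exist depends on the MIME structure of each message (a plain-text-only email won&#8217;t have an HTML part, for example), so I don&#8217;t believe any of these are guaranteed to exist. To be safe, check what comes back before using it; the imap_fetchstructure function will tell you which parts a message actually has.</p>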
<p>If your application should sort emails by &#8220;to&#8221; address, use the overview to check out the &#8220;to&#8221; field. If your application makes use of CCs, however, you&#8217;ll need to look at the entire header section. The documentation says that the overview function is <em>supposed</em> to return multiple &#8220;to&#8221; and &#8220;cc&#8221; recipients, but my PHP 5.3.16 extension doesn&#8217;t do that. It may be a bug, or it may just be a 5.4 feature &#8212; I haven&#8217;t looked into it that deeply. While it&#8217;s slightly annoying that the overview function isn&#8217;t doing that properly, it&#8217;s also not that hard to parse the headers.</p>
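<p>To show what &#8220;parse the headers&#8221; can look like, here is a naive sketch, again in Javascript to match the rest of this blog. It assumes you have the raw header text as one string (like $headers above). The function name recipientsFromHeaders is made up, and the approach is deliberately simple: unfold wrapped lines, then grab anything that looks like user@host. That is enough for routing, but it is not a full RFC-compliant parser.</p>
<pre class="code">
// Pull the addresses out of the To and Cc headers of a raw header block.
function recipientsFromHeaders(rawHeaders) {
	// A header line that starts with a space or tab continues the previous line
	var unfolded = rawHeaders.replace(/\r?\n[ \t]+/g, " ");
	var result = {to: [], cc: []};
	unfolded.split(/\r?\n/).forEach(function(line) {
		var match = /^(to|cc):(.*)$/i.exec(line);
		if (match) {
			var field = match[1].toLowerCase();
			var found = match[2].match(/[^\s,;<>"]+@[^\s,;<>"]+/g) || [];
			result[field] = result[field].concat(found);
		}
	});
	return result;
}

var sample = "Subject: Hello\r\n" +
	"To: client-name@domain.com,\r\n" +
	"\tother-client@domain.com\r\n" +
	"Cc: support@domain.com\r\n";
console.log(recipientsFromHeaders(sample));
// to: both client addresses, cc: the support address
</pre>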
<p>I&#8217;m very glad I got to play with this, and I&#8217;m happy to say that this will be appearing in the next release of Tidal&#8217;s dashboard in production. <3 IMAP.</p>
<p>The post <a href="http://burakkanber.com/blog/using-gmail-and-imap/">Using Gmail and IMAP to create an incoming email system</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/using-gmail-and-imap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Machine Learning: Introduction to Genetic Algorithms</title>
		<link>http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/</link>
		<comments>http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/#comments</comments>
		<pubDate>Tue, 04 Sep 2012 17:05:43 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=41</guid>
		<description><![CDATA[<p>I like starting my machine learning classes with genetic algorithms (which we'll abbreviate "GA" sometimes). Genetic algorithms are probably the least practical of the ML algorithms I cover, but I love starting with them because they're fascinating and they do a good job of introducing the "cost function" or "error function", and the idea of local and global optima -- concepts both important and common to most other ML algorithms.</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/">Machine Learning: Introduction to Genetic Algorithms</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em>The <a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/">Introduction to &#8220;Machine Learning in Javascript&#8221;</a> post provides a nice introduction and context for this post and the rest of the series.</em></p>
<p>I like starting my machine learning classes with genetic algorithms (which we&#8217;ll abbreviate &#8220;GA&#8221; sometimes). Genetic algorithms are probably the least practical of the ML algorithms I cover, but I love starting with them because they&#8217;re fascinating and they do a good job of introducing the &#8220;cost function&#8221; or &#8220;error function&#8221;, and the idea of local and global optima &#8212; concepts both important and common to most other ML algorithms.</p>
<p>Genetic algorithms are inspired by nature and evolution, which is seriously cool to me. It&#8217;s no surprise, either, that artificial neural networks (&#8220;NN&#8221;) are also modeled from biology: evolution is the best general-purpose learning algorithm we&#8217;ve experienced, and the brain is the best general-purpose problem solver we know. These are two very important pieces of our biological existence, and also two rapidly growing fields of artificial intelligence and machine learning study. While I&#8217;m tempted to talk more about the distinction I make between the GA&#8217;s &#8220;learning algorithm&#8221; and the NN&#8217;s &#8220;problem solver&#8221; terminology, we&#8217;ll drop the topic of NNs altogether and concentrate on GAs&#8230; for now.</p>
<p>One phrase I used above is profoundly important: &#8220;general-purpose&#8221;. For almost any specific computational problem, you can probably find an algorithm that solves it more efficiently than a GA. But that&#8217;s not the point of this exercise, and it&#8217;s also not the point of GAs. You use the GA not when you have a complex problem, but when you have a complex problem of problems. Or you may use it when you have a complicated set of disparate parameters.</p>
<p>One application that comes to mind is bipedal robot walking. It&#8217;s damn hard to make robots walk on two legs. Hand-coding a walking routine will almost certainly fail. Even if you succeed in making a robot walk, the next robot that comes off the line might have a slightly different center of balance, and that algorithm you slaved over no longer works. Instead of enduring the inevitable heartbreak, you might use a GA to &#8220;teach the robot to learn to walk&#8221; rather than simply &#8220;teaching the robot to walk&#8221;.</p>
<p>Let&#8217;s build a GA in Javascript.</p>
<h3>The Problem</h3>
<p>Build a genetic algorithm in Javascript that reproduces the text &#8220;Hello, World!&#8221;.</p>
<p>Naturally, everything starts with &#8220;Hello, World!&#8221; and so building a GA to reproduce that phrase is apropos. Note that this problem is highly contrived. At one point we&#8217;re even going to type the phrase &#8220;Hello, World!&#8221; into the source code! Now that seems silly &#8212; if you know the desired result, why program the algorithm in the first place? The answer is simple: this is a learning exercise. The next GA exercise (which will be in PHP) will be a little less contrived, but we need to start somewhere.</p>
<h3>Genetic Algorithm Basics</h3>
<p>The basic approach to GAs is to generate a bunch of &#8220;answer candidates&#8221; and use some sort of feedback to figure out how close the candidate is to optimal. Far-from-optimal candidates literally die and are never seen again. Close-to-optimal candidates combine with each other and maybe mutate slightly; this is an attempt to modify the candidates from time to time and see if they get closer to optimal or farther from optimal.</p>
<p>These &#8220;answer candidates&#8221; are called <strong><s>genes</s> chromosomes</strong>. (<em>Note: my terminology here was incorrect. Genes are technically the individual characters of the solution candidate, while the whole thing is called a chromosome. Semantics are important!</em>)</p>
<p>Chromosomes mate, produce offspring, and mutate. They either die due to survival of the fittest, or are allowed to produce offspring who may have more desirable traits and adhere to natural selection.</p>
<p>This may be a strange way to think about solving for &#8220;Hello, World!&#8221; but stick with it. This example isn&#8217;t the only problem that can be solved with GAs!</p>
<h3>The Chromosome</h3>
<p>The chromosome is a representation of a solution candidate. In our case, the chromosome itself is a string. Let&#8217;s assume that all chromosomes have a length of 13 characters (the same as &#8220;Hello, World!&#8221;). Here are some possible chromosomes that could be solution candidates for our problem:</p>
<ul>
<li>Gekmo+ xosmd!</li>
<li>Gekln, worle&#8221;</li>
<li>Fello, wosld!</li>
<li>Gello, wprld!</li>
<li>Hello, world!</li>
</ul>
<p>Obviously that last one is the &#8220;correct&#8221; (or globally-optimum) chromosome. But how do we measure the optimality of a chromosome?</p>
<h4>Cost Function</h4>
<p>The cost function (or error function, or fitness function as the inverse) is some sort of measure of the optimality of a chromosome. If we&#8217;re calling it &#8220;fitness function&#8221; then we&#8217;re shooting for higher scores, and if we&#8217;re using &#8220;cost function&#8221; then we&#8217;re looking for low scores.</p>
<p>In this case, we might define a cost function to be something like the following:</p>
<blockquote><p>For each character in the string, figure out the difference in ASCII representation between the candidate character and the target character, and then square it so that the &#8220;cost&#8221; is always positive. Add up the results for all the characters to get the total cost of the chromosome.</p></blockquote>
<p>For example, if we have a capital &#8220;A&#8221; (ASCII 65) but it&#8217;s supposed to be a capital &#8220;C&#8221; (ASCII 67), then our cost for that character is 4 (67 &#8211; 65 = 2, and 2^2 = 4).</p>
<p>Again, the reason we&#8217;re using the square of the difference is so that we never end up with a negative cost. You could just use absolute value if you want, too. Please experiment with different approaches &#8212; that&#8217;s how you learn!</p>
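<p>Using that rule as a cost function, we can calculate the costs of the above 5 example chromosomes (in parentheses). Note that these examples use &#8220;Hello, world!&#8221; with a lowercase &#8220;w&#8221; as the target:</p>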
<ul>
<li>Gekmo+ xosmd! <strong>(7)</strong></li>
<li>Gekln, worle&#8221; <strong>(5)</strong></li>
<li>Fello, wosld! <strong>(5)</strong></li>
<li>Gello, wprld! <strong>(2)</strong></li>
<li>Hello, world! <strong>(0)</strong></li>
</ul>
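<p>As a quick check of those numbers, here is a minimal sketch of the cost rule described above. The function name cost is my own choice for illustration, and the target uses a lowercase &#8220;w&#8221; because that is what the example chromosomes are scored against.</p>
<pre class="code">
// Sum of squared differences between character codes.
// Assumes both strings have the same length.
function cost(candidate, target) {
	var total = 0;
	for (var i = 0; i < target.length; i++) {
		var diff = candidate.charCodeAt(i) - target.charCodeAt(i);
		total += diff * diff;
	}
	return total;
}

var target = "Hello, world!"; // lowercase "w", to match the examples above
console.log(cost("Gekmo+ xosmd!", target)); // 7
console.log(cost("Fello, wosld!", target)); // 5
console.log(cost("Gello, wprld!", target)); // 2
console.log(cost("Hello, world!", target)); // 0
</pre>
<p>Notice that &#8220;Fello, wosld!&#8221; only has two wrong characters but costs 5: the &#8220;F&#8221; is two letters away from &#8220;H&#8221;, and squaring turns that into 4. Squaring punishes big misses more than small ones, which is one reason you might prefer it over the absolute value.</p>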
<p>In this case, since this problem is easy and contrived, we know that we&#8217;re shooting for a cost of 0 and that we can stop there. Sometimes that&#8217;s not the case. Sometimes you&#8217;re just looking for the lowest cost you can find, and need to figure out different ways to end the calculation. Other times you&#8217;re looking for the <em>highest</em> &#8220;fitness score&#8221; you can find, and similarly need to figure out some other criteria to use to stop the calculation.</p>
<p>The cost function is a very important aspect of GAs, because if you&#8217;re clever enough, you can use it to reconcile <em>completely disparate parameters.</em> In our case, we&#8217;re just looking at letters. But what if you&#8217;re building a driving directions app and need to weigh tolls vs distance vs speed vs traffic lights vs bad neighborhoods vs bridges? Those are completely disparate parameters that you can reduce into one, neat, tidy cost function for a route by applying different weights to each parameter.</p>
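<p>Here is a rough sketch of what such a weighted cost function could look like. Everything in it is invented for illustration: the field names, the weights, and the two routes. In a real app, picking the weights is the hard part.</p>
<pre class="code">
// Hypothetical route cost: every parameter is scaled by a weight and added up.
var weights = {tolls: 2.0, miles: 1.0, minutes: 0.5, trafficLights: 0.25};

function routeCost(route, weights) {
	var total = 0;
	for (var key in weights) {
		if (weights.hasOwnProperty(key)) {
			total += weights[key] * (route[key] || 0);
		}
	}
	return total;
}

var highway = {tolls: 3, miles: 12, minutes: 15, trafficLights: 0};
var backRoads = {tolls: 0, miles: 10, minutes: 25, trafficLights: 8};
console.log(routeCost(highway, weights));   // 25.5
console.log(routeCost(backRoads, weights)); // 24.5
</pre>
<p>With these made-up weights the back roads win by a hair, even though they take ten minutes longer. Bump the weight on minutes and the highway wins instead; that is how one number ends up encoding what you care about.</p>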
<h4>Mating and Death</h4>
<p>Mating is a fact of life, and we use it tons in GAs. Mating is a magical moment that two chromosomes in love with each other share. The technical term for mating is &#8220;crossover&#8221;, but I&#8217;ll continue calling it &#8220;mating&#8221; here, because that paints a more intuitive picture.</p>
<p>We haven&#8217;t talked about &#8220;populations&#8221; in GAs yet (we&#8217;ll get to that a bit later) but for now I&#8217;ll just say that when you run a GA, you don&#8217;t just look at one chromosome at a time. You might have a population of 20 or 100 or 5,000 going all at once. Just like in evolution, you might be inclined to have the best and strongest chromosomes of the population mate with each other, with the hope that their offspring will be even healthier than either parent.</p>
<p>Mating strings, like in our &#8220;Hello, World!&#8221; example, is pretty easy. You can pick two candidates (two strings; two chromosomes) and pick a point in the middle of the string. This point can be dead-center if you want, or randomized if you prefer. Experiment with it! Take that middle point (called a &#8220;pivot&#8221; point), and make two new chromosomes by combining the first half of one with the second half of the other and vice versa.</p>
<p>Take these two strings for example:</p>
<ul>
<li>Hello, wprld! <strong>(1)</strong></li>
<li>Iello, world! <strong>(1)</strong></li>
</ul>
<div>Cutting them in half and making two new strings from the alternating halves gives us these two new &#8220;children&#8221;:</div>
<div>
<ul>
<li>Iello, wprld! <strong>(2)</strong></li>
<li>Hello, world! <strong>(0)</strong></li>
</ul>
</div>
<p>As you can see, the offspring includes one bastard child that has the worst traits of both parents, but also an angel child that has the best traits of both.</p>
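<p>The split above can be sketched in a few lines. This is a standalone illustration, separate from the mating method shown later in the article: the crossover function and its optional pivot argument are assumptions of mine, added so you can try both the fixed and the randomized pivot.</p>

```javascript
// Single-point crossover on two equal-length strings.
// If no pivot is given, pick a random one that leaves at least
// one character on each side of the cut.
function crossover(a, b, pivot) {
        if (pivot === undefined)
                pivot = 1 + Math.floor(Math.random() * (a.length - 1));

        return [
                a.slice(0, pivot) + b.slice(pivot),
                b.slice(0, pivot) + a.slice(pivot)
        ];
}

// Fixed pivot after the comma reproduces the example above.
console.log(crossover('Hello, wprld!', 'Iello, world!', 6));

// Random pivot: the children vary from run to run.
console.log(crossover('Hello, wprld!', 'Iello, world!'));
```

<p>With the pivot at 6 you get &#8220;Hello, world!&#8221; and &#8220;Iello, wprld!&#8221;, the angel child and the bastard child from the example.</p>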
<p>Mating is how you get from one generation of genes to the next.</p>
<h4>Mutation</h4>
<p>Mating alone has a problem: in-breeding. If all you do is mate your candidates to go from generation to generation, you&#8217;ll get stuck near a &#8220;local optimum&#8221;: an answer that&#8217;s <em>pretty good</em> but not necessarily the &#8220;global optimum&#8221; (the best you can hope for).</p>
<p>(<em>I was recently informed that the above paragraph is misleading. Some readers got the impression that the most important aspect of chromosome evolution is the mating, when in actuality a GA would achieve very little if not for the combined effects of both mating and mutation. The mating helps discover more optimal solutions from already-good solutions [many problems' solutions, like our Hello World problem, can be divided into optimal sub-solutions, like "Hello" and "world" separately], but it&#8217;s the mutation that pushes the search for solutions in new directions.</em>)</p>
<p>Think of the world that these genes are living in as a physical setting. It&#8217;s really hilly with all sorts of weird peaks and valleys. There is one valley that&#8217;s the lowest of all, but there are also tons of other little valleys &#8212; while these other valleys are lower than the land directly around them, they&#8217;re still above sea-level overall. Searching for a solution is like starting a bunch of balls on the hills in <em>random places</em>. You let the balls go and they roll downhill. Eventually, the balls will get stuck in the valleys &#8212; but many of them will get stuck in the random mini-valleys that are still pretty high up the hill (the local optima). It&#8217;s your job to make sure at least one of the balls ends up in the lowest point on the whole map: the global optimum. Since the balls all start in random places, it&#8217;s hard to do this from the outset, and it&#8217;s impossible to predict which ball will get stuck where. But what you <em>can</em> do is visit a bunch of the balls <em>at random</em> and give them a kick. Maybe the kick will help, maybe it&#8217;ll hurt &#8212; but the idea here is to shake up the system a little bit to make sure things aren&#8217;t getting stuck in local optima for too long.</p>
<p>This is called <strong>mutation</strong>. It&#8217;s a completely random process by which you target an unsuspecting chromosome and blast it with just enough radiation to make one of its letters randomly change.</p>
<p>Here&#8217;s an example to illustrate. Let&#8217;s say you end up with these two chromosomes:</p>
<ul>
<li>Hfllp, worlb!</li>
<li>Hfllp, worlb!</li>
</ul>
<p>Again, a contrived example, but it happens. Your two chromosomes are exactly the same. That means that their children will be exactly the same as the parents, and no progress will ever be made. But if 1 in 100 chromosomes has one letter randomly mutated, it&#8217;s only a matter of time before chromosome #2 above turns into &#8220;Ifllp, worlb!&#8221; &#8212; and then the evolution continues because the children will finally be different from the parents once again. Mutation pushes the evolution forward.</p>
<p>How and when you mutate is up to you. Again, experiment. The code I&#8217;ll give you later has a very high mutation rate (50%), but that&#8217;s really just for demonstration. You might make it low, like 1%. My code makes only <em>one letter</em> move by <em>one ASCII code</em> but you can have yours be more radical. Experiment, test, and learn. It&#8217;s the only way.</p>
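<p>If you want to play with &#8220;more radical&#8221; mutations, here&#8217;s one possible sketch. It&#8217;s not the mutate method from my code: mutateString, maxStep, and the injectable rand argument are all invented for this example, so treat it as a starting point for your own experiments.</p>

```javascript
// Mutate one random character of a string by up to maxStep character codes, up or down.
// chance is the probability (0 to 1) that any mutation happens at all.
// rand is injectable so runs can be made repeatable; it defaults to Math.random.
function mutateString(code, chance, maxStep, rand) {
        rand = rand || Math.random;
        if (rand() > chance)
                return code;

        var index = Math.floor(rand() * code.length);
        var step = 1 + Math.floor(rand() * maxStep);      // 1..maxStep
        var direction = rand() > 0.5 ? 1 : -1;
        var newChar = String.fromCharCode(code.charCodeAt(index) + direction * step);

        return code.slice(0, index) + newChar + code.slice(index + 1);
}

// Gentle: 1% chance, one code at a time. Usually returns the string unchanged.
console.log(mutateString('Hfllp, worlb!', 0.01, 1));

// Radical: always mutates, and the chosen letter can jump by 1 to 5 codes.
console.log(mutateString('Hfllp, worlb!', 1.0, 5));
```

<p>A bigger maxStep lets a chromosome escape a bad spot faster, but it also makes it easier to overshoot a letter that was nearly right. Try a few values and watch what happens to the number of generations you need.</p>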
<h4>Chromosomes: Summary</h4>
<p>Chromosomes are representations of candidate solutions to your problem. They consist of the representation itself (in our case, a 13-character string), a cost or fitness score and function, the ability to mate (&#8220;crossover&#8221;), and the ability to mutate.</p>
<p>I like thinking about these things in OOP terms. The &#8220;Chromosome&#8221; class therefore has the following properties:</p>
<p><strong>Properties:</strong></p>
<ul>
<li>Genetic code</li>
<li>Cost/fitness score</li>
</ul>
<p><strong>Methods:</strong></p>
<ul>
<li>Mate</li>
<li>Mutate</li>
<li>Calculate Fitness Score</li>
</ul>
<p>We&#8217;ll now look at how to have chromosomes interact with each other in the final piece of the GA puzzle: the &#8220;Population&#8221;.</p>
<h3>The Population</h3>
<p>The population is a group of chromosomes. The population generally remains the same size but will typically evolve to better average cost scores over time.</p>
<p>You get to choose your population size. I picked 20 for mine below, but you could choose 10 or 100 or 10,000 if you want. There are advantages and disadvantages, but as I&#8217;ve said a few times by now: experiment and learn for yourself!</p>
<p>The population experiences &#8220;generations&#8221;. A typical generation may consist of:</p>
<ul>
<li>Calculating the cost/fitness score for each chromosome</li>
<li>Sorting the chromosomes by cost/fitness score</li>
<li>Killing a certain number of the weakest members &#8212; you pick the number of chromosomes that will die</li>
<li>Mating a certain number of the strongest members &#8212; again, you pick how you do this</li>
<li>Mutating members at random</li>
<li>Some kind of completeness test &#8212; i.e., how do you determine when to consider the problem &#8220;solved&#8221;?</li>
</ul>
<h4>Starting and Finishing</h4>
<p>Starting a population is easy. Just fill it with completely random chromosomes. In our case, the cost scores for completely random strings will be <em>horrendous.</em> My code starts at an average cost score of 30,000. But that&#8217;s ok &#8212; that&#8217;s what evolution is for. This is why we&#8217;re here.</p>
<p>Knowing when to stop the population is a little trickier. Today&#8217;s example is pretty simple: stop when you get a cost of 0. But this isn&#8217;t always the case. Sometimes you don&#8217;t know the minimum achievable cost. Or, if you&#8217;re using fitness instead of cost, you may not know the maximum possible fitness.</p>
<p>In those cases you should specify a completeness criterion. This can be anything you want, but here&#8217;s a starting suggestion to jump off from:</p>
<blockquote><p>Stop the algorithm if the best score hasn&#8217;t changed in 1,000 generations, and use that as your answer.</p></blockquote>
<p>Criteria like that may mean that you never achieve the global optimum, but in many cases you don&#8217;t <em>need</em> to achieve the global optimum. Sometimes &#8220;close enough&#8221; really is good enough.</p>
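<p>One way to sketch that stopping rule, assuming lower cost is better: keep track of the best cost seen so far and count how many generations have passed without beating it. StagnationDetector and its patience parameter are names I made up for this illustration; they don&#8217;t appear in the code below.</p>

```javascript
// Tracks the best cost seen so far and reports when it has gone
// `patience` generations in a row without improving.
function StagnationDetector(patience) {
        this.patience = patience;
        this.bestCost = Infinity;
        this.staleGenerations = 0;
}

// Call once per generation with that generation's best cost.
// Returns true when it's time to stop.
StagnationDetector.prototype.shouldStop = function(cost) {
        if (this.bestCost > cost) {
                this.bestCost = cost;
                this.staleGenerations = 0;
        } else {
                this.staleGenerations++;
        }
        return this.staleGenerations >= this.patience;
};

// The rule above uses 1,000 generations; 3 keeps the demo short.
var detector = new StagnationDetector(3);
[50, 42, 42, 40, 40, 40, 40].forEach(function(cost, generation) {
        console.log(generation, cost, detector.shouldStop(cost));
});
```

<p>In this run the detector only says &#8220;stop&#8221; on the last generation, after the best cost has sat at 40 for three generations in a row. When it fires, detector.bestCost holds the answer you settle for.</p>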
<p>I&#8217;ll soon be writing another article on GAs (for PHP this time) with a slightly different problem, and that one will have a completeness rule similar to the above. It might be hard to swallow that &#8220;close enough is good enough&#8221; right now, but once you see the example in action hopefully you&#8217;ll believe me.</p>
<h3>The Code</h3>
<p>Finally! I like OOP methods, but I also like rough and simple code, so I&#8217;ll do this as straightforwardly as possible, though it may be a little rough around the edges.</p>
<p>(<em>Note that while I changed the occurrences of the term &#8220;gene&#8221; to &#8220;chromosome&#8221; in the text above, the code below still uses the incorrect &#8220;gene&#8221; terminology. It&#8217;s a semantic and pedantic difference, but where would we be without semantic pedantics?</em>)</p>
<pre class="code">var Gene = function(code) {
        if (code)
                this.code = code;
        this.cost = 9999;
};
Gene.prototype.code = '';
Gene.prototype.random = function(length) {
        while (length--) {
                this.code += String.fromCharCode(Math.floor(Math.random()*255));
        }
};</pre>
<p>Simple. It&#8217;s just a class whose constructor takes an optional string, sets a default cost, and has a helper function to fill the chromosome with random characters.</p>
<pre class="code">Gene.prototype.calcCost = function(compareTo) {
        var total = 0;
        for (var i = 0; i &lt; this.code.length; i++) {
                total += (this.code.charCodeAt(i) - compareTo.charCodeAt(i)) * (this.code.charCodeAt(i) - compareTo.charCodeAt(i));
        }
        this.cost = total;
};</pre>
<p>The cost function takes the &#8220;model&#8221; string as an argument, finds the differences between ASCII codes, and squares them.</p>
<pre class="code">Gene.prototype.mate = function(gene) {
        var pivot = Math.round(this.code.length / 2) - 1;

        var child1 = this.code.substr(0, pivot) + gene.code.substr(pivot);
        var child2 = gene.code.substr(0, pivot) + this.code.substr(pivot);

        return [new Gene(child1), new Gene(child2)];
};</pre>
<p>The mating function takes another chromosome as an argument, finds the center point, and returns an array of two new children.</p>
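<pre class="code">Gene.prototype.mutate = function(chance) {
        if (Math.random() &gt; chance)
                return;

        var index = Math.floor(Math.random()*this.code.length);
        // Move the chosen character one code up or down
        var upOrDown = Math.random() &lt;= 0.5 ? -1 : 1;
        var newChar = String.fromCharCode(this.code.charCodeAt(index) + upOrDown);
        // Quick stand-in for a proper replaceAt: rebuild the string around the new character
        this.code = this.code.substr(0, index) + newChar + this.code.substr(index + 1);
};</pre>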
<p>The mutate method takes a float as an argument &#8212; the chance (from 0 to 1) that the chromosome will mutate. If the chromosome is to mutate we randomly decide if we&#8217;re going to add or subtract one from the randomly-selected character code. I was going too fast to write a proper String.prototype.replaceAt method, so I just took an easy shortcut there.</p>
<pre class="code">var Population = function(goal, size) {
        this.members = [];
        this.goal = goal;
        this.generationNumber = 0;
        while (size--) {
                var gene = new Gene();
                gene.random(this.goal.length);
                this.members.push(gene);
        }
};</pre>
<p>The Population class constructor takes the target string and population size as arguments, then fills the population with random chromosomes.</p>
<pre class="code">Population.prototype.sort = function() {
        this.members.sort(function(a, b) {
                return a.cost - b.cost;
        });
}</pre>
<p>I define a Population.prototype.sort method as a helper function to sort the population by their cost score.</p>
<pre class="code">Population.prototype.generation = function() {
        for (var i = 0; i &lt; this.members.length; i++) {
                this.members[i].calcCost(this.goal);    
        }

        this.sort();
        this.display();
        var children = this.members[0].mate(this.members[1]);
        this.members.splice(this.members.length - 2, 2, children[0], children[1]);

        for (var i = 0; i &lt; this.members.length; i++) {
                this.members[i].mutate(0.5);
                this.members[i].calcCost(this.goal);
                if (this.members[i].code == this.goal) { 
                        this.sort();
                        this.display();
                        return true;
                }
        }
        this.generationNumber++;
        var scope = this;
        setTimeout(function() { scope.generation(); } , 20);
};</pre>
<p>The meatiest population method is the generation method. There&#8217;s no real magic here. The display() method (not shown on this page) just renders output to the page, and I set a timeout between generations so that things don&#8217;t explode.</p>
<p>Note that in this example I&#8217;m only mating the top two chromosomes. This doesn&#8217;t have to be your approach.</p>
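<p>As one alternative, here&#8217;s a rough sketch that mates several of the strongest pairs per generation and overwrites the weakest members with the children. breedTopPairs and mateStrings are hypothetical helpers written for this example; they work on plain strings rather than the Gene class so the block stands on its own.</p>

```javascript
// One selection step over a population already sorted by cost (lowest first).
// Mates neighbours among the strongest members (1st with 2nd, 3rd with 4th, ...)
// and overwrites the weakest members with the children. Keep `pairs` small
// enough that the parents and the replaced members don't overlap.
// `mate` is any function taking two members and returning two children.
function breedTopPairs(members, pairs, mate) {
        var children = [];
        members.slice(0, pairs * 2).forEach(function(parent, i, parents) {
                if (i % 2 === 0)
                        children = children.concat(mate(parent, parents[i + 1]));
        });
        return members.slice(0, members.length - children.length).concat(children);
}

// Demo mating function: plain strings, dead-center split.
function mateStrings(a, b) {
        var pivot = Math.floor(a.length / 2);
        return [a.slice(0, pivot) + b.slice(pivot), b.slice(0, pivot) + a.slice(pivot)];
}

var sorted = ['AAAA', 'BBBB', 'CCCC', 'DDDD', 'wwww', 'xxxx', 'yyyy', 'zzzz'];
console.log(breedTopPairs(sorted, 2, mateStrings));
```

<p>Here the four strongest members survive and their four children (AABB, BBAA, CCDD, DDCC) replace the four weakest. Mating more pairs spreads good traits faster, but it also makes the population look alike sooner, so mutation matters even more.</p>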
<pre class="code">window.onload = function() {
        var population  = new Population("Hello, world!", 20);
        population.generation();
};</pre>
<p>That gets the ball rolling. See it in action:</p>
<p><iframe style="width: 100%; height: 600px; border: 1px solid #333;" src="http://jsfiddle.net/bkanber/BBxc6/embedded/" frameborder="0" width="320" height="240"></iframe></p>
<p>It&#8217;s slow &#8212; but it&#8217;s not too difficult to figure out where the inefficiencies are. If you&#8217;re clever enough you can certainly make it lightning fast. As always, I encourage you to fork and experiment and learn on your own.</p>
<p>Notice that there&#8217;s nothing in the above that can&#8217;t be done in <em>any</em> programming language. But you probably expected that, since this is &#8220;Machine Learning in All Languages&#8221;, after all.</p>
<p>Happy learning! The next post in this series is <a title="Machine Learning in JS: k-nearest-neighbor Introduction" href="http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/">ML in JS: k-nearest-neighbor</a>.</p>
<p>There&#8217;s also a <a title="Machine Learning: Genetic Algorithms in Javascript Part 2" href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/">Genetic Algorithms Part 2</a>, which you should read if you want to get a little more advanced.</p>
<p>The post <a href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/">Machine Learning: Introduction to Genetic Algorithms</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Machine Learning in Javascript: Introduction</title>
		<link>http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/</link>
		<comments>http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/#comments</comments>
		<pubDate>Mon, 03 Sep 2012 19:38:24 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=38</guid>
		<description><![CDATA[<p>I love machine learning algorithms. I&#8217;ve taught classes and seminars and given talks on ML. The subject is fascinating to me, but like all skills fascination simply isn&#8217;t enough. To get good at something, you need to practice! I also happen to be a PHP and Javascript developer. I&#8217;ve taught classes on both of these...</p><p>The post <a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/">Machine Learning in Javascript: Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>I love machine learning algorithms. I&#8217;ve taught classes and seminars and given talks on ML. The subject is fascinating to me, but like all skills, fascination simply isn&#8217;t enough. To get good at something, you need to practice!</p>
<p>I also happen to be a PHP and Javascript developer. I&#8217;ve taught classes on both of these as well &#8212; but like any decent software engineer I have experience with Ruby, Python, Perl, and C. I just prefer PHP and JS. Before you flame PHP, I&#8217;ll just say that while it has its problems, I like it because it gets stuff done.</p>
<p>Whenever I say that Tidal Labs&#8217; ML algorithms are in PHP, people look at me funny and ask me how it&#8217;s possible. Simple: it&#8217;s possible to write ML algorithms in just about any language. Most people just don&#8217;t care to learn the fundamentals strongly enough that they can write an algorithm from scratch. Instead, they rely on Python libraries to do the work for them, and end up not truly grasping what&#8217;s happening inside the black box. Other people only know ML academically, using Octave or Matlab.</p>
<p>Through this series of articles, I&#8217;ll teach you the fundamental machine learning algorithms using Javascript &#8212; not Python or Octave &#8212; as the example language. Originally I intended to write these articles in a variety of languages (PHP, JS, Perl, C, Ruby), but decided to stick with Javascript for the following reasons:</p>
<ul>
<li>If you&#8217;re a web developer you probably already know JS, regardless of your backend expertise.</li>
<li>Javascript has JSFiddle, a great tool that lets me embed executable Javascript right in my posts (hard to do that with C or Perl!)</li>
<li>Several people asked me to stick to just one language.</li>
</ul>
<p>While I&#8217;ll be writing these articles with Javascript in mind, <em>please</em> re-write the examples in your language of choice as homework! Practice is how you get better, and writing the same algorithm several times in different languages really helps you understand the paradigms better.</p>
<p>It&#8217;s possible to get excellent performance out of ML algorithms in languages like PHP and Javascript. I advocate writing ML algorithms in other languages because the practice of writing ML algorithms from scratch helps you learn them fundamentally, and it also helps you unify your backend by not requiring a Python script to do processing in the middle of a PHP application. You can do it in PHP, and cut out the (mental and computational) overhead of using another language.</p>
<p>&#8230; well, most of the time. There are some things you really can&#8217;t do in PHP or Javascript, but those are the more advanced algorithms that require heavy matrix math. While you <em>can </em>do matrix math in JS, there is a big difference between simply &#8220;doing matrix math&#8221; and doing it efficiently. The advantage of NumPy or Matlab is not in their ability to do matrix operations, it&#8217;s in the fact that they use optimized algorithms to do so &#8212; things you wouldn&#8217;t be able to do yourself unless you dedicate yourself to learning computational linear algebra. And that&#8217;s not my field, so we&#8217;ll just stick to the ML that doesn&#8217;t require matrix math. You could try brute-forcing the matrix operations, but you&#8217;ll end up with a relatively inefficient system. It&#8217;s great for learning, so I&#8217;m not discouraging it &#8212; I would just be wary of doing that in a production environment.</p>
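<p>To make &#8220;brute-forcing the matrix operations&#8221; concrete, here&#8217;s a minimal sketch of a naive matrix multiply. The multiply function is just an illustration I wrote for this point, with matrices represented as plain arrays of rows; it&#8217;s fine for learning, and it&#8217;s the kind of code I&#8217;d be wary of in production.</p>

```javascript
// Naive matrix multiplication: three nested passes, roughly n^3 multiply-adds
// for n-by-n inputs. Matrices are plain arrays of rows. Assumes the number of
// columns in a equals the number of rows in b.
function multiply(a, b) {
        return a.map(function(row) {
                return b[0].map(function(_, col) {
                        return row.reduce(function(sum, value, k) {
                                return sum + value * b[k][col];
                        }, 0);
                });
        });
}

console.log(multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]));
// [ [ 19, 22 ], [ 43, 50 ] ]
```

<p>It gives the right answer, but the work grows with the cube of the matrix size, and none of the blocking or vectorization tricks that optimized libraries rely on are in play.</p>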
<p>Keep in mind that most of the algorithms we&#8217;ll look at can be solved both with and without matrix math. We&#8217;ll use iterative or functional approaches here, but most of these algorithms can be done with linear algebra as well. There&#8217;s more than one way to skin a cat! I encourage you to also go and learn (or figure out) the linear algebra approaches, but since that&#8217;s not my strong suit I&#8217;ll use other approaches.</p>
<p>Here are some of the algorithms I intend to cover. I&#8217;ll update this list with links to the relevant articles as they&#8217;re published:</p>
<ul>
<li>k-nearest-neighbor (<a title="Machine Learning in JS: k-nearest-neighbor Introduction" href="http://burakkanber.com/blog/machine-learning-in-js-k-nearest-neighbor-part-1/">Introduction</a>)</li>
<li>k-means clustering (<a title="Machine Learning: k-Means Clustering in Javascript Part 1" href="http://burakkanber.com/blog/machine-learning-k-means-clustering-in-javascript-part-1/" target="_blank">Part 1</a>)</li>
<li>Genetic algorithms (<a href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-part-1-javascript/">Part 1</a>, <a title="Machine Learning: Genetic Algorithms in Javascript Part 2" href="http://burakkanber.com/blog/machine-learning-genetic-algorithms-in-javascript-part-2/">Part 2</a>)</li>
<li>Naive Bayes classifier (<a href="http://burakkanber.com/blog/machine-learning-naive-bayes-1/">Part 1: Document Classification</a>)</li>
<li>Neural network</li>
</ul>
<p>Happy learning!</p>
<p>The post <a href="http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/">Machine Learning in Javascript: Introduction</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/machine-learning-in-other-languages-introduction/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Staying productive while working from home, or: why we need workspaces.</title>
		<link>http://burakkanber.com/blog/staying-productive-while-working-from-home-or-why-we-need-workspaces/</link>
		<comments>http://burakkanber.com/blog/staying-productive-while-working-from-home-or-why-we-need-workspaces/#comments</comments>
		<pubDate>Wed, 20 Jun 2012 04:52:15 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=23</guid>
		<description><![CDATA[<p>I&#8217;ll put the TL;DR at the beginning of this post: do you sometimes have trouble motivating yourself to work while working from home? Two simple changes will fix that. Set up a workspace, and work there every day. Do not, ever, play or relax there. Your workspace could be a coffee shop, or it could...</p><p>The post <a href="http://burakkanber.com/blog/staying-productive-while-working-from-home-or-why-we-need-workspaces/">Staying productive while working from home, or: why we need workspaces.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll put the TL;DR at the beginning of this post: do you sometimes have trouble motivating yourself to work while working from home? Two simple changes will fix that.</p>
<ol>
<li>Set up a workspace, and work there every day. Do not, ever, play or relax there. Your workspace could be a coffee shop, or it could be the kitchen table. (Corollary: also set up a relaxation space. Don&#8217;t ever work there. This could be your entire home if your workspace is a coffee shop, or it could be &#8220;living room couch&#8221; if your workspace is the kitchen table.)</li>
<li>Set up a rigid time to physically pick yourself up and move to your workspace. If your workspace is the kitchen table, all you have to do is <em>get there</em> on time and the work will flow naturally.</li>
</ol>
<p>Here&#8217;s the whole story:</p>
<p>It&#8217;s been a pretty long time since I&#8217;ve worked out of an office. I left my IT engineering job in college and started consulting as a software engineer&#8211;no offices there. Eventually I ended up as CTO of Tidal Labs, and although we have a nice office I continue to work from home most days (I&#8217;m more productive as a builder working from home).</p>
<p>But getting motivated and staying productive can sometimes be tricky. Almost everybody I know who works from home has complained of the same problem: some days it&#8217;s just too distracting, too tempting to <em>not </em>work. It&#8217;s hard to get in the zone and it&#8217;s too easy to turn on the TV and relax.</p>
<p>I&#8217;ve heard a lot of people say &#8220;I can&#8217;t work from home, I need an office&#8221; and the like. I used to believe it was just weakness or laziness, but after watching so many people I respect succumb to the same home-office problems, I decided to keep an open mind and put some thought to the problem.</p>
<p>The answer, in short: it has nothing to do with laziness or weakness. We <em>need</em> a dedicated workspace, but that doesn&#8217;t need to be an office <em>per se</em>.</p>
<p>Here&#8217;s how I arrived at that conclusion: I decided to actively experiment with productivity techniques, and I decided to be more observant of other people and how they deal with this type of thing (which was also part of my quest to become a better manager). I started tinkering around with productivity and these observations a little over a year ago, using myself as a guinea pig.</p>
<p><em>I should stop and clarify now that this is an anecdotal case-study, not a scientific paper. The last post I wrote about cooling down a room with ice came under a lot of scrutiny for a blog post&#8211;so reader beware, this is just a damned blog post, not a study.</em></p>
<p>One of the first things I&#8217;ve noticed is that <em>lots </em>of people, even those who generally have great work ethic, have succumbed to at-home laziness. I found that strange at first, though. Doesn&#8217;t having great work ethic mean you&#8217;re good at getting to work? It also puzzled me since my father started his company working from home and he was SO productive. So I had to rule out laziness as a factor. I mean, sure, laziness is a factor in &#8220;at-home laziness&#8221;, but I&#8217;m trying to figure out why otherwise great workers get lazy at home. If you&#8217;re lazy in general, you&#8217;re probably a lost cause.</p>
<p>If it&#8217;s not laziness, then it must be something else. I, too, have seen my at-home work ethic suffer some days (especially during creative work)&#8211;and I&#8217;m definitely not lazy. And then things started falling into place: realizing that my dad worked from home but had a <em>home office</em>. Realizing that the people who were lazy at home didn&#8217;t have home offices. Realizing that working out of coffee shops always seems to boost productivity. Realizing that there&#8217;s real truth to the phrase &#8220;I need to work out of an office.&#8221;</p>
<p>It&#8217;s the location! That sounds so simple, but DUH: being at home makes you lazy.</p>
<p>Wait, isn&#8217;t that what we started this post with? Isn&#8217;t the problem about being lazy while working from home? Yes, but now it has a different meaning: your home, itself, is <em>making </em>you lazy. You&#8217;re not a lazy person and &#8220;working from home&#8221; isn&#8217;t the problem, it&#8217;s the &#8220;home&#8221; part that&#8217;s the problem.</p>
<p>Our minds compartmentalize data to physical locations. That&#8217;s part of the reason you tend to walk into a room and forget why you&#8217;re there&#8211;walking through a door actually switches your brain from &#8220;mode: living room&#8221; to &#8220;mode: bedroom&#8221; (<a href="http://www.scientificamerican.com/article.cfm?id=why-walking-through-doorway-makes-you-forget">SciAm</a>, <a href="http://www.freakonomics.com/2011/11/23/does-walking-through-doorways-cause-forgetfulness/">Freakonomics</a>). Sometimes, if you have a piece of information that was generated in &#8220;mode: living room&#8221; and then you walk into your bedroom to act on that information, you&#8217;ll forget as soon as you walk through the door. That information is not available in &#8220;mode: bedroom&#8221;.</p>
<p>I believe the same thing happens when working from home. Your home is programmed into your brain as a space of relaxation. The office is for work. People feel productive at the office because the office is programmed in their brains as a place of work, not as a place of relaxation.</p>
<p>So to overcome the &#8220;at-home laziness&#8221;, all you have to do is not actually work from your home while you&#8217;re working from home. Go to a coffee shop. If you go to a coffee shop every day and work (and don&#8217;t spend any time relaxing there), that coffee shop will eventually trigger your &#8220;work reflex&#8221; in your brain, and you&#8217;ll find that it&#8217;s easier to start working when you&#8217;re there.</p>
<p>And setting rigid times to go to your workspace also helps. The hardest part about working from home is just <em>starting </em>(ever notice that once you start, it&#8217;s really easy to keep going?). By forcing yourself to <em>be </em>at your workspace at a given time, you&#8217;re also helping yourself start work at that time. This is why offices are so effective. By forcing employees to come in at 9am you&#8217;re forcing them to just <em>be there</em>. And by forcing them to be there at the office, you&#8217;re forcing their brains into work mode.</p>
<p>All you have to do to maintain productivity is to do that on your own. Say to yourself, &#8220;I&#8217;m going to the coffee shop at 11am.&#8221; Don&#8217;t even say &#8220;I&#8217;m going to work at 11am&#8221;&#8211;throwing the word &#8220;work&#8221; in there might be overwhelming. All you have to do is physically get yourself to the coffee shop (an easy commitment that doesn&#8217;t directly involve the work you&#8217;re trying to avoid), but once you&#8217;re there you&#8217;ll magically end up in work mode.</p>
<p>The logical extension of this is to (once you have the discipline) eventually drop the coffee shop and set that up in your own home. My relaxation space is the living room couch, and my work space is my desk. I don&#8217;t ever relax at my desk. I don&#8217;t watch TV on the computer there, I don&#8217;t Facebook or Reddit there either (well maybe sometimes). When I wake up in the morning, I go to the living room couch and drink coffee and wake up for a bit. If I try to stay on the couch and work, it doesn&#8217;t happen. But if I pick myself up and move into the other room, and sit down at my work desk, BAM I&#8217;m in work mode and ready to go. I&#8217;ve been trained to work at my desk, and my productivity levels have been through the roof for the past year because of that one little change.</p>
<p>For people just starting to work at home, complaining of the at-home laziness, I do recommend making your workspace somewhere external like a coffee shop. The boundaries are more clear-cut if you have to leave the house for it, which makes it easier to start mentally setting up for your new work/relax locale separation.</p>
<p>My new developer, who&#8217;s in Seattle, recently had this very problem. He felt his productivity was low and since our offices are in NYC (hell of a commute from Seattle), I told him to work out of the coffee shop instead of trying to work from home. It&#8217;s such a simple life-hack, but the productivity boost is undeniable. He&#8217;s been feeling good about productivity lately, and the morale boost is deserved as well.</p>
<p>If you&#8217;re used to sitting at the desk and Redditing all day, of <em>course</em> you&#8217;re not going to get any work done if you try to work from that same desk. But if you set aside your desk for work and assign your couch to Reddit, I guarantee you&#8217;ll see a boost in your productivity. Just as long as you stay at the desk. Once you visit that relaxation couch, all bets are off.</p>
<p>The post <a href="http://burakkanber.com/blog/staying-productive-while-working-from-home-or-why-we-need-workspaces/">Staying productive while working from home, or: why we need workspaces.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/staying-productive-while-working-from-home-or-why-we-need-workspaces/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cooling a room with 2 liters of ice: calculation.</title>
		<link>http://burakkanber.com/blog/cooling-a-room-with-2-liters-of-ice-calculation/</link>
		<comments>http://burakkanber.com/blog/cooling-a-room-with-2-liters-of-ice-calculation/#comments</comments>
		<pubDate>Tue, 19 Jun 2012 04:38:19 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=16</guid>
		<description><![CDATA[<p>Lately, while I&#8217;ve been waiting for my new AC unit to arrive, I&#8217;ve been freezing a 2L bottle of water and putting it in front of my window fan to cool down my living room in the evening. Anecdotally, it seems to work&#8230; but I&#8217;ve been wondering if it&#8217;s actually working or if there&#8217;s something...</p><p>The post <a href="http://burakkanber.com/blog/cooling-a-room-with-2-liters-of-ice-calculation/">Cooling a room with 2 liters of ice: calculation.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>Lately, while I&#8217;ve been waiting for my new AC unit to arrive, I&#8217;ve been freezing a 2L bottle of water and putting it in front of my window fan to cool down my living room in the evening.</p>
<p>Anecdotally, it seems to work&#8230; but I&#8217;ve been wondering if it&#8217;s actually working or if there&#8217;s something else going on. Maybe it&#8217;s just the cooler evening air combined with my idleness in the evenings that&#8217;s cooling me down. Fortunately, the calculations are quick and relatively painless. Like most (ok, maybe not most, but a lot) of engineering, the difficulty is not in the arithmetic but rather the set-up of the problem, and knowing which assumptions to make and what formulae to use.</p>
<p><strong>The Set-Up</strong></p>
<p>We have a 2L bottle of water. Two liters of water has a mass of 2 kg. The whole bottle gets frozen and then placed in front of a window fan. Let&#8217;s assume that the ice, as soon as it&#8217;s placed in front of my fan, is -10 C (14 F). We&#8217;ll assume ambient temperature is 25 C (77 F) &#8212; but this ends up not mattering because I never let the water bottle warm to room temperature anyway. Instead, let&#8217;s assume that the bottle warms to only about 10 C (50 F) before I remove it from the window. Assume that the efficiency of heat transfer from the bottle to the room is 100%.</p>
<p>One final assumption: let&#8217;s assume that we don&#8217;t care that the kitchen will heat up due to extra work the refrigerator has to do to freeze the water.</p>
<p>The room itself measures (and I&#8217;m mixing units here, I know) 20&#8242; by 12&#8242; by 8&#8242;&#8211;we&#8217;ll need the mass of that volume of air though, so let&#8217;s just ask Wolfram Alpha. The site reports that my living room has 72 kg of air in it.</p>
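<p>As a rough cross-check on that figure, here is a minimal Javascript sketch that estimates the air mass straight from the room dimensions. The density of 1.2 kg per cubic meter is an assumed value for air near room temperature (it does not come from the Wolfram Alpha lookup), and the function name is only illustrative. It lands at roughly 65 kg, which is in the same ballpark as the 72 kg used in the rest of this post.</p>

```javascript
// Rough estimate of the mass of air in a rectangular room.
// ASSUMPTION: air density of about 1.2 kg/m^3 (near room temperature, sea level).
var FEET_TO_METERS = 0.3048;      // exact by definition
var AIR_DENSITY_KG_PER_M3 = 1.2;  // assumed value, not taken from the post

// Hypothetical helper: dimensions in feet in, kilograms of air out.
function airMassKg(lengthFt, widthFt, heightFt) {
  var volumeM3 = lengthFt * widthFt * heightFt * Math.pow(FEET_TO_METERS, 3);
  return volumeM3 * AIR_DENSITY_KG_PER_M3;
}

var estimate = airMassKg(20, 12, 8);
console.log(estimate.toFixed(1) + ' kg of air'); // roughly 65 kg
```

<p>For an order-of-magnitude calculation, the difference between 65 kg and 72 kg doesn&#8217;t change the conclusion.</p>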
<p>So the 2 kg bottle of ice starts at -10 C and warms to +10 C in a room of 72 kg air. The question we&#8217;re looking to answer is: how much can that block of ice cool the room? As a corollary to that (since we know that the following won&#8217;t be terribly accurate, and it won&#8217;t apply to <em>your</em> living room either), we can more generally ask: <strong>will a 2L ice block noticeably cool a room, or would it only change the temperature by a silly amount, like a tenth of a degree or less?</strong></p>
<p><strong>Heat Capacity</strong></p>
<p>Thermodynamics has a property called &#8220;heat capacity.&#8221; Heat capacity is the amount of energy required to change the temperature of a substance by one degree (in whatever scale you&#8217;re using). Slightly more useful is the &#8220;specific heat capacity&#8221;, or &#8220;specific heat&#8221; for short, which is the amount of energy required to change the temperature <em>of one unit mass</em> of a substance by one degree. That distinction is important: heat capacity is how much energy you need to change a whole block of a substance, but substances come in different shapes and sizes, so it&#8217;s more useful to use the specific heat capacity and then calculate the &#8220;overall&#8221; heat capacity if you know the mass of the thing you&#8217;re looking at.</p>
<p>Anyways, we can look up the specific heat of ice and find out how much energy it takes to bring our 2kg of ice from -10 C to 0 C. We need to do this in steps because <em>melting</em> ice also has its own energy requirement (called latent heat). Because we have that phase change in the middle, our ice block actually does three different things in sequence:</p>
<ol>
<li>warms from -10 C to 0 C (as ice the whole time),</li>
<li>melts completely while remaining at 0 C (as a mix of ice and water the whole time),</li>
<li>and then warms from 0 C to 10 C (as liquid water the whole time).</li>
</ol>
<p>The specific heat of ice at -10 C is 2.1 kJ / (kg * K) (from <a href="http://en.wikipedia.org/wiki/Heat_capacity#Table_of_specific_heat_capacities">Wikipedia</a>). Those units are &#8220;kilojoules per kilogram-degree Kelvin&#8221;&#8211;and again, that&#8217;s the amount of energy (kilojoules) it takes to warm a kilogram of ice by one degree Kelvin. Since we&#8217;re only concerned with temperature differences, we can freely substitute Kelvin with Celsius: when I write out units or equations I&#8217;ll use K, but when I&#8217;m describing the problem scenario I&#8217;ll use C in this post.</p>
<p>To calculate how much energy goes into warming the ice to 0 C, we&#8217;ll just multiply our specific heat by the temperature difference and the mass. Here&#8217;s the general formula:</p>
<p>ΔE = m * Cp * ΔT</p>
<p>Plugging in values, we get:</p>
<p>( 2.1 kJ / ( kg * K ) ) * ( 2 kg ) * ( 10 K ) = <strong>42 kJ</strong></p>
<p>(Note that the equation above is a little rearranged&#8211;I put down the numbers in this order: Cp * m * ΔT = ΔE)</p>
<p>Warming our ice from -10 C to 0 C requires 42,000 Joules of energy. That energy comes from the air in the room, and when the air in the room loses that energy, it cools down.</p>
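<p>The same step can be sketched in a few lines of Javascript. This is only an illustration of the formula ΔE = m * Cp * ΔT; the function and variable names are made up for this example.</p>

```javascript
// Energy needed to change the temperature of a mass with no phase change:
// deltaE = m * Cp * deltaT
// Units: kg, kJ / (kg * K), and K in; kJ out.
function sensibleHeatKJ(massKg, specificHeat, deltaT) {
  return massKg * specificHeat * deltaT;
}

var ICE_SPECIFIC_HEAT = 2.1; // kJ / (kg * K), the value quoted above

// 2 kg of ice warming by 10 degrees (from -10 C up to 0 C).
var iceWarmingKJ = sensibleHeatKJ(2, ICE_SPECIFIC_HEAT, 10);
console.log(iceWarmingKJ.toFixed(0) + ' kJ'); // about 42 kJ
```

<p>Because the function only cares about the temperature <em>difference</em>, it doesn&#8217;t matter whether you think of the 10 as Celsius or Kelvin.</p>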
<p>Let&#8217;s see how much energy it takes to warm the water from 0 C to 10 C. The specific heat of water at that temperature is roughly 4.2 kJ / (kg * K):</p>
<p>( 4.2 kJ / ( kg * K ) ) * ( 2 kg ) * ( 10 K ) = <strong>84 kJ</strong></p>
<p>Finally, we need to figure out how much energy goes into converting our ice to water. This property is called the &#8220;latent heat&#8221; (different from heat capacity), and the term &#8220;latent&#8221; is used there because the temperature of a substance stays the same during a phase transition. Phase transitions (like ice to water, or water to steam) stay at a constant temperature, and also usually require a lot more energy than raising the temperature. I wouldn&#8217;t be surprised if the energy required to melt the ice is several times larger than the two values above.</p>
<p>The &#8220;latent heat of fusion&#8221; (fusion is the ice -&gt; water or water -&gt; ice transition) for water is 334 kJ / kg. Note that there&#8217;s no degree Kelvin dimension there; that&#8217;s because the temperature won&#8217;t be changing during this calculation. Therefore, the energy required to melt our ice is:</p>
<p>ΔE = Lf * m</p>
<p>Or, plugging in numbers (again, rearranged to have ΔE on the right side):</p>
<p>( 334 kJ / kg ) * ( 2 kg ) =<strong> 668 kJ</strong></p>
<p>As we guessed earlier, that figure is much larger than the two above. Almost all the heat that goes into warming our ice is taken at the phase transition, and for that reason many engineers will simply ignore the two other steps in the sequence because they&#8217;re an order of magnitude less important. We&#8217;ll keep them in, just to be thorough.</p>
<p>Add up the energies required by the three steps above, and we get the total energy required to warm our ice:</p>
<p>668 kJ + 84 kJ + 42 kJ = <strong>794 kJ</strong>.</p>
<p>Let&#8217;s just call it 800 kJ.</p>
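<p>Here is a small Javascript sketch of the whole three-step energy budget, using the same constants quoted above. The function name and the shape of the returned object are just one way to lay it out, not a definitive implementation.</p>

```javascript
// Constants quoted in the post.
var ICE_SPECIFIC_HEAT = 2.1;    // kJ / (kg * K)
var WATER_SPECIFIC_HEAT = 4.2;  // kJ / (kg * K)
var LATENT_HEAT_FUSION = 334;   // kJ / kg

// Hypothetical helper: energy absorbed by ice that warms to 0 C,
// melts completely, and then warms further as liquid water.
function iceEnergyBudgetKJ(massKg, iceDeltaT, waterDeltaT) {
  var steps = {
    warmIce: massKg * ICE_SPECIFIC_HEAT * iceDeltaT,
    melt: massKg * LATENT_HEAT_FUSION,
    warmWater: massKg * WATER_SPECIFIC_HEAT * waterDeltaT
  };
  steps.total = steps.warmIce + steps.melt + steps.warmWater;
  return steps;
}

// 2 kg bottle: -10 C to 0 C as ice, melt, then 0 C to 10 C as water.
var budget = iceEnergyBudgetKJ(2, 10, 10);
var meltShare = budget.melt / budget.total;

console.log('total: ' + budget.total.toFixed(0) + ' kJ');               // about 794 kJ
console.log('melting share: ' + (meltShare * 100).toFixed(0) + '%');     // about 84%
```

<p>The melting step accounts for roughly 84% of the total, which is why dropping the other two steps still gets you close.</p>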
<p>Finally, let&#8217;s see how much our 800 kJ will change the temperature of 72 kg of air. The specific heat of air is 1.0 kJ / (kg * K). We have 72 kg of the stuff in our room, so all we have to do is rearrange our first formula above to solve for ΔT:</p>
<p>ΔT = ΔE / ( m * Cp )</p>
<p>Plugging it in:</p>
<p>ΔT = 800 kJ / { ( 72 kg ) * ( 1.0 kJ / ( kg * K ) ) } = <strong>11 C</strong></p>
<p>(Again, Celsius and Kelvin are interchangeable here).</p>
<p>What a result! The math says that a single 2L block of ice is capable of cooling a room down by 11 C (20 F)!</p>
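<p>For anyone who wants to plug in their own room, the rearranged formula is a one-liner in Javascript. This is a sketch under the same idealized assumptions (sealed room, 100% efficient heat transfer), and the names are illustrative only.</p>

```javascript
// deltaT = deltaE / (m * Cp), for the air in the room.
function temperatureDropK(energyKJ, airMassKg, airSpecificHeat) {
  return energyKJ / (airMassKg * airSpecificHeat);
}

var AIR_SPECIFIC_HEAT = 1.0; // kJ / (kg * K), the value quoted above

var dropK = temperatureDropK(800, 72, AIR_SPECIFIC_HEAT);
// This is a temperature difference, so converting to Fahrenheit
// only needs the 9/5 scale factor, not the 32 degree offset.
var dropF = dropK * 9 / 5;

console.log(dropK.toFixed(1) + ' C, or ' + dropF.toFixed(0) + ' F'); // about 11 C, or 20 F
```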
<p>Now, there are other things going on that are stopping my room from turning into a refrigerator. For one, my living room isn&#8217;t an air-tight box, and we&#8217;re probably talking more along the lines of 4,000 cubic feet of air (from the hallways, attached kitchen, etc) that need to be cooled. Additionally, this process certainly isn&#8217;t 100% efficient. I don&#8217;t even have a good guess towards the efficiency of the effective heat transfer from the bottle to the room&#8211;it could be anywhere from 50% to 80% (surely lots of cool air is just bouncing right back out of the window, or is cooling the back of my couch). Assuming 50% efficiency already cuts us down to a 5 C (9 F) difference, and doubling the volume of air cuts us down further to 2.5 C (5 F).</p>
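<p>Those adjustments are easy to play with in code. The Javascript sketch below applies an efficiency factor and a larger air mass to the same formula; the 50% efficiency and the doubled air mass are the guesses from the paragraph above, not measured values, and the function name is hypothetical.</p>

```javascript
// Temperature drop with an efficiency factor applied to the available energy.
function adjustedDropK(energyKJ, airMassKg, efficiency) {
  var AIR_SPECIFIC_HEAT = 1.0; // kJ / (kg * K)
  return (energyKJ * efficiency) / (airMassKg * AIR_SPECIFIC_HEAT);
}

// Scenario values are guesses, not measurements.
var scenarios = [
  { label: 'ideal', airMassKg: 72, efficiency: 1.0 },
  { label: '50% efficient', airMassKg: 72, efficiency: 0.5 },
  { label: '50% efficient, double the air', airMassKg: 144, efficiency: 0.5 }
];

scenarios.forEach(function (s) {
  s.dropK = adjustedDropK(800, s.airMassKg, s.efficiency);
  console.log(s.label + ': ' + s.dropK.toFixed(1) + ' C');
});
```

<p>Without rounding at each step the numbers come out slightly higher (about 5.6 C and 2.8 C), but they tell the same story.</p>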
<p>But still, a 2.5 C (5 F) cool-down in the middle of the summer without an AC unit is absolutely worth the minimal effort it takes to freeze a 2L bottle of water during the day and stick it in front of the fan at night. It absolutely works, and you just saw the math that proves it! Even if the calculations above have a 100% error margin, it&#8217;s still a significant result&#8211;because the real goal of this exercise was to see if the frozen bottle would cool a room by something significant like 1-10 degrees, or something insignificant like 0.01 degree.</p>
<p>So I guess I should re-word my conclusion: the above order-of-magnitude calculation shows that a frozen 2L bottle can noticeably cool a small or moderately sized room.</p>
<p><strong>Edit</strong>: a friend asked if I had considered the heat given off by a human body. Several Google sources say that a human gives off ~ 100 W (one Watt = one Joule per second). That&#8217;s 6,000 J per minute, so if you sit in a room for an hour, you&#8217;ll provide about 360,000 J (360 kJ) of energy to the room. By the same formula as above, that&#8217;s a temperature increase of about 5 C in 72 kg of air if none of the heat escapes, which is not negligible. This leads us to another interesting conclusion: a 60 or 100 W bulb puts out heat at about the same rate as a person, so over an evening even a single light adds a meaningful amount of heat compared to what the frozen bottle removes. If you have 10 bulbs, that&#8217;s an even bigger story.</p>
<p>The post <a href="http://burakkanber.com/blog/cooling-a-room-with-2-liters-of-ice-calculation/">Cooling a room with 2 liters of ice: calculation.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/cooling-a-room-with-2-liters-of-ice-calculation/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A little about this blog.</title>
		<link>http://burakkanber.com/blog/a-little-about-this-blog/</link>
		<comments>http://burakkanber.com/blog/a-little-about-this-blog/#comments</comments>
		<pubDate>Thu, 14 Jun 2012 03:33:16 +0000</pubDate>
		<dc:creator>bkanber</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://burakkanber.com/blog/?p=4</guid>
		<description><![CDATA[<p>This blog isn&#8217;t really meant for human consumption &#8212; I&#8217;ll be using it just to keep my thoughts in order and perhaps to host writeups of studies I&#8217;m doing. We&#8217;ll see how it evolves, but I doubt I&#8217;ll use this often anyway. Wish me luck writing, and I&#8217;ll wish you luck reading!</p><p>The post <a href="http://burakkanber.com/blog/a-little-about-this-blog/">A little about this blog.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></description>
			<content:encoded><![CDATA[<p>This blog isn&#8217;t really meant for human consumption &#8212; I&#8217;ll be using it just to keep my thoughts in order and perhaps to host writeups of studies I&#8217;m doing. We&#8217;ll see how it evolves, but I doubt I&#8217;ll use this often anyway.</p>
<p>Wish me luck writing, and I&#8217;ll wish you luck reading!</p>
<p>The post <a href="http://burakkanber.com/blog/a-little-about-this-blog/">A little about this blog.</a> appeared first on <a href="http://burakkanber.com/blog">Burak Kanber&#039;s Blog</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://burakkanber.com/blog/a-little-about-this-blog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
