Machine Learning in JavaScript: Introduction

I love machine learning algorithms. I've taught classes and seminars and given talks on ML. The subject is fascinating to me, but as with any skill, fascination alone isn't enough. To get good at something, you need to practice!

I also happen to be a PHP and JavaScript developer. I've taught classes on both of these as well, and like any decent software engineer I also have experience with Ruby, Python, Perl, and C. I just prefer PHP and JS. (Before you flame PHP, I'll just say that while it has its problems, I like it because it gets stuff done.)

Whenever I tell people that Tidal Labs' ML algorithms are written in PHP, they look at me funny and ask how that's possible. Simple: you can write ML algorithms in just about any language. Most people just don't learn the fundamentals well enough to write an algorithm from scratch. Instead, they rely on Python libraries to do the work for them, and end up not truly grasping what's happening inside the black box. Other people only know ML academically, using Octave or MATLAB.

Through this series of articles, I'll teach you the fundamental machine learning algorithms using JavaScript -- not Python or Octave -- as the example language. Originally I intended to write these articles in a variety of languages (PHP, JS, Perl, C, Ruby), but I decided to stick with JavaScript for the following reasons:

  • If you're a web developer you probably already know JS, regardless of your backend expertise.
  • JavaScript has JSFiddle, a great tool that lets me embed executable JavaScript right in my posts (hard to do that with C or Perl!).
  • Several people asked me to stick to just one language.

While I'll be writing these articles with JavaScript in mind, please rewrite the examples in your language of choice as homework! Practice is how you get better, and writing the same algorithm several times in different languages really helps you understand the paradigms better.

It's possible to get excellent performance out of ML algorithms in languages like PHP and JavaScript. I advocate writing ML algorithms in languages other than Python for two reasons: writing them from scratch helps you learn them fundamentally, and it helps you unify your backend by not requiring a Python script to do processing in the middle of a PHP application. You can do it in PHP, and cut out the (mental and computational) overhead of using another language.

... well, most of the time. There are some things you really can't do in PHP or JavaScript, but those are the more advanced algorithms that require heavy matrix math. While you can do matrix math in JS, there is a big difference between simply "doing matrix math" and doing it efficiently. The advantage of NumPy or MATLAB is not their ability to do matrix operations; it's that they use optimized algorithms to do so -- things you wouldn't be able to write yourself unless you dedicated yourself to learning computational linear algebra. That's not my field, so we'll stick to the ML that doesn't require the advanced matrix math. You could try brute-forcing the matrix operations, but you'll end up with a relatively inefficient system. It's great for learning, so I'm not discouraging it -- I would just be wary of doing that in a production environment.
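
To make "brute-forcing the matrix operations" concrete, here is a minimal sketch of naive matrix multiplication in plain JavaScript. It assumes matrices are stored as arrays of row arrays, and the function name multiplyMatrices is made up for illustration. It is the textbook triple loop, not an optimized routine of the kind NumPy or MATLAB rely on.

```javascript
// Naive ("brute-force") matrix multiplication using three nested loops.
// Assumption: a matrix is a plain array of row arrays, e.g. [[1, 2], [3, 4]].
function multiplyMatrices(a, b) {
  const rows = a.length;     // rows in the result
  const inner = b.length;    // shared dimension (columns of a, rows of b)
  const cols = b[0].length;  // columns in the result

  if (a[0].length !== inner) {
    throw new Error('Column count of a must equal row count of b');
  }

  const result = [];
  for (let i = 0; i < rows; i++) {
    const row = new Array(cols).fill(0);
    for (let j = 0; j < cols; j++) {
      // Each output cell is the dot product of row i of a and column j of b.
      for (let k = 0; k < inner; k++) {
        row[j] += a[i][k] * b[k][j];
      }
    }
    result.push(row);
  }
  return result;
}

const product = multiplyMatrices([[1, 2], [3, 4]], [[5, 6], [7, 8]]);
console.log(product); // [[19, 22], [43, 50]]
```

The work grows with the product of the three dimensions, which is fine for small learning examples but is exactly where hand-rolled code falls behind optimized libraries on large matrices.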

Keep in mind that most of the algorithms we'll look at can be implemented both with and without matrix math. We'll use iterative or functional approaches here, but most of these algorithms can be done with linear algebra as well. There's more than one way to skin a cat! I encourage you to also go and learn (or figure out) the linear algebra approaches, but since that's not my strong suit I'll use other approaches.
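
As a small illustration of the difference between "iterative" and "functional" style, here is the same computation (a dot product of two equal-length arrays) written both ways. This is only a sketch: the function names dotIterative and dotFunctional are hypothetical and are not taken from the later articles.

```javascript
// Iterative version: an explicit loop with an accumulator.
// Assumption: xs and ys are arrays of numbers with the same length.
function dotIterative(xs, ys) {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    sum += xs[i] * ys[i];
  }
  return sum;
}

// Functional version: the same computation expressed with Array.prototype.reduce.
function dotFunctional(xs, ys) {
  return xs.reduce((sum, x, i) => sum + x * ys[i], 0);
}

console.log(dotIterative([1, 2, 3], [4, 5, 6]));  // 32
console.log(dotFunctional([1, 2, 3], [4, 5, 6])); // 32
```

Both versions do the same arithmetic; which one reads better is largely a matter of taste.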

Here are some of the algorithms I intend to cover. I'll update this list with links to the relevant articles as they're published:

Happy learning!