CommonJS in the Browser

I’ve been thinking a lot lately about how to use CommonJS modules in my web applications. I even started a repository on GitHub for my implementation. As is apparent from searching, the task is non-trivial: lots of people are trying to do the same thing, and every one of them has a different idea about how it should work.

But WHY would you want to use CommonJS (formerly known as ServerJS) modules in a client environment?

Ideally you can share modules between client and server, but that requires you to use a server environment like node.js, which might make management really nervous. Even without sharing, the CommonJS module system helps us avoid some annoyances in JavaScript development.

  • Each module has its own scope. I don’t have to manually wrap each file in a function to get a new variable scope. (Of course, to achieve this, the boilerplate is going to have to wrap each module’s code in a function anyway.)
  • Namespaces are only used in the require function, not everywhere in my code. Almost inevitably every web application I’ve worked in ends up using code like the following:
        var whatIWanted = new FormerCompanyName.Common.CoolLibrary.ConstructorName( More.namespace.chains.than.you.can.follow );
        // the rest of this file continues to use these ridiculously long namespaces
    

    Although I’m sure many will disagree with me, I much prefer the CommonJS way:

        var CoolModule = require('common/cool-library'),
            thingINeed = require('more/namespace/chains/than/you/can/follow'),
            whatIWanted = new CoolModule.ConstructorName(thingINeed);
        // the rest of the file is void of long namespaces
    

    And much more importantly, when I define a new module (or class as some insist on calling them):

        FormerCompanyName.Common.CoolLibrary.ConstructorName = function() {/* ... */};
        // versus
        exports.ConstructorName = function() {/* ... */};
        // or even
        module.exports = function() {/* ... */} // this case isn't in the spec, but I really like it, so I made sure my library can handle it.
    
  • Because you can also use relative module identifiers (“./sibling-module”, “../uncle-module”), when the company changes its name, it can be as simple as renaming a folder to update all the top-level module ids.
  • Additionally, modules can be included in the page in any order, and are only executed when first required, instead of all modules executing immediately upon inclusion, requiring the script order to be specific and fragile. If I add a new module using CommonJS, I can just append it to the end of the list, otherwise I have to make sure it is earlier in the page than whatever uses it, and after whatever it uses.

Okay, but how much work is it going to be?

Let’s walk through first what I wanted my server-side code to look like, then what it has to do to make it work on the other side.

As most of my server-side experience thus far has been in PHP, that’s the first language I’ve used in my implementation.

<!DOCTYPE html>
<html>
<head>
    <title>My Awesome Application</title>
    <link rel="stylesheet" href="awesome-styles.css" />
</head>
<body>
    <!-- blah blah blah -->
    <?= Modules::script() /* include all necessary script tags */ ?>
    <script>require('awesome').go()</script>
</body>
</html>

The Modules class will look for all js files in the folder you put it in, and any subfolders, and will id them by their path.
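
The path-to-id mapping can be sketched in a few lines. The Modules class itself is PHP; this JavaScript version is only an illustration under assumptions: the function name pathToModuleId and the example folder are hypothetical, and the real class may normalize paths differently.

```javascript
// Hypothetical helper: derive a module id from a file path under a root folder.
function pathToModuleId(rootDir, filePath) {
  // Normalize Windows separators and make sure the root ends with one slash.
  const root = rootDir.replace(/\\/g, '/').replace(/\/+$/, '') + '/';
  const file = filePath.replace(/\\/g, '/');
  // Only .js files inside the root folder become modules.
  if (file.indexOf(root) !== 0 || !/\.js$/.test(file)) return null;
  // Drop the root prefix and the '.js' extension.
  return file.slice(root.length, -3);
}

// '/var/www/js' is an assumed location for the modules folder.
const id = pathToModuleId('/var/www/js', '/var/www/js/common/cool-library.js');
// id is 'common/cool-library', the string you would pass to require()
```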

Yes, I am including every module, not actually checking dependencies. I refer you back to my previous post and say this is the simplest way, and if the caching headers are working, the experience won’t suffer. You are welcome to use one of the fantastic libraries that loads modules on-demand, if you disagree.

Hopefully that is all the server-side API you need to worry about, but there is more if you need it.

So what is that library doing to my poor scripts to make the CommonJS module environment?

I will explain in detail what goes into it in another post, but if you are daring, you can check out the source on github.

On Pattern Hating

I have long considered myself a Java hater. I now think it really has nothing to do with the language itself. Sure, it was easy to point at slow performance (which hasn’t been true for a long time now), or mourn the missing syntactic sugar (Pattern.compile("abc", Pattern.CASE_INSENSITIVE) vs /abc/i), but I think my problem with Java is really a problem with the mindset I have observed in novice programmers (with Java usually being their first language).

The problem is with patterns.

Patterns are great. They provide a toolbox that can lead developers on the road to “best practice”. But…

Patterns are a poor substitute for problem solving.

It doesn’t matter if you know how to make a Singleton, even if you know when a Singleton is useful, if the problem at hand is improving report speed. You need to know math, you need to know computation, and you need to find the unnecessary work being done. It’s possible we’ll use a Singleton, but it won’t be the solution to the problem.

In an interview, if I ask for code to find the most common words in a bunch of text files, “public class WordRanker {” is unimportant. I’ve seen a few programmers struggle for the first few minutes to figure out if it should be a class, a function, or what language to use. But once, I was impressed by someone who quickly figured out what they wanted to do, and then said, “I’d google how to do that.”

The pattern is accidental complexity. Problem solving is essential complexity.

Some thoughts on Web 4.0

The web has undergone some significant changes since its inception. 1.0 consisted mostly of HTML documents, with simple CSS style, and little or no JavaScript interaction. 2.0 was the AJAX revolution, making dynamic sites with complex JavaScript. Some have suggested we are already in 3.0, with HTML5 and SVG well supported in the latest version of every major browser. What I’d like to talk about is what I wish would come next.

As many who are immersed in front-end web development have noticed, HTML and SVG have different DOMs, different styles, and competing animation tools. They have been getting better, with HTML5’s inline SVG support, and browsers beginning to bring each markup’s features to the other, but the inconsistencies are still painful, and they make implementation sub-optimal for both web and browser developers.

What I would love to see is something akin to the following document:

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Fancy HTML+SVG</title>
  <link rel="stylesheet" href="styles.css" />
  <defs>
    <path id="logo" desc="My Fancy SVG Logo" d="M59,0 l69,69 h-15 l-44,44 v15 l-69-69 h15 l45-45 5,5 -45,45 44,44 44-44 -49-49 z  M59,44 c0-8,10-8,10,0 v40 c0,8-10,8-10,0 z" />
    <filter id="soft_blur"><feGaussianBlur in="SourceGraphic" stdDeviation=".5"/></filter>
  </defs>
  <link rel="shortcut icon" sizes="16x16 24x24 32x32 48x48" href="#logo" />
</head>
<body>
  <header>
    <a id="home" href="."><use href="#logo" /></a>
    <h1>The TaleCrafter's Scribbles</h1>
    <h2>notes about science, fiction, and faith… but mostly web development</h2>
  </header>
  <article>My Article text and images and stuff go here</article>
  <footer>Boring Legal and maybe locale selection in here</footer>
  <script src="script.js" async defer></script>
</body>
</html>

styles.css

  #logo { background:#111; } /* applies to everywhere <use>d, including favicon */
  #home { width:64px; height:64px; float:left; }
  #home path { transform:scale(.5); transition:background .5s ease; }
  #home path:hover { background:#0d0dc5; }
  h1 { filter:url(#soft_blur); transition:filter .5s linear; }
  h1:hover { filter:none; }
  /* ... lots more styles ... */

script.js

  document.querySelector('#home path').addEventListener('click', function () { /* open menu or something useful */ });

Summary of things that would be cool:

  • no need for foreignObject or anything like that, simply mix and match tags
  • put all the useful attributes in the same namespace (make use useful without the xlink: namespace)
  • css transitions & animations on svg styles (properties would also be nice)
  • defs and use in html documents
  • filters on html elements (Firefox is already working on this)
  • unify styles like background and fill
  • JavaScript DOM API identical

In short SVG and HTML would be one and the same. You would style both with the same css.

Some nitpicks:

  • I’m not sold on defining filters in markup, then using in style. It feels… odd. Why not define in style? (Oh no, that might be too much like IE’s filters! Gasp!)
  • Animating is still a crapshoot. It feels like it should be in JavaScript, but declarative syntax is so much simpler, and easier for browsers to optimize. Some SMIL animations work in some browsers. CSS animations are still nascent but promising. (Even IE looks like it might implement them in ‘native HTML5’. Sorry, couldn’t help myself.) Still, JavaScript is the only reliable way right now.

Let me hear an Amen, or let me know what I’m missing. Leave a comment and let’s talk about it.

Load only when needed, or Preload everything?

As JavaScript and web application best practices have formed over the last several years, two competing patterns have emerged for loading the scripts an application needs:

Don’t load any JavaScript until you know you need it.

I usually feel like this is the way to go, because a lot of my code is specific to a particular widget or workflow. Why make the page take longer to load initially for something the user won’t do every visit? Just put in minimal stubs to load the full functionality once the user begins down that workflow, or interacts with the widget.

Pros:

  • Lighter initial page weight
  • Encourages functionally modular code
  • Memory performance boost (important if you have to support old browsers)
  • Speed performance boost (if done right)

Cons:

  • Adds additional complexity to code
  • Laggy performance (if done wrong)
  • Lots of HTTP requests

Combine and minify all JavaScript into one file loaded at the end of the html file.

You know beforehand what is going to be needed on each page, and YSlow warned you about too many HTTP requests. Bundle up all the scripts into one download which will be cached after the first page view.

Pros:

  • Easy to implement (lots of code will do it for you)
  • Page loads are really fast once the bundle is cached

Cons:

  • Load a lot more than usually necessary
  • Initial load can be much slower

So how do you know which pattern to follow? It depends! If your application is very complex, and large portions of the functionality are used infrequently, it makes a lot of sense to use an on-demand pattern. If your application is fairly simple, or if all of the code is likely to be used every time, then combining all of the scripts and including it from the start will be much easier.

I recently worked on a smaller application where I divided all the script into two files. The first was loaded initially, and provided enough functionality for the login dialog only. Upon successful login, the second script was loaded, which combined all of the remaining pieces of application.
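
A two-stage load like that can be sketched with a small run-once wrapper. This is not the code from that application; once, loadScript, and the file name app-rest.js are hypothetical, and loadScript assumes a browser with a body element to append to.

```javascript
// Hypothetical helper: run `loader` at most once and share its promise.
function once(loader) {
  let pending = null;
  return function () {
    if (!pending) {
      // The executor runs right away, so the loader starts on the first call.
      pending = new Promise(function (resolve) { resolve(loader()); });
    }
    return pending;
  };
}

// Browser-only: insert a script tag and settle when it loads or fails.
function loadScript(url) {
  return new Promise(function (resolve, reject) {
    const s = document.createElement('script');
    s.src = url;
    s.onload = function () { resolve(url); };
    s.onerror = function () { reject(new Error('Failed to load ' + url)); };
    document.body.appendChild(s);
  });
}

// 'app-rest.js' is a placeholder name for the second, larger bundle.
const loadRestOfApp = once(function () { return loadScript('app-rest.js'); });
// After a successful login: loadRestOfApp().then(function () { /* start the app */ });
```

One design caveat: a failed load stays cached as a rejected promise, so a real version would probably clear pending on error to allow a retry.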

The point I most want to make is this: Don’t just follow a pattern because it is a “best practice”. Take the time to figure out the best solution for your project.

I thought I new you(JavaScript);

This is the first of hopefully many posts aiming to demystify JavaScript.

The first thing to get over is the name. JavaScript is not Java. The name came from trying to ride on Java’s hype. JavaScript is to Java as Hamster is to Ham. Understand? Moving on…

Hopefully, most programmers now know that JavaScript is object-oriented. I’m afraid though that most believe object-oriented is synonymous with classical inheritance, which you will not find in JavaScript. JavaScript instead uses prototypal inheritance.

Classical Inheritance in Java:

class Fruit {
  private String name;
  public Fruit(String n) { name = n; }
  public String toString() { return name; }
}

class Banana extends Fruit {
  public Banana() { super("banana"); }
}

// (new Banana()) instanceof Banana and Fruit

With classical inheritance, as in Java, you define classes. Classes are templates or blueprints for what an object of that type will be like. Objects, which are instances of a class, get all the methods and fields associated with the class and the classes it inherits.

When you call a method, first the runtime looks in the class, then if it can’t find the definition, it traverses up the class hierarchy until it finds the method definition.

Prototypal Inheritance in JavaScript:

function Fruit(name) { this.name = name; }
Fruit.prototype = { name:null, toString:function() { return this.name; } };

function Banana() { Fruit.call(this, 'banana'); }
Banana.prototype = new Fruit(null);

// (new Banana()) instanceof Banana and Fruit

As you can see in JavaScript, with prototypal inheritance, there are no classes. The ‘class’ keyword is not used. Objects inherit from other objects. (The Banana prototype is an ‘instance’ of Fruit.) Constructors are just normal functions that you may call with the ‘new’ keyword.
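
The "objects inherit from other objects" point is easiest to see with no constructors at all. A minimal sketch, assuming an ES5 engine where Object.create is available; the describe method is made up for the illustration.

```javascript
// A plain object that will serve as the prototype.
const fruit = {
  name: null,
  describe: function () { return 'a ' + this.name; }
};

// banana is a new empty object whose prototype is the fruit object itself.
const banana = Object.create(fruit);
banana.name = 'banana'; // own property, shadows fruit.name

// describe is not copied onto banana; it is found on fruit at call time.
const description = banana.describe();
```

Because the link is live, adding a method to fruit later makes it available on banana too, which a copied class template would not do.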

When you access any property, the runtime checks the object, then if it cannot find the property, it traverses up the prototype object hierarchy until it finds the property. If it doesn’t find the property, it returns undefined.

The new keyword is a little deceptive, because it looks the same as Java. This is closer to what really happens:

// var banana = new Banana(a, b);
function construct(Constructor, a, b) {
  var obj = {}; // new Object()
  // assume __proto__ is a hidden field, used internally for the prototype hierarchy
  obj.__proto__ = Constructor.prototype;
  var temp = Constructor.call(obj, a, b); // call the constructor with 'this' set to the new object
  return (temp && typeof temp === 'object') ? temp : obj;
}
var banana = construct(Banana, a, b);

// banana.name;
function getProperty(obj, name) {
  var temp = obj;
  while (!temp.hasOwnProperty(name) && temp.__proto__) { temp = temp.__proto__; }
  return temp.hasOwnProperty(name) ? temp[name] : undefined;
}
getProperty(banana, 'name');

Please note that this code is an oversimplification, but hopefully helps you to understand what is happening behind the scenes. One of the interesting things you may have noticed from the above code is that when you call ‘new Banana()’, you might not get back what you expect. See one way you can implement the Factory pattern in JavaScript:

function Fruit(name, color) {
  if (typeof Fruit[name] === 'function') return new Fruit[name]();
  this.name = name;
  this.color = color;
  return this;
}
Fruit.prototype = { name:null, color:null };

// Build each prototype before registering its constructor on Fruit; once Fruit.Banana
// exists, new Fruit('Banana', ...) takes the factory branch and never sets name or color.
function Banana() { return this; }
Banana.prototype = new Fruit('Banana', 'yellow');
Fruit.Banana = Banana;

function Apple() { return this; }
Apple.prototype = new Fruit('Apple', 'red');
Fruit.Apple = Apple;

var banana = new Fruit('Banana'); // instanceof Fruit and Banana
var apple = new Fruit('Apple'); // instanceof Fruit and Apple
var kiwi = new Fruit('Kiwi'); // instanceof Fruit

As with most of JavaScript’s powerful dynamic features, this could easily be used for evil as well as for good.

function Droid() { return new IPhone(); }
var phone = new Droid(); // this is not the droid you're looking for

wtfjs.com is full of examples where JavaScript does weird things, but almost invariably because you tried to do something weird in the code. With a small amount of restraint on the developer’s part, JavaScript can be powerful and need not be a mystery.

JavaScript Require Update

I’ve updated the code I use to require scripts and styles on my web pages.

Check it out, or fork it at github: http://github.com/thetalecrafter/require

Usage:

main file:

	require.setObjUrl('jQuery', function(name) {
		return name === 'jQuery' ? 'http://code.jquery.com/jquery-1.5.2.min.js' :
			'http://cdn-' + (name.length % 4) + '.example/plugins/' + name + '.js'; });
	require('jQuery.myplugin', function(myplugin) { /* both have loaded when this executes */ });

plugin file:

	require('jQuery', function(jQuery) {
		jQuery.myplugin = ...
	});

require css: Any requirement matching /\.css$/i will be treated as a css requirement.

	require('myplugin.css', function() { /* You can count on styles being available here */ });

require image: Any requirement matching /\.(?:gif|jpe?g|png)$/i will be treated as an image requirement.

	require('myplugin_bg.png', function() { /* You can count on the image being available here */ });
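
How a requirement name picks its loader can be sketched with the two patterns above. The function name requirementType and the 'script' fallback are assumptions for illustration; the library on GitHub may organize this differently.

```javascript
// Hypothetical classifier using the two patterns quoted above.
function requirementType(name) {
  if (/\.css$/i.test(name)) return 'css';
  if (/\.(?:gif|jpe?g|png)$/i.test(name)) return 'image';
  return 'script'; // assumption: anything else is loaded as a script
}

const types = ['myplugin.css', 'myplugin_bg.png', 'jQuery.myplugin'].map(requirementType);
// types is ['css', 'image', 'script']
```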

JavaScript require in 100 lines of code

UPDATE: I’ve changed up my code a bit in the follow up post: JavaScript Require Update
UPDATE: Although my initial intent was to write require with minimal code, my latest version on GitHub is much longer, but performs better and is much more feature rich. Check it out, or fork it on GitHub: http://github.com/thetalecrafter/require

Lately I’ve been toying with dependency management in JavaScript. Most implementations of require (at least that I’ve seen) use polling, a function in the loaded script, synchronous XMLHttpRequest (dojo.require), or some combination of those.

Polling is less than ideal, since more code runs than is necessary. It can slow down the responsiveness of the page if the interval is too short, and the user waits longer than necessary if the interval is too long.

Putting a function in the loaded file means that everything you load has to understand the system. You cannot load arbitrary files. This makes it harder to do mash-ups involving other peoples’ code.

Synchronous requests lock up the browser. If the server is slow to respond, the user may feel the browser has crashed, and if the server goes down, it can actually crash the browser. In addition, XMLHttpRequest responses are not cached like script tags, meaning that the dynamic packages may need to be reloaded with every page load.

So… when looking at writing my own require function I knew I wanted:

  • Event-driven code. (No polling. No more code execution than necessary.)
  • No requirements on the contents of required files.
  • Asynchronous loads (No chance of freezing or crashing the browser.)
  • Take advantage of the browser’s cache.
  • Nested requires. (A file isn’t loaded until everything it requires is loaded.)
  • Decent browser compatibility (IE6+, FF2+, Chrome, Safari 3+, Opera).
  • No external library requirements.

One thing I ended up giving up to get the aforementioned wants: loading scripts in parallel. Not all browsers guarantee the execution order of dynamically inserted script tags, so with parallel loads it was too hard to determine the parent requirement, and nested requires became unreliable. I’m looking at you, Safari. Any pointers to improve that would be appreciated.
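
The one-at-a-time trade-off can be sketched as a small event-driven queue with no polling. This is a simplified illustration, not the listing itself: createSerialQueue and startLoad are hypothetical names, and the actual loading is passed in as a function so the sketch does not depend on the DOM.

```javascript
// Hypothetical serial queue: only the head of the queue is ever loading.
// startLoad(url, done) must begin the load and call done() when it finishes.
function createSerialQueue(startLoad) {
  const queue = [];
  function next() {
    if (queue.length) startLoad(queue[0], done);
  }
  function done() {
    queue.shift(); // the finished item leaves the queue
    next();        // then the following item, if any, starts
  }
  return function enqueue(url) {
    // push returns the new length; 1 means nothing else was in flight.
    if (queue.push(url) === 1) next();
  };
}

// Demo with a fake loader that records what started and lets us finish loads by hand.
const started = [];
const finish = [];
const enqueue = createSerialQueue(function (url, done) {
  started.push(url);
  finish.push(done);
});
enqueue('a.js');
enqueue('b.js'); // waits: a.js has not finished yet
```

Since only one script is in flight, any require call made while it executes can be attributed to that script, which is what makes nested requires dependable.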

My testing has been less than thorough, and there are many situations I didn’t try to handle. (Like checking to see if the script was already included statically.)

Without further ado, here’s my code: (the most up-to-date is available on github)

/**
 * _.require v0.3 by Andy VanWagoner, distributed under the ISC licence.
 * Provides require function for javascript.
 *
 * Copyright (c) 2010, Andy VanWagoner
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
(function() {
	var map = {}, root = [], reqs = {}, q = [], CREATED = 0, QUEUED = 1, REQUESTED = 2, LOADED = 3, COMPLETE = 4, FAILED = 5;

	function Requirement(url) {
		this.url = url;
		this.listeners = [];
		this.status = CREATED;
		this.children = [];
		reqs[url] = this;
	}

	Requirement.prototype = {
		push: function push(child) { this.children.push(child); },
		check: function check() {
			var list = this.children, i = list.length, l;
			while (i) { if (list[--i].status !== COMPLETE) return; }

			this.status = COMPLETE;
			for (list = this.listeners, l = list.length; i < l; ++i) { list[i](); }
		},
		loaded: function loaded() {
			this.status = LOADED;
			this.check();
			if (q.shift() === this && q.length) q[0].load();
		},
		failed: function failed() {
			this.status = FAILED;
			if (q.shift() === this && q.length) q[0].load();
		},
		load: function load() { // Make request.
			var r = this, d = document, s = d.createElement('script');
			s.type = 'text/javascript';
			s.src = r.url;
			s.requirement = r;
			function cleanup() { // make sure event & cleanup happens only once.
				if (!s.onload) return true;
				s.onload = s.onerror = s.onreadystatechange = null;
				d.body.removeChild(s);
			}
			s.onload = function onload() { if (!cleanup()) r.loaded(); };
			s.onerror = function onerror() { if (!cleanup()) r.failed(); };
			if (s.readyState) { // for IE; note there is no way to detect failure to load.
				s.onreadystatechange = function () { if (s.readyState === 'complete' || s.readyState === 'loaded') s.onload(); };
			}
			r.status = REQUESTED;
			d.body.appendChild(s);
		},
		request: function request(onready) {
			this.listeners.push(onready);
			if (this.status === COMPLETE) { onready(); return; }

			var tags = document.getElementsByTagName('script'), i = tags.length, parent = 0;
			while (i && !parent) { parent = tags[--i].requirement; }
			(parent || root).push(this);
			if (parent) this.listeners.push(function() { parent.check(); });

			if (this.status === CREATED) {
				this.status = QUEUED;
				if (q.push(this) === 1) { this.load(); }
			}
		}
	};

	function resolve(name) {
		if (/\/|\\|\.js$/.test(name)) return name;
		if (map[name]) return map[name];
		var parts = name.split('.'), used = [], ns;
		while (parts.length) {
			if (map[ns = parts.join('.')]) return map[ns] + used.reverse().join('/') + '.js';
			used.push(parts.pop());
		}
		return used.reverse().join('/') + '.js';
	}

	function absolutize(url) {
		if (/^(https?|ftp|file):/.test(url)) return url;
		return (/^\//.test(url) ? absolutize.base : absolutize.path) + url;
	}
	(function () {
		var tags = document.getElementsByTagName('base'), href = (tags.length ? tags[tags.length - 1] : location).href;
		absolutize.path = href.substr(0, href.lastIndexOf('/') + 1) || href;
		absolutize.base = href.split(/\\|\//).slice(0, 3).join('/');
	})();
	
	function require(arr, onready) {
		if (typeof arr === 'string') arr = [ arr ]; // make sure we have an array.
		if (typeof onready !== 'function') onready = false;
		var left = arr.length, i = arr.length;
		if (!left && onready) onready();
		while (i) { // Update or create the requirement node.
			var url = absolutize(resolve(arr[--i])), req = reqs[url] || new Requirement(url);
			req.request(function check() { if (!--left && onready) onready(); });
		}
	}

	require.map = function mapto(name, loc) { map[name] = loc; };
	require.unmap = function unmap(name) { delete map[name]; };
	require.tree = root;
	jQuery.require = require;
})();
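
To see how dotted names turn into URLs, here is the resolve logic from the listing as a standalone function. Passing the map as a parameter and the name resolveName are changes made only so the sketch runs outside a browser; the mapped URLs are made-up examples.

```javascript
// Same steps as resolve() in the listing, with the map passed in explicitly.
function resolveName(name, map) {
  if (/\/|\\|\.js$/.test(name)) return name; // already a path or a .js file
  if (map[name]) return map[name];           // exact mapping wins
  const parts = name.split('.'), used = [];
  while (parts.length) {
    // Try the longest mapped prefix; unmatched trailing parts become folders.
    const ns = parts.join('.');
    if (map[ns]) return map[ns] + used.reverse().join('/') + '.js';
    used.push(parts.pop());
  }
  return used.reverse().join('/') + '.js';   // nothing mapped: dots become slashes
}

// Hypothetical mappings for the demo.
const map = { app: '/js/app/', jQuery: 'http://code.example/jquery.js' };
const urls = ['app.ui.dialog', 'jQuery', 'widgets.tabs', 'lib/x.js'].map(function (n) {
  return resolveName(n, map);
});
// urls is ['/js/app/ui/dialog.js', 'http://code.example/jquery.js', 'widgets/tabs.js', 'lib/x.js']
```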

Accessing AWS SimpleDB from PHP

This week, as I built part of my App Server for Distributed Systems Design, I hit another stumbling block. The library that Amazon provides in PHP for accessing SimpleDB requires PHP 5.2. I should have known that I need to use the latest version.

Not only did Amazon’s library not work for me, but it was huge and complicated. I found another library on Google Code, but as fate would have it, that library didn’t work either. The code was pretty ugly imho, but at least it was straightforward enough for me to understand how accessing SimpleDB worked, which led me to make my own SimpleDB client.

The script will work with any PHP 5, and doesn’t depend on anything that isn’t built in by default. I hope it is helpful to someone else. It would be really easy to add the SimpleDB requests I haven’t implemented yet.

<?php
/**
 * AWS_SimpleDB_Client v0.1 by Andy VanWagoner, distributed under the ISC licence.
 * Provides simple access to Amazon's SimpleDB from PHP 5.
 *
 * Copyright (c) 2009, Andy VanWagoner
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

class AWS_SimpleDB_Client {

	// AWS SimpleDB API Constants
	private static $service_endpoint	= "sdb.amazonaws.com";
	private static $api_version			= "2007-11-07";
	private static $timestamp_format	= 'Y-m-d\TH:i:s\Z';
	private static $signature_version	= 1;

	private static $user_agent = "AWS_SimpleDB_Client 0.1 - Andy VanWagoner";

	/**
	* Constructor
	*
	* @param string $access			// your AWS "Access Key ID"
	* @param string $secret			// your AWS "Secret Access Key"
	*/
	function AWS_SimpleDB_Client($access, $secret) {
		$this->access_key = $access;
		$this->secret_key = $secret;
	}

	/**
	* AWS SimpleDB API - CreateDomain
	* NOTE: This call will take a while (AWS says 10 seconds)
	*
	* @param string $domain			// the domain to create
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function create_domain($domain) {
		$params = array(
			'Action' => 'CreateDomain',
			'DomainName' => $domain
		);

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - DeleteDomain
	* NOTE: This call will take a while (AWS says 10 seconds)
	*
	* @param string $domain			// the domain to delete
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function delete_domain($domain) {
		$params = array(
			'Action' => 'DeleteDomain',
			'DomainName' => $domain
		);

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - ListDomains
	*
	* @param string $next = ''		// Optional - Sent as NextToken parameter
	* @param string $max = 100		// Optional - Sent as MaxNumberOfDomains
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'DomainName'=>array('...', ...) [, 'NextToken'=>])
	*/
	function list_domains($next = '', $max = 0) {
		$params = array('Action' => 'ListDomains');

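		if ($max > 0 && $max <= 100)
			$params['MaxNumberOfDomains'] = $max;
		if ($next)
			$params['NextToken'] = $next;

		return $this->post($params);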
	}

	/**
	* AWS SimpleDB API - PutAttributes
	*
	* @param string $domain			// The domain the item is in
	* @param string $item			// The name of the item
	* @param array  $attributes		// array(array('Name'=>, 'Value'=> [, 'Replace'=>]), ...)
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function put_attributes($domain, $item, $attributes) {
		$params = array(
			'Action' => 'PutAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		foreach($attributes as $i => $value) {
			$params["Attribute.$i.Name"] = $value['Name'];
			$params["Attribute.$i.Value"] = $value['Value'];
			if (isset($value['Replace']))
				$params["Attribute.$i.Replace"] = $value['Replace'];
		}

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - DeleteAttributes
	*
	* @param string $domain			// The domain the item is in
	* @param string $item			// The name of the item
	* @param array  $attributes		// array(array('Name'=>, 'Value'=>), ...)
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function delete_attributes($domain, $item, $attributes) {
		$params = array(
			'Action' => 'DeleteAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		foreach($attributes as $i => $value) {
			$params["Attribute.$i.Name"] = $value['Name'];
			$params["Attribute.$i.Value"] = $value['Value'];
		}

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - GetAttributes
	*
	* @param string $domain			// the domain name
	* @param string $item			// the item's name
	* @param string $attribute		// Optional - If specified, only this attribute's values are retrieved.
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'Attribute'=>array(array('Name'=>,'Value'=>), ...))
	*/
	function get_attributes($domain, $item, $attribute = '') {
		$params = array(
			'Action' => 'GetAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		if ($attribute)
			$params['AttributeName'] = $attribute;

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - Query
	*
	* @param string  $domain		// The domain name
	* @param string  $query			// The query to run on this domain
	* @param string  $next = ''		// OPTIONAL - token supplied on last paged call
	* @param integer $max = 100		// OPTIONAL - max items you want returned 1-250, default = 100
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'ItemName'=>array('...', ...))
	*/
	function query($domain, $query, $next = '', $max = 0) {
		$params = array(
			'Action' => 'Query',
			'DomainName' => $domain,
			'QueryExpression' => $query
		);

		if ($max > 250) $max = 250;
		if ($max > 0)
			$params['MaxNumberOfItems'] = $max;
		if ($next)
			$params['NextToken'] = $next;

		return $this->post($params);
	}

	/**
	 * Sign the parameters, following AWS version 1 signing
	 *
	 * @param array $params			// array of all (except for the signature) params to be passed to amazon
	 *
	 * @return string				// signature string
	 */
	private function sign($params) {
		uksort($params, 'strnatcasecmp');

		$data = '';
		foreach ($params as $key=>$value) {
			$data .= $key . $value;
		}

		return base64_encode (	pack("H*", sha1((str_pad($this->secret_key, 64, chr(0x00)) ^ (str_repeat(chr(0x5c), 64))) .
								pack("H*", sha1((str_pad($this->secret_key, 64, chr(0x00)) ^ (str_repeat(chr(0x36), 64))) .
								$data)))) );
	}

	/**
	 * POST to AWS SimpleDB and then parse the response.
	 *
	 * @param array $params			// all params to pass on the post
	 *
	 * @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>, ...)
	 */
	private function post($params) {

		// Add all of the common parameters needed by AWS SimpleDB
		$params['AWSAccessKeyId']	= $this->access_key;
		$params['Timestamp'] 		= gmdate(self::$timestamp_format, time());
		$params['Version'] 			= self::$api_version;
		$params['SignatureVersion']	= self::$signature_version;
		$params['Signature'] 		= $this->sign($params);

		// Generate the POST request
		$content = http_build_query($params);

		$post  = 'POST / HTTP/1.0'															. "\r\n";
		$post .= 'Host: ' 			. self::$service_endpoint 								. "\r\n";
		$post .= 'Content-Type: ' 	. 'application/x-www-form-urlencoded; charset=utf-8'	. "\r\n";
		$post .= 'Content-Length: ' . strlen($content)										. "\r\n";
		$post .= 'User-Agent: ' 	. self::$user_agent 									. "\r\n";
		$post .= 																			  "\r\n";
		$post .= $content;

		$socket = @fsockopen(self::$service_endpoint, 80, $errno, $errstr, 10);
		if ($socket) {
			fwrite($socket, $post);

			$response = stream_get_contents($socket);
			fclose($socket);

			// Parse the response
			return $this->format_result($response);
		}

		// Return a fail result
		return array('status' => array('code' => 404, 'message' => 'Not Found'),
			'Error' => array('Code' => $errno, 'Message' =>
				'Could not connect to ' . self::$service_endpoint . " ($errstr)"
			)
		);
	}

	/**
	 * Take the XML document returned by AWS SimpleDB, and transform it into a hash
	 *
	 * @param string $result		// the full http response string from SimpleDB
	 */
	private function format_result($result) {
		list($http_headers, $content) = explode("\r\n\r\n", $result, 2);
		$header_lines = explode("\r\n", $http_headers);
		list($protocol, $code, $message) = explode(" ", $header_lines[0], 3);

		// record the http status
		$formatted = array('status' => array('code' => $code, 'message' => $message));

		$xml = simplexml_load_string($content);

		// Look for Errors
		if (isset($xml->Errors)) {
			$formatted['RequestId'] = (string)$xml->RequestId;
			$formatted['Error'] = array();
			foreach($xml->Errors->Error as $error) {
				array_push($formatted['Error'], array(
					'Code' => (string)$error->Code,
					'Message' => (string)$error->Message
				));
			}
			return $formatted;
		}

		// Get the metadata for this request
		$metadata = $xml->ResponseMetadata;
		$formatted['RequestId'] = (string)$metadata->RequestId;
		$formatted['BoxUsage'] = (string)$metadata->BoxUsage;

		// GetAttributes Response
		if (isset($xml->GetAttributesResult)) {
			$formatted['Attribute'] = array();
			foreach($xml->GetAttributesResult->Attribute as $attribute) {
				array_push($formatted['Attribute'], array(
					'Name' => (string)$attribute->Name,
					'Value' => (string)$attribute->Value
				));
			}
		}

		// ListDomains Response
		if (isset($xml->ListDomainsResult)) {
			$formatted['DomainName'] = array();
			foreach($xml->ListDomainsResult->DomainName as $domain) {
				array_push($formatted['DomainName'], (string)$domain);
			}
			if (isset($xml->ListDomainsResult->NextToken)) {
				$formatted['NextToken'] = (string)$xml->ListDomainsResult->NextToken;
			}
		}

		// Query Response
		if (isset($xml->QueryResult)) {
			$formatted['ItemName'] = array();
			foreach($xml->QueryResult->ItemName as $item) {
				array_push($formatted['ItemName'], (string)$item);
			}
			if (isset($xml->QueryResult->NextToken)) {
				$formatted['NextToken'] = (string)$xml->QueryResult->NextToken;
			}
		}

		return $formatted;
	}
}

?>
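
The nested pack()/sha1() expression in sign() is HMAC-SHA1 written out by hand. As a rough check, the sketch below compares that construction with PHP's built-in hash_hmac() (available from PHP 5.1.2). The function name manual_hmac_sha1, the key, and the sample data are all made up for illustration, and the hand-rolled form only holds for keys of 64 bytes or fewer.

```php
<?php
// Same construction as sign(): zero-pad the key to SHA-1's 64-byte block size,
// XOR with the inner pad (0x36) and outer pad (0x5c), and hash twice.
function manual_hmac_sha1($key, $data) {
    $padded = str_pad($key, 64, chr(0x00));
    $inner  = pack('H*', sha1(($padded ^ str_repeat(chr(0x36), 64)) . $data));
    return pack('H*', sha1(($padded ^ str_repeat(chr(0x5c), 64)) . $inner));
}

$key  = 'example-secret-key';                  // placeholder, not a real AWS secret
$data = 'ActionListDomainsVersion2007-11-07';  // sample key/value concatenation

$manual  = base64_encode(manual_hmac_sha1($key, $data));
$builtin = base64_encode(hash_hmac('sha1', $data, $key, true)); // true = raw binary output

var_dump($manual === $builtin);
```

Where hash_hmac() is available it is the shorter option; the manual version is only needed on older PHP builds.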

JSON and POST in PHP

As I’ve been trying to do Lab 2 without having to modify my AMI or change my Apache configuration, I’ve found some nice helpers.

First, encoding and decoding JSON in PHP 5.2 is easy… you just use the built-in functions json_encode() and json_decode(). However, my Fedora AMI is only running PHP 5.03. So, how do I use JSON without recompiling my PHP installation, or downloading 5 billion files? Michal Migurski created a php-json library that is now a part of PEAR, but he still has a copy of his original encoder/decoder at http://mike.teczno.com/JSON/JSON.phps. It’s licensed BSD-style so have at it.
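
For reference, here is a minimal round trip with the built-in functions on PHP 5.2 or later. The $record array is made-up sample data, not something from the lab.

```php
<?php
// Made-up sample data for illustration.
$record = array('name' => 'lab2', 'tags' => array('aws', 'php'), 'count' => 3);

// Encode the array as a JSON string.
$json = json_encode($record);

// Passing true as the second argument returns associative arrays
// instead of stdClass objects.
$decoded = json_decode($json, true);

echo $json, "\n";
```

On older PHP versions these two functions do not exist, which is where a standalone encoder/decoder comes in.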

Next, I wanted to send HTTP requests by POST, including file uploads, again without downloading 5 billion files or messing with my AMI. My solution was to actually learn the ‘application/x-www-form-urlencoded’ format and the ‘multipart/form-data’ format and send the HTTP request across a socket.

A resource that helped me with the ‘application/x-www-form-urlencoded’ format is on www.wellho.net. For the ‘multipart/form-data’ format http://chxo.com/be2/20050724_93bf.html was very helpful. One gotcha to remember, though: PHP heredoc strings take their line endings from the source file, which usually means \n. While this may not cause any problems, HTTP specifies \r\n line endings, so to be safe and consistent you should use \r\n.
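
One way to deal with the heredoc gotcha is to normalise the line endings before writing to the socket. This is only a sketch: to_crlf is a hypothetical helper of my own naming, and the boundary and field name are placeholders.

```php
<?php
// Hypothetical helper: convert every line ending in $text to CRLF.
// Existing CRLFs are collapsed to LF first so "\r\r\n" is never produced.
function to_crlf($text) {
    return str_replace("\n", "\r\n", str_replace("\r\n", "\n", $text));
}

$boundary = 'example-boundary'; // placeholder boundary string

// The heredoc inherits whatever line endings this source file uses.
$part = <<<EOT
--$boundary
Content-Disposition: form-data; name="title"

hello
EOT;

// Normalise the text part and terminate it with a CRLF.
$part = to_crlf($part) . "\r\n";
```

Only run text parts through a helper like this. Binary file contents should be appended untouched, or the upload will be corrupted.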

Putting the two together into one function gave me the following:

function http_post($host, $path, $data_hash, $file = '', $file_param_name = '') {
	$boundary = md5(uniqid());
	if ($file && $file_param_name) {
		$binary = file_get_contents($file['tmp_name']);

		$content_type = "multipart/form-data; boundary=$boundary";

		$items = array();
		foreach (array_keys($data_hash) as $key) {
			array_push($items, "--$boundary\r\nContent-Disposition: form-data; name=\"$key\"\r\n\r\n{$data_hash[$key]}\r\n");
		}
		array_push($items, "--$boundary\r\nContent-Disposition: form-data; name=\"$file_param_name\"; filename=\"{$file['name']}\"\r\n");
		array_push($items, "Content-Type: {$file['type']}\r\nContent-Transfer-Encoding: binary\r\n\r\n$binary\r\n--$boundary--\r\n");
		$data = implode('', $items);
	} else {
		$content_type = 'application/x-www-form-urlencoded; charset=UTF-8';

		$items = array();
		foreach (array_keys($data_hash) as $key) {
			array_push($items, urlencode($key) . '=' . urlencode($data_hash[$key]));
		}
		$data = implode('&', $items);
	}

	$content_length = strlen($data);
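	$fp = fsockopen($host, 80);
	if (!$fp) {
		return false; // could not connect
	}
	// HTTP/1.0 keeps the server from answering with a chunked body,
	// which this function does not decode.
	fputs($fp, "POST $path HTTP/1.0\r\n");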
	fputs($fp, "Host: $host\r\n");
	fputs($fp, "Content-Type: $content_type\r\n");
	fputs($fp, "Content-Length: $content_length\r\n");
	fputs($fp, "Connection: close\r\n\r\n");
	fputs($fp, $data, $content_length);

	$http_response = stream_get_contents($fp);
	fclose($fp);

	list($headers, $body) = explode("\r\n\r\n", $http_response, 2);
	return $body;
}

Note that the $file parameter would be $_FILES['your-form-input-name'], and $file_param_name would be ‘your-form-input-name’. $data_hash is an associative array of key => value pairs to send. The upload file would not appear in $data_hash.

Distributed System Design

This is my last semester at BYU, and I am taking Distributed System Design. As I go through the process of creating (with my classmates) a distributed web application, I will be logging my progress and experiences here.

P.S. – Happy Civil Rights Day! I’m using this holiday to catch up on my homework.
