Charles Engelke’s Blog

May 29, 2012

Fluentconf workshop: Backbone.js Basics and Beyond

Filed under: Uncategorized — Charles Engelke @ 7:29 pm

Unlike my first workshop today, my second workshop at FluentConf covers a subject completely new to me:  Backbone.js. I’ve heard a lot about it, but never even downloaded it. Looking forward to learning a lot.

“Backbone thinks of itself as being lightweight.” It isn’t opinionated like Ruby on Rails, so Backbone projects can do the same things in very different ways. She’s going to show her ideas of the best way, but our ideas may vary.

Backbone is not MVC, even though parts of it have the same names as in server-side MVC frameworks (Models and Views). Backbone adds Templates to those two, not controllers.

The speaker came to JavaScript through Rails. At the time that meant that Rails wrote her JavaScript; she didn’t have to. Now she feels that is kind of like using scaffolding – a shortcut that won’t carry you far enough. Next, she used jQuery extensively. That’s powerful, but can be messy and hard to test other than with something like Selenium. Phase 3 was page objects. Create a unit testable object that has the JavaScript for the page. That seems to describe how she uses Backbone.

Backbone gives you model mirroring and views that handle events (and can render the DOM). Models in Backbone are like MVC models and may mirror server-side ones (or something like them rather than one-to-one). Server-side views correspond to Backbone Templates. Server-side controllers correspond to Backbone Views.

The talk covers various tasks you need to perform, and how to do them with Backbone, ending with how it all fits together. I wish that had come first. Maybe it’s me, but I need the overall context to be comfortable with the pieces. Basically, set it all up by creating an app object with an initialize method that you call when your document is ready. That can set up the model, fetch the data, and use a view to render it.
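To make that concrete for myself, here’s my own plain-JavaScript sketch of that shape. These names (App, the inline model and view) are mine, not from the talk, and a real Backbone app would extend Backbone.Model and Backbone.View instead of using plain objects:

```javascript
// Sketch of the "app object with an initialize method" pattern.
// Plain objects stand in for real Backbone models and views.
var App = {
  initialize: function () {
    var self = this;
    // Stand-in for a Backbone model: holds data and knows how to fetch it
    var model = {
      data: null,
      fetch: function (done) {
        this.data = { title: "Example" }; // a real model would hit the server
        done();
      }
    };
    // Stand-in for a Backbone view: renders the model (here, to a string)
    var view = {
      render: function (m) {
        return "<h1>" + m.data.title + "</h1>";
      }
    };
    model.fetch(function () {
      self.rendered = view.render(model);
    });
  }
};

// In the real app you'd kick this off when the document is ready:
// $(document).ready(function () { App.initialize(); });
```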

Testing? Pivotal uses Jasmine, and there’s a talk about it tomorrow at 1:45.

Backbone is really good at interacting with a RESTful API, living in harmony with other frameworks and styles of JavaScript, and handling unique applications (due to its flexibility). On the other hand, it doesn’t have UI widgets, and it’s not good for developers who aren’t already strong in JavaScript (because it doesn’t give enough direction to them).

The talk is over very early. And all in all, I’m disappointed. I go to a half day workshop expecting to come away ready to actually create something with my new knowledge, not just get a survey of the topic. I could have learned as much about Backbone in a 30 minute talk as in this workshop.

Fluentconf workshop: Breaking HTML5 Limits on Mobile JavaScript

Filed under: Uncategorized — Charles Engelke @ 3:28 pm

O’Reilly’s Fluent Conference starts today with optional workshops. My morning selection is on JavaScript on mobile platforms, given by Maximiliano Firtman of ITMaster Professional Training. This post is just a stream-of-consciousness list of points I want to remember, rather than real notes for the talk.

In his introduction, he points to a resource on available APIs: www.mobilehtml5.org.

Mobile web development is different:

  • Slower networks
  • Different browsing (touch versus mouse, pinch to zoom, pop-up keyboard, etc.)
  • Different behavior (only current tab is running, file uploads and downloads)
  • Some browsers are proxy based (Kindle Fire, Opera Mini)
  • Too many browsers (more than 40), some too limited, some too innovative, mostly without documentation, mostly unnamed, most without debugging tools
  • Four big rendering engines, five big execution engines

Check gs.statcounter.com for browser market shares. Mobile usage is much more evenly distributed among the top browsers.

Web views embed an HTML window in a native app. On iOS, web views have a different execution engine than the browser (2.5 times slower!). They often have differences in how they support HTML5 APIs.

Pseudo-browsers (his term) are native apps with a web view inside. With these you don’t get a new rendering engine or execution engine; you just get new behaviors added by the native shell wrapped around the web view. (Yahoo Axis, for example.)

(Note to me: he’s using an IPEVO Point 2 View (P2V) camera to show a mobile phone on screen.)

PhoneGap and similar tools let you build apps with web technologies, but the result is a native app.

Remote debugging is available for some browsers via Remote Web Inspector. Adobe Shadow is a new debugging tool that’s free (at least for now). Weinre can work with Chrome, making pages on an iPhone remotely debuggable. Looks pretty interesting.

The paper “Who Killed My Battery” from WWW2012 shows how different web sites consume power from your device’s battery. For example, 17% of the energy used to view Amazon’s web site goes to parsing JavaScript that is never used.

The speaker has a neat development tool he calls Chevron for working inside the browser, available at firt.mobi/material/FluentConf.zip. It has an in-browser code editor, and can save on-line to a unique URL. It will display a QR-code for that URL, so you can see what you’re developing on your mobile device as well as the built-in browser window. Very nice.

A service at www.blaze.io/mobile will run your public web page on a real device of your choice, and give you performance metrics on it.

You can build a real app (even offline) in the browser with HTML5, but it doesn’t look native on a mobile device. But (for Apple and maybe others) you can get it a lot closer with some meta tags:

<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black">
<link rel="apple-touch-startup-image" href="launch.png">

A lot of the second half of this talk is more on HTML5 in general (when it works in mobile browsers, too) than specific mobile issues. Most of the audience is finding this very useful, but it’s not new to me. Unfortunately, it doesn’t seem that he’s going to get to the Device Interaction part of his demonstrations, which I would really like to see. I can always fiddle with them myself later, I guess. But he’s a good speaker and I’d like to hear him talk about them.

You can use the orientationchange event (onorientationchange property) to run code when the device moves between portrait and landscape views. You can also check for going on and off-line with the online event (though this is not generally reliable).
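A quick sketch of the orientation part (orientationLabel is a helper name I made up; window.orientation reports the rotation in degrees):

```javascript
// Map window.orientation degrees to a readable label:
// 0 or 180 means portrait, 90 or -90 means landscape.
function orientationLabel(degrees) {
  return (degrees === 90 || degrees === -90) ? "landscape" : "portrait";
}

// Attach the handler in a browser (guarded so the snippet stands alone)
if (typeof window !== "undefined") {
  window.addEventListener("orientationchange", function () {
    console.log("Now in " + orientationLabel(window.orientation) + " mode");
  });
}
```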

Ah, he’s getting to Device Interaction! Geolocation first, which is neat but has been available for a while. But then a lot of really new capabilities, some of which only run on one or two browsers now. I need to start using Firefox on my Android phone.

A very useful talk and good kickoff to the conference for me.

March 10, 2012

Dehydration and popping ears

Filed under: Uncategorized — Charles Engelke @ 10:52 pm

A few days ago I was taking a tour of Corcovado National Park in Costa Rica and I noticed that my hearing was muffled. I don’t hear that well normally, but this felt like I was wearing earplugs. That sometimes happens when I fly, and I just yawn to make my ears pop and the problem goes away.

That didn’t work this time. My ears stayed muffled through the whole day out. But after the tour and the long boat ride back, they finally popped and my hearing came back to normal during the drive back to my hotel.

What happened? I was severely dehydrated during the tour (the tour operator did not bring nearly enough water for such a hot day) and I picked up several liters of bottled water on the drive home. Just as I was finishing the first liter, my ears popped and my hearing came back. I searched for information on this condition and found a lot of pages saying that heavy exercise can cause it and cooling down will make your ears pop again. But I had several hours sitting on the boat after the exercise and my ears did not pop until I got a lot of water in me. I’m certain that this was caused by dehydration.

I had something similar happen at work a few months ago, and I now think it was also due to dehydration. My doctor had me nearly eliminate caffeinated and carbonated drinks and I hadn’t yet got used to making up for it with a lot more water. Looking back, I think the dehydration affected my Eustachian tubes. Clearly, I have to pay more attention to getting enough to drink.

February 28, 2012

mod_perl Problems

Filed under: Uncategorized — Charles Engelke @ 2:04 pm

I’ve just spent days trying to get mod_perl to work with Perl 5.12 or later, and it’s finally there on both Windows and Linux. I may post more detailed notes, but before I forget, here’s an important note to me.

The mod_perl.so file I needed for ActiveState Perl 5.12 and Apache 2.2 can be downloaded from cpan.uwinnipeg.ca/PPMPackages/12xx/x86/. Specifically, cpan.uwinnipeg.ca/PPMPackages/12xx/x86/mod_perl.so .

I found a bunch of other downloadable binary versions of this file, but none of them worked with my 32-bit Windows Apache and ActiveState Perl 5.12. This one did.

I haven’t found any that work with Perl 5.14 for Windows.

January 8, 2012

Chrome Web App Bookshelf – Part 7 of 7

Filed under: Uncategorized — Charles Engelke @ 1:36 pm

Note: this is part 7 (the final part) of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent, part 2 added basic functionality, part 3 finally called an Amazon web service, part 4 parsed the web service result, part 5 was useful enough to publish, and part 6 covered publishing it at my own website. This post finishes the series by publishing in the Chrome Web Store instead.

I’m almost done with getting my app out into the world now. I just have to put in the Chrome Web Store. Once that’s done I intend to update it with new features from time to time, but probably won’t post about that in any detail. Instead, I’ve put this project on Github. If you’re interested, you can follow its development there.

Following my practice to date I haven’t bothered to read any of the documentation about the store. Instead I just looked for information on how to develop for it, starting with the Settings icon in the upper right corner of the page:

Settings icon in Chrome Web Store

When I clicked on the gear icon the drop-down menu showed a choice for Developer Dashboard, so I chose that. The resulting page looks like it’s going to guide me through the process pretty easily. There’s a link to “Start uploading your apps now!”. Seems promising…

Developer dashboard section to Upload your app

It sure looks easy. I’m not supposed to upload the CRX file, just a ZIP of the directory. I just posted such a file for my previous entry. I’m a bit worried because I put in an auto-update URL that doesn’t make sense for the app store, so I’m going to remove both the homepage and update URLs from the manifest before uploading to see what happens. I’m also going to increment the version number to reduce my own confusion.
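For what it’s worth, the manifest I uploaded would then look roughly like this — the same manifest shown in part 6, minus the homepage_url and update_url entries, with a guessed version bump:

```json
{
   "name": "Books to Buy",
   "description": "Keep a list of books to buy on Amazon, with their price and availability",
   "version": "0.0.0.3",
   "app": {
      "launch": {
         "local_path": "main.html"
      }
   },
   "icons": {
      "16":    "icon_16.png",
      "128":   "icon_128.png"
   },
   "permissions": [
      "https://webservices.amazon.com/*"
   ]
}
```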

When I uploaded the resulting ZIP file I got a page showing how the web store sees it, starting with the icon:

App summary with placeholder icon

Where’s my icon? A bit further down the page I’m offered the chance to upload another icon, saying it should be 96 pixels square. But instead I just uploaded my 128 by 128 icon again, and it took it and it looks good:

Chrome store summary showing my icon

Going down the page, I’m asked for a detailed description, so I filled it in as follows:

Want to keep track of books that haven’t been published yet, so you can decide whether to buy them when they’re ready? This app allows you to add books by ISBN or Amazon’s ASIN (including Kindle books) and keep a list showing their scheduled release date and shipping status. When you’re ready to buy, just follow the link from the app.

This app uses Amazon’s Product Advertising API, so you will need an Amazon Web Services account to use it. Accounts are free to register, and this particular API incurs no charges.

Next, I’m asked for a screen shot and at least one “promotional image”. Huh. Okay, I’ll make them up. My promotional image is just a large version of the icon on a dark background. After I filled in the rest of the form as best I could, I saved the draft, returning to the dashboard:

Dashboard showing current status with publish link

Okay, let’s try it. I pressed Publish. And got a confirmation:

Publish confirmation image

Okay, let’s try it. And… well, I was kind of expecting this:

Pay $5 now

I need to pay this once to register. That’s not much, so I went ahead and paid it through Google. I then had to click Publish again, and the listing now shows an option for Unpublish. I guess I’m done. When I click the link showing for the item, it looks like I’ve got it in the store!

My app in the Chrome Web Store

And when I click to install it, it first shows me the permissions it requires:

Install confirmation

And it installed it just fine. I’ve got two nearly identical apps showing now, the previously packaged one and the new one from the store. Chrome thinks they are different because they have different unique IDs. They don’t share storage either. I’ll leave the old app there for a while, but don’t intend to update it.

Will this make the app sync to each of my browsers? It hasn’t as of this writing, but I’ll give it some time. Enhancements to this app will be posted on my Github repository, and published in the Chrome Web Store for anyone that wants to keep following it.

[Update a few hours later: the extension did finally sync to my other computers! The data did not, which I expected. I can either do that with some other facility (perhaps AWS SimpleDB) or wait until Chrome adds an API for that, which I think is in the works.]

January 7, 2012

Chrome Web App Bookshelf – Part 6

Filed under: Uncategorized — Charles Engelke @ 3:52 pm

Note: this is part 6 of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent, part 2 added basic functionality, part 3 finally called an Amazon web service, part 4 parsed the web service result, and part 5 actually was somewhat useful. The series is almost done now. This post will cover packaging and privately publishing the app, and the next and final post will cover putting it in the Chrome Web Store.

There’s more functionality I want to add to the app eventually (updating the data on saved books, deleting books from the list, and even synchronizing the list of books between PCs) but the purpose of this series is to show how to create and publish Chrome web apps. My app is just barely functional enough now to publish, so I’m going to go ahead and do that.

Since my app contains software and images from others I’m going to have to add some acknowledgment of that fact. I want to be sure I’m complying with the license conditions when I distribute those pieces, and I should also specify whatever the license conditions are of my app. So I added a file called LICENSE to my project directory, spelling out my license terms. You can see the current version of that file here. As you can see, I chose the MIT license for my app because I feel that’s one that least encumbers the users.

One licensing issue I encountered was that the Stanford JavaScript Crypto Library includes patented code, and the conditions of its use apparently require either purchasing a license to the patent or using only the GPL, at least in the United States. I’m not a lawyer so I might not be understanding this clearly, but I don’t want to violate the terms or intent of that patent holder. Other than that issue the library can be licensed under the BSD license, which seems to be compatible with the MIT license I chose. That library includes tools to build subsets of the whole thing, so that’s what I did. The patented code is used in cipher algorithms, so I built a library without ciphers (in fact, it has the minimum functionality that I need) and am including only that. I believe that means I’m fine distributing it as part of my MIT licensed app.

I also added copyright notices to the files I created: main.html, main.js, aws.js, and main.css. I don’t think that the manifest file is actually a creative work, so didn’t put any copyright notice in it. I don’t even know where it could have gone had I wanted to add one.

I know a lot of people think that explicitly specifying copyright and license conditions isn’t really necessary unless you want to restrict use of your work, but as someone who builds commercial software for a living I can tell you that it’s very important even if you don’t want to restrict that use. Without a clear indication of the conditions, nobody would risk reusing what you created in any professional or commercial endeavor.

Okay, the app has been slightly polished up, is (just barely) functional enough to actually use, and has clear claims and acknowledgment of ownership and licensing. I’m ready to publish it. But how?

I started by recognizing that once I publish it I will want to update it at times. Every update should have a higher version number. So far I’ve left that number as “1” in the manifest, but for publishing I’ll take advantage of the fact that Chrome allows that version number to be up to four integers, separated by periods, and start over at version “0.0.0.1”. With that done, I can package the app using any running copy of the Chrome web browser.

First, open the Extensions panel by choosing Tools/Extensions from the drop-down menu you get when you click the wrench icon in the upper right hand corner of the browser. When I do that, I see the unpacked version of the app I have been working on:

Extensions panel showing the unpacked app

When I click the Pack extension… button I get the following dialog box:

Pack Extension... dialog box

I enter the directory I have been working in, leave the Private key file field empty, and click Pack Extension. Chrome tells me what it has done now:

Message shown after packing

It created a Chrome extension file (ending in .crx) and a new key file for me to use (ending in .pem), and told me where they are. Next I opened the file in Chrome by dragging and dropping on to the browser, and was asked whether to install it. After I said yes, the Extensions panel showed the app twice:

Extensions panel showing two versions

The top one is the unpacked app I’ve been working on, and the lower one is the actual packaged application. I then removed the unpacked version and tried out the packed one. It worked!

But it’s still not really ready. If I create a new version there’s no way that Chrome will know about it. If I host the app on a web site I can configure the manifest so that Chrome will regularly check for updates and install them if they exist. But to make that happen I have to also create an XML file describing the current version of the app. I guess Chrome doesn’t want to have to download a whole app just to see if it’s updated, and prefers a small XML file for that. I have to put the URL for that XML file in the manifest. While I was at it, I also added the URL of a home page for more information about the app and incremented the version number. The manifest file now looked like:

{
   "name": "Books to Buy",
   "description": "Keep a list of books to buy on Amazon, with their price and availability",
   "version": "0.0.0.2",
   "app": {
      "launch": {
         "local_path": "main.html"
      }
   },
   "icons": {
      "16":    "icon_16.png",
      "128":   "icon_128.png"
   },
   "homepage_url": "http://www.bibliote.ch/",
   "permissions": [
      "https://webservices.amazon.com/*"
   ],
   "update_url": "http://www.bibliote.ch/files/bookshelf.xml"
}

The homepage_url and update_url entries are new. Of course, since I’m referring to an XML file at a particular URL I’d better create that file and host it at the URL. The format of the XML file is pretty simple; I just copied an example and replaced values with the right ones for my app:

<?xml version='1.0' encoding='UTF-8'?>
<gupdate xmlns='http://www.google.com/update2/response' protocol='2.0'>
  <app appid='mpcejinifahkdfhnfimbcckdllahpbmg'>
    <updatecheck codebase='http://www.bibliote.ch/files/bookshelf.crx' version='0.0.0.2' />
  </app>
</gupdate>

I had to fill in the correct codebase value with the URL I am hosting the app file itself at, the version value with the current version number, and appid with the correct value.

appid? What’s that?

Chrome assigns a unique ID to every application it creates. That’s the value needed here. To see it I just looked at the Extensions panel and clicked the gray arrow to the right of my application’s icon, then I cut and pasted it to here.

After I rebuilt the extension with the new manifest (I had to fill in the Private key file name this time, matching the one created the first time I packaged the app) I uploaded the XML and CRX files to the URLs shown in the manifest and XML file. I also set the Content-type of the CRX file to application/x-chrome-extension, though that’s probably not needed given that the file name ends in .crx. I uninstalled my app to get a clean environment and visited www.bibliote.ch/files/bookshelf.crx with Chrome. I was asked whether to install the app. When I agreed, it installed and worked fine.

So, does auto-updating work? I changed the version numbers in the manifest and XML file, packaged the new version, and uploaded the new XML and CRX files. And, a little later, the new version showed up in my browser. Success!

If you want to try this yourself, the entire application directory I used for this is available here. You’ll have to create your own XML file by copying the one above and changing the values as needed.

I could stop here but there is one Chrome app feature I want and don’t yet have: synchronizing extensions across different browsers. From my reading of the documentation that should be working now, but it isn’t. Either applications like mine are treated specially and not synchronized, or else you need to publish in the Chrome Web Store to get this functionality. I’d like to explore how that store works anyway, so I’ll try it out next time and see if synchronization starts working.

January 1, 2012

Chrome Web App Bookshelf – Part 5

Filed under: Uncategorized — Charles Engelke @ 6:21 pm

Note: this is part 5 of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent, part 2 added basic functionality, part 3 finally called an Amazon web service, and part 4 parsed the web service result. This part will actually reach the point of usefulness!

The app so far isn’t really useful because it doesn’t remember books to buy later. And that was the whole point. So instead of just displaying the response I’m going to save it persistently using the localStorage capability of HTML5. That’s simply an object (a property of the global window object) that retains its value even when a web page (or the browser as a whole) is closed. Each web site has its own localStorage area. There’s an API for it and it also can work pretty much like a regular JavaScript object. So I’m going to keep all the response data in it.

localStorage does have some pretty severe restrictions. The biggest one is that it can only store strings. At one time it was supposed to be able to store any object that could be serialized into a string, but as of now Chrome seems to only want strings there. So I’m going to have to serialize and deserialize all my objects to store them.
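So every object goes through JSON on the way in and out. A sketch of what I mean (saveObject and loadObject are names I just made up; the storage argument would be window.localStorage in the browser):

```javascript
// localStorage only reliably holds strings, so serialize on save
// and deserialize on load.
function saveObject(storage, key, value) {
  storage[key] = JSON.stringify(value);
}

function loadObject(storage, key) {
  return JSON.parse(storage[key]);
}

// In the browser:
// saveObject(localStorage, "book", { title: "Some Title", price: 999 });
// var book = loadObject(localStorage, "book");
```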

localStorage gives me a solution for my other problem, too. I can’t distribute the app with my Amazon Web Services credentials in it, but I can save the proper credentials persistently and let each user save his or her own values. So the app will have to have two faces now. One is the main app I’ve been working on, and the other is a simple one to use to enter the credentials. If there are no credentials in localStorage I’ll show the settings screen, otherwise I’ll show the main app.

So the main.html page now needs a body with two divs, one for each situation:

   <div id="settings">
   </div>
   <div id="application">
   </div>

It also needs to add a link to a stylesheet in the head of the document, as follows:

   <link rel="stylesheet" type="text/css" href="main.css" />

And, of course, it needs a stylesheet. It will start out very simple, just a rule to hide both divs. The JavaScript will show the proper div once it examines localStorage:

#settings, #application {
   display: none;
}

So, what about the main.js program? It starts out by again waiting for the document to be ready, declaring “global” variables, and setting up event handlers for buttons, but then it immediately checks localStorage to see whether to show the settings page or not:

   if (localStorage.accessKeyId && localStorage.secretAccessKey) {
      showApplication();
   } else {
      showSettings();
   }

The showSettings function is very simple:

   function showSettings(){
      $("#settings").show();
   }

All that is in that div are a couple of labeled data entry fields and a button to save the values entered into them. When that button is clicked, the handler just saves them in localStorage and starts the application:

   function saveSettings(){
      localStorage.accessKeyId = $("#accesskeyid").attr("value");
      localStorage.secretAccessKey = $("#secretaccesskey").attr("value");
      $("#settings").hide();
      showApplication();
   }

And showApplication just shows the proper div and creates an AWS object. The rest of its functionality happens when the button is clicked to add a new book to the list.

   function showApplication(){
      aws = new AWS(localStorage.accessKeyId, localStorage.secretAccessKey, "engelkecom-20");
      $("#application").show();
   }

That “engelkecom-20” is my Amazon Associates ID. I’ll hard-code it so that any URLs created by the application include it. That way, should anyone ever use this app, I have a chance to make some commissions from Amazon sales.

The remaining big difference is that the result from looking up an item at Amazon will be saved in localStorage, and the results div will be replaced by a results unordered list. When the app first starts up and each time a book is looked up that list will be redrawn. This is accomplished by first replacing the reference to insertResponse in the aws.itemLookup function call with a reference to a new function called saveResponse:

      function saveResponse(response){
         localStorage.setItem("asin_"+response.asin, JSON.stringify(response));
         displayBookList();
      }

I’m naming each item with the prefix asin_ followed by Amazon’s ASIN for the item. That serves two purposes. It lets me look up a book directly by Amazon’s unique ID when I need to, and it lets me recognize which fields in localStorage hold book data and which don’t. Since I can only reliably store strings, I use the built-in JSON.stringify function to convert the object to a string without losing any information.

The displayBookList function will also be called at the end of the new showApplication function. In each case, it clears out all the items in the results unordered list and adds all the saved items back to it:

   function displayBookList(){
      var books = [];
      var i, key;

      $("#results li").remove();

      for(i=0; i<localStorage.length; i++) {
         key = localStorage.key(i);
         if (key.substr(0,5)=="asin_") {
            books.push(JSON.parse(localStorage.getItem(key)));
         }
      }

      books.sort(byReleaseDate).forEach(function(book){
         insertBook(book);
      });
   }

This first removes all the list items in the results list, then builds an array of all the books found in localStorage, and then calls insertBook to insert each book in the (by now sorted) array into the results list. insertBook is pretty much the same as the old insertResponse function shown before, with just a bit different HTML markup in it, so I’m not going to include it here. Note the use of JSON.parse to convert the string back to a JavaScript object.
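The byReleaseDate comparator isn’t shown here either; a plausible sketch, assuming each book’s releaseDate is Amazon’s PublicationDate string in YYYY-MM-DD form (in which case plain string comparison sorts chronologically):

```javascript
// Compare two books by release date. PublicationDate strings like
// "2012-03-15" sort chronologically under ordinary string comparison.
function byReleaseDate(a, b) {
  if (a.releaseDate < b.releaseDate) { return -1; }
  if (a.releaseDate > b.releaseDate) { return 1; }
  return 0;
}
```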

Does it all work? Let’s see. When first launched it should show the settings panel, and it does:

Settings panel

After credentials have been saved, it should show pretty much the old application:

App screen with no books

And, once a few books have been added, it should show a list of them:

App showing a list of books

And even if the browser is closed and later re-opened, it still shows the list of books.

This was my core goal for this app. It’s ugly in a lot of ways (not just appearance; the code could use a lot of cleaning up, too). But I’ll see to that later. Now it’s time to move on and look at how to distribute the app. First, I’ll package it and try that out. Then I’ll try hosting it at a known address. Finally, I’ll put it in the Chrome Web Store. I’ll also put up a zip file of all the code created so far, so anybody who likes can try things out.

All of that will start in my next post.

December 28, 2011

Chrome Web App Bookshelf – Part 4

Filed under: Uncategorized — Charles Engelke @ 4:25 pm

Note: this is part 4 of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent, part 2 added basic functionality, and part 3 finally called an Amazon web service. This part will finally parse the web service result and refine the call.

When I got to the end of the last post, the app was just dumping a lot of XML, formatted as text, into the web page. I want to pull just the data fields I’m looking for, format them, and put that in the page instead. Those fields are: author, title, release date, availability, list price, Amazon’s price, and a link to the Amazon page. So I copied the text from the page to a file and viewed it in a program that showed the XML as an outline. It’s way too big to show here, but the structure of the response looked like this:

  • ItemLookup
    • Operation Request
    • Items
      • Request
      • Item
        • ASIN
        • DetailPageURL
        • ItemAttributes

It’s that Item element that actually has the response in it, with fields inside ItemAttributes containing most of the information I want. I see Author, Title, PublicationDate, and ListPrice inside of ItemAttributes, and DetailPageURL has the link I need. But I’m missing the availability and Amazon’s price. So back to the ItemLookup function’s documentation, which I’m a bit more ready to understand now. My request can include a specification of one or more ResponseGroup values. The default is just ItemAttributes, which is what I got in this sample response. But other groups might have the two fields I’m missing. After browsing through the documentation, I see that Offers includes both the Amount and Availability, so I’ll add that to my request. Doing so is easy: just change the line that specifies the requested ResponseGroup to:

      params.push({name: "ResponseGroup", value: "ItemAttributes,Offers"});

The resulting XML has what I need, structured as follows:

  • ItemLookup
    • Operation Request
    • Items
      • Request
      • Item
        • ASIN
        • DetailPageURL
        • ItemAttributes
        • OfferSummary
        • Offers
          • Offer
            • OfferListing

The extra information I’m looking for is in the OfferListing element, which contains Price and Availability fields. The Price field (like the ListPrice field mentioned earlier) is itself complex, containing an Amount (an integer, apparently equal to the whole number of cents), a CurrencyCode (USD in my example), and a FormattedPrice. I’m going to go with the FormattedPrice field for now, but I may want to change my mind later.
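Roughly, the shape of that part of the response (structure as described above; the values here are made up for illustration):

```xml
<OfferListing>
  <Price>
    <Amount>1499</Amount>
    <CurrencyCode>USD</CurrencyCode>
    <FormattedPrice>$14.99</FormattedPrice>
  </Price>
  <Availability>Not yet published</Availability>
</OfferListing>
```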

Okay, this XML has the data I want, how do I get to it? This is the job of extractAndReturnResult, which currently looks like:

      function extractAndReturnResult(data, status, xhr){
         onSuccess(xhr.responseText);
      }

I’m going to put a breakpoint in this function and examine the data, status, and xhr objects that are returned when I run the code. The status object is just a string with the word “success” in it, but data and xhr are much more complex. data is a Document that represents the DOM of the returned XML. xhr has many fields in it, one of which, responseXML, is also a Document. In fact, the debugger tells me that it is the exact same object as data. I can traverse this DOM to get the elements I want. There are native JavaScript ways to do this, but since I’ve already started using jQuery to traverse the web page’s DOM, I’m going to continue to use it to traverse this one.

For example, I can get the element ASIN by searching for element ASIN within Item within Items within the document, using:

         $(data).find("Items Item ASIN")

Actually, though, that will return a jQuery object containing every matching element, and it might be empty. To keep it simple at this stage I’m just going to assume that there is at least one match, and will take the first one as my result. Then I’ll find the text inside that element. That ends up with:

         var asin = $(data).find("Items Item ASIN")[0].textContent;
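When I harden this later, a more defensive version could check for an empty match before indexing. A sketch of that idea; textOrDefault is a helper name I invented for illustration:

```javascript
// Given an array-like set of matched elements (such as a jQuery result),
// return the first element's text, or a fallback if nothing matched or
// the element has no text.
function textOrDefault(elements, fallback) {
   if (elements.length > 0 && elements[0].textContent) {
      return elements[0].textContent;
   }
   return fallback;
}

// Usage with jQuery would look something like:
//    var asin = textOrDefault($(data).find("Items Item ASIN"), "unknown");
```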

I’ll change the overall behavior of extractAndReturnResult to pass an object to the success handler instead of just a string, ending up with:

      function extractAndReturnResult(data, status, xhr){
         var result = {
            asin:          $(data).find("Items Item ASIN")[0].textContent,
            author:        $(data).find("Items Item ItemAttributes Author")[0].textContent,
            title:         $(data).find("Items Item ItemAttributes Title")[0].textContent,
            releaseDate:   $(data).find("Items Item ItemAttributes PublicationDate")[0].textContent,
            listPrice:     $(data).find("Items Item ItemAttributes ListPrice FormattedPrice")[0].textContent,
            availability:  $(data).find("Items Item Offers Offer OfferListing Availability")[0].textContent,
            amazonPrice:   $(data).find("Items Item Offers Offer OfferListing Price FormattedPrice")[0].textContent,
            url:           $(data).find("Items Item DetailPageURL")[0].textContent
         };
         onSuccess(result);
      }

Now I have to change the function that gets this result and puts it into the web page, since it’s no longer getting a string. That function used to be an inline function in the main.js file:

                     function(message){
                        message = message.replace(/&/g, "&amp;");
                        message = message.replace(/</g, "&lt;");
                        message = message.replace(/>/g, "&gt;");
                        $("#results").append(message);
                     },

I’m going to change this to refer to a named function called insertResponse, and define that function, shown below:

      function insertResponse(response){
      var html = '<a href="' + response.url + '">';
         html = html + response.title + '</a> by ' + response.author;
         html = html + ' lists for ' + response.listPrice;
         html = html + ' but sells for ' + response.amazonPrice;
         html = html + '. It was released on ' + response.releaseDate;
         html = html + ' with availability ' + response.availability;
         html = html + '.';
         $("#results").append(html);
      }

It’s verbose, but shows the information. When I look up the same book now, I get a more usable response than before:

JavaScript: The Definitive Guide: Activate Your Web Pages (Definitive Guides) by David Flanagan lists for $49.99 but sells for $31.49. It was released on 2011-05-10 with availability Usually ships in 24 hours.

That’s a good stopping point. There’s still a lot to do before this is a releasable web app. At the very least, I need to check for empty responses and escape any special characters in the data I display. I also want to maintain a list of books, not just look up a single book, and have that list persist between different invocations of this program. So there’s plenty more to come.
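As a preview of that escaping work, here is a minimal sketch of an HTML-escaping helper (escapeHtml is a name I made up; the current main.js code does the same replacements inline):

```javascript
// Replace HTML special characters with character entities so data from
// AWS is displayed as text rather than interpreted as markup.
function escapeHtml(text) {
   return text.replace(/&/g, "&amp;")
              .replace(/</g, "&lt;")
              .replace(/>/g, "&gt;")
              .replace(/"/g, "&quot;");
}
```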

December 11, 2011

Chrome Web App Bookshelf – Part 3

Filed under: Uncategorized — Charles Engelke @ 11:21 am

Note: this is part 3 of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent, and part 2 added basic functionality. This part will finally call an Amazon Web Service via JavaScript.

I’ve been approaching this project from the top down so far, starting with creating a nearly empty shell as a Chrome app, then putting in the necessary logic to make it perform a minimal function. The next thing to add is actually calling the Amazon Web Service that looks up an ISBN and returns information about the product. For that, I’m going to switch to a more bottom-up point of view, focusing at first on just that web service call.

I’m going to put the JavaScript for talking to AWS into a separate file called aws.js. That file needs to be loaded into the web page before any file that references it, and after any file it references. I’ll be using jQuery, so the script tags in the main.html page need to look like this:

   <script src="jquery.js"></script>
   <script src="aws.js"></script>
   <script src="main.js"></script>

Within aws.js I’m going to declare a single function that will be used as a constructor for an Amazon Web Services accessing object. That object will have methods to perform the actual calls. Any access to AWS requires credentials. The REST API (which is what I’ll use) requires an access key ID and a secret access key, so I’ll pass those as parameters to the constructor. The overall code will look like this:

var AWS = function(accessKeyId, secretAccessKey){
   var self = this;

   self.itemLookup = function(itemId, onSuccess, onError){
      // code to call AWS Product Advertising API ItemLookup function
   }
}

I could have just used this throughout, instead of defining self as a copy of it, but the JavaScript this variable is kind of tricky in what it references at various times. During the initial call to the constructor it definitely refers to the new object being created, so I’ll save it and use that saved value from then on. This library can be invoked with code like the following (with real access credentials and a sample ISBN in place of the examples):

var aws = new AWS('my access id', 'my secret key');
aws.itemLookup('1234567890',
               function(){alert('it worked');},
               function(){alert('it failed');});

Now, what does that missing code look like? AWS REST API calls use various HTTP methods, but most of them (including this one) just use GET with no special HTTP headers. So if we can build the right URL it will be easy to invoke it. The form of that URL is endpoint?parameters, where endpoint is a web address specific to the API family, and parameters is a normal query string of the form name1=value1&name2=value2&…&nameN=valueN where the names and values depend on the specific function.

The ItemLookup function I want to use is part of the AWS Product Advertising API. For that API, the endpoint is https://webservices.amazon.com/onca/xml (you can use the http version instead, but I always use the secure version if at all possible). Regardless of the function called, the parameters must always include:

  • Service – the value is always AWSECommerceService for this API
  • AWSAccessKeyId – the accessKeyId part of the credentials
  • AssociateTag – this is a new requirement since November 2011; I’m going to have to add this to either the code, the constructor call, or the method call
  • Operation – the name of the function, ItemLookup in this case
  • Timestamp – when the request was created; AWS will only honor it for 15 minutes to prevent future “replay” attacks
  • Signature – a cryptographic signature created from all the other parameters and the secret access key

The ItemLookup function requires additional parameters:

  • ItemId – identifies the item to find, or a comma-separated list of up to ten items
  • ResponseGroup – tells how much detail we want in the response; I’m going to have to experiment with the various possibilities to see which groups include the data I want

Instead of just creating the query string directly out of these parameters, I’ll use an array of name/value pairs in my code, create the signature from that, then build the query string to use. The code shapes up as follows:

      var params = [];
      params.push({name: "Service", value: "AWSECommerceService"});
      params.push({name: "AWSAccessKeyId", value: accessKeyId});
      params.push({name: "AssociateTag", value: associateTag});
      params.push({name: "Operation", value: "ItemLookup"});
      params.push({name: "Timestamp", value: formattedTimestamp()});
      params.push({name: "ItemId", value: itemId});
      params.push({name: "ResponseGroup", value: "ItemAttributes"});

      var signature = computeSignature(params, secretAccessKey);
      params.push({name: "Signature", value: signature});

      var queryString = createQueryString(params);
      var url = "https://webservices.amazon.com/onca/xml?"+queryString;

This code assumes that a variable named associateTag already exists. I’m going to add it as a parameter to the main constructor function to make that happen. This code also invokes several helper functions: formattedTimestamp, computeSignature, and createQueryString. I’m going to have to write them inside of this library. The code then needs to make an HTTP GET request to that URL and (if the call is successful) pull the desired data out of the response body, passing that to the onSuccess handler.

I’ll tackle the new functions first, from easiest to hardest. formattedTimestamp just needs to return the current time in a standard format: YYYY-MM-DDTHH:MM:SSZ (the T is a separator between date and time, and the Z indicates UTC time). Actually, I could cheat here if I wanted to. I’ve found that any date in the future is accepted by AWS, so I could hard code the result of this function as 9999-12-31T23:59:59Z. But that strikes me as a loophole in the service that may be closed in the future, so I’ll play fair here.

   function formattedTimestamp(){
      var now = new Date();

      var year = now.getUTCFullYear();

      var month = now.getUTCMonth()+1; // otherwise gives 0..11 instead of 1..12
      if (month < 10) { month = '0' + month; } // leading 0 if needed

      var day = now.getUTCDate();
      if (day < 10) { day = '0' + day; }

      var hour = now.getUTCHours();
      if (hour < 10) { hour = '0' + hour; }

      var minute = now.getUTCMinutes();
      if (minute < 10) { minute = '0' + minute; }

      var second = now.getUTCSeconds();
      if (second < 10) { second = '0' + second; }

      return year+'-'+month+'-'+day+'T'+hour+':'+minute+':'+second+'Z';
   }

createQueryString is a bit trickier, but not much. I just need to build a query string in the standard format. However, I have to remember to URI encode the names and values, in case they include any special characters. And I’m going to add the parameters in sorted order by name, because that will be useful later when computing a signature according to AWS’s rules.

   function createQueryString(params){
      var queryPart = [];
      var i;

      params.sort(byNameField);

      for(i=0; i<params.length; i++){
         queryPart.push(encodeURIComponent(params[i].name) +
                        '=' +
                        encodeURIComponent(params[i].value));
      }

      return queryPart.join("&");

      function byNameField(a, b){
         if (a.name < b.name) { return -1; }
         if (a.name > b.name) { return 1; }
         return 0;
      }
   }

This function actually changes the parameter it is passed: it sorts the array it is given. It would be better behaved to make a copy and sort the copy, but instead I’ll just note this fact and keep it simpler.
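For reference, here is a sketch of the better-behaved variant mentioned above: slice() copies the array before sorting, so the caller’s params array is left in its original order.

```javascript
// Build a sorted, URI-encoded query string without mutating the input.
function createQueryString(params) {
   var sorted = params.slice().sort(function(a, b) {
      if (a.name < b.name) { return -1; }
      if (a.name > b.name) { return 1; }
      return 0;
   });

   var queryPart = [];
   for (var i = 0; i < sorted.length; i++) {
      queryPart.push(encodeURIComponent(sorted[i].name) +
                     '=' +
                     encodeURIComponent(sorted[i].value));
   }
   return queryPart.join("&");
}
```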

Now it’s time for the hard one, computeSignature. Actually, with the steps already taken it’s not that hard any more. The AWS signature is an HMAC-SHA256 digest of a special string that includes the HTTP method, the host name of the end point, the path of the request, and the unsigned query string (as created above), keyed with the secret access key. Of course, doing that cryptographic operation by hand would be pretty hard, but I don’t have to. I can use the Stanford JavaScript Crypto Library. I downloaded the minified version of it and put it in my project folder in the file sjcl.js, and loaded it in the main page with a script tag before the aws.js reference there. With that in place, the computeSignature function is not too hard:

   function computeSignature(params, secretAccessKey){

      var stringToSign = 'GET\nwebservices.amazon.com\n/onca/xml\n' +
                         createQueryString(params);

      var key = sjcl.codec.utf8String.toBits(secretAccessKey);
      var hmac = new sjcl.misc.hmac(key, sjcl.hash.sha256);
      var signature = hmac.encrypt(stringToSign);
      signature = sjcl.codec.base64.fromBits(signature);

      return signature;
   }

The signing looks more complicated than it is because the hmac.encrypt function operates on bit strings, not normal JavaScript character strings, so there are extra steps to convert those back and forth.

With those preliminaries out of the way the code can create the URL to use to call the service. I’ll use jQuery to make the Ajax call:

      jQuery.ajax({
         type : "GET",
         url: url,
         data: null,
         success: extractAndReturnResult,
         error: returnErrorMessage
      });

This will call the URL and send the successful response to extractAndReturnResult or an unsuccessful one to returnErrorMessage. I’ve got to write those two functions, and then I should be done.

      function extractAndReturnResult(data, status, xhr){
         onSuccess(xhr.responseText);
      }

      function returnErrorMessage(xhr, status, error){
         onError('Ajax request failed with status message '+status);
      }

Both these functions need a lot of work! In particular, extractAndReturnResult doesn’t do what its name says at all. It just returns the raw response from Amazon. But that’s going to be useful for exploring the different options on the call, so I’m keeping it that way for now.

Putting all the above together (and adding the necessary associateTag parameter to the constructor), the aws.js file is:

var AWS = function(accessKeyId, secretAccessKey, associateTag){
   var self = this;

   self.itemLookup = function(itemId, onSuccess, onError){
      var params = [];
      params.push({name: "Service", value: "AWSECommerceService"});
      params.push({name: "AWSAccessKeyId", value: accessKeyId});
      params.push({name: "AssociateTag", value: associateTag});
      params.push({name: "Operation", value: "ItemLookup"});
      params.push({name: "Timestamp", value: formattedTimestamp()});
      params.push({name: "ItemId", value: itemId});
      params.push({name: "ResponseGroup", value: "ItemAttributes"});

      var signature = computeSignature(params, secretAccessKey);
      params.push({name: "Signature", value: signature});

      var queryString = createQueryString(params);
      var url = "https://webservices.amazon.com/onca/xml?"+queryString;

      jQuery.ajax({
         type : "GET",
         url: url,
         data: null,
         success: extractAndReturnResult,
         error: returnErrorMessage
      });

      function extractAndReturnResult(data, status, xhr){
         onSuccess(xhr.responseText);
      }

      function returnErrorMessage(xhr, status, error){
         onError('Ajax request failed with status message '+status);
      }
   }

   function formattedTimestamp(){
      var now = new Date();

      var year = now.getUTCFullYear();

      var month = now.getUTCMonth()+1; // otherwise gives 0..11 instead of 1..12
      if (month < 10) { month = '0' + month; } // leading 0 if needed

      var day = now.getUTCDate();
      if (day < 10) { day = '0' + day; }

      var hour = now.getUTCHours();
      if (hour < 10) { hour = '0' + hour; }

      var minute = now.getUTCMinutes();
      if (minute < 10) { minute = '0' + minute; }

      var second = now.getUTCSeconds();
      if (second < 10) { second = '0' + second; }

      return year+'-'+month+'-'+day+'T'+hour+':'+minute+':'+second+'Z';
   }

   function createQueryString(params){
      var queryPart = [];
      var i;

      params.sort(byNameField);

      for(i=0; i<params.length; i++){
         queryPart.push(encodeURIComponent(params[i].name) +
                        '=' +
                        encodeURIComponent(params[i].value));
      }

      return queryPart.join("&");

      function byNameField(a, b){
         if (a.name < b.name) { return -1; }
         if (a.name > b.name) { return 1; }
         return 0;
      }
   }

   function computeSignature(params, secretAccessKey){

      var stringToSign = 'GET\nwebservices.amazon.com\n/onca/xml\n' +
                         createQueryString(params);

      var key = sjcl.codec.utf8String.toBits(secretAccessKey);
      var hmac = new sjcl.misc.hmac(key, sjcl.hash.sha256);
      var signature = hmac.encrypt(stringToSign);
      signature = sjcl.codec.base64.fromBits(signature);

      return signature;
   }
}

The main.js file needs a little tweaking now to call this properly. The new version is:

$(document).ready(function(){
   var aws = new AWS('my access id', 'my secret key', 'my associate id');
   $("#lookup").click(lookupIsbn);

   function lookupIsbn(){
      var isbn = $("#isbn").attr("value");
      aws.itemLookup(isbn,
                     function(message){
                        message = message.replace(/&/g, "&amp;");
                        message = message.replace(/</g, "&lt;");
                        message = message.replace(/>/g, "&gt;");
                        $("#results").append(message);
                     },
                     function(message){
                        alert("Something went wrong: "+message);
                     }
                     );
   }
});

There are only two real changes here. First, the constructor is called before anything else, to get an object for working with AWS. Second, instead of just dumping the response message in the web page, the code first replaces all special HTML characters with their equivalent character entities. That way, the message will be shown as text instead of being interpreted as HTML, possibly including code.

And now I’m ready to go. I put an ISBN in the text box and pressed the button… and got this:

Error message: Something went wrong: Ajax request failed with status message error

That doesn’t tell me much, though. But the JavaScript console (opened with Control-Shift-J) is more helpful:

Origin is not allowed by Access-Control-Allow-Origin

Web browsers do not normally allow a page to make requests to origins other than its own; that security restriction is the same-origin policy, and it’s why this request was disallowed. There is a newer Cross-Origin Resource Sharing specification that allows such requests when the target web site decides it is safe, but AWS doesn’t support it. Not yet, anyway; I’m still hoping. However, Chrome apps can bypass this restriction if they ask. The manifest.json file needs to be changed to request this:

{
   "name": "Books to Buy",
   "description": "Keep a list of books to buy on Amazon, with their price and availability",
   "version": "1",
   "app": {
      "launch": {
         "local_path": "main.html"
      }
   },
   "icons": {
      "16":    "icon_16.png",
      "128":   "icon_128.png"
   },
   "permissions": [
      "https://webservices.amazon.com/*"
   ]
}

The permissions entry tells Chrome to allow this app to make requests to any URL matching the wild card given. I removed and reinstalled the app after making this change, and tried again. And got this:

Page showing XML response from AWS

Success! Sort of. A lot of XML came back from the request, and I need to pull the necessary data out of it. I also need to explore various response groups to get the data I need. And all that will be the subject of the next post in this series.

December 6, 2011

Chrome Web App Bookshelf – Part 2

Filed under: Uncategorized — Charles Engelke @ 11:54 pm

Note: this is part 2 of the Bookshelf Project I’m working on. Part 1 was a Chrome app “Hello, World” equivalent.

Now that we can build a web page and install it as an app in Chrome it’s time to make the page do something. Ideally, something to do with Amazon Web Services. This project is going to work by incrementally adding features, and I will start small. I want a page that has a field to enter an ISBN and a button to ask it to be looked up at Amazon. Information about the matching book (or an error message if there isn’t one) will be displayed below that in the page. While I’m at it, I’ll also change the page title and add a header explaining what the page is.

The new page is still quite simple:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8" />
   <title>Books to Buy</title>
</head>
<body>
   <h1>Books to Buy</h1>
   <div id="dataentry">
      <input type="text" id="isbn" />
      <button id="lookup">Look Up!</button>
   </div>
   <div id="results">
   </div>
</body>
</html>

If you have already created and installed the basic web app, you can edit the main.html file to match this. When you next run or refresh the app, you should see something like the following:

Books to Buy page, first try

Of course, if you enter an ISBN and press the button nothing happens. I have to write JavaScript to respond to the button press, call an AWS API to look up the information, parse the information, and place it into the empty results div.

I don’t like to put JavaScript in my web pages directly, so I’ll create a separate file for it and load it by putting a script tag right after the title. There are people who argue for placing script tags at the very end of a page for performance reasons but I don’t see it making much difference here, and I still like them near the top. I’ll put the JavaScript code in a file called main.js, and add a line right after the title tag:

   <script src="main.js"></script>

Since this page is HTML5 (thanks to the <!DOCTYPE html> declaration at the top), I don’t need a type attribute to say this is a JavaScript file; HTML5 treats JavaScript as the default script type. I don’t use a self-closing tag because that doesn’t work: script is not a void element in HTML, so the parser treats <script /> as just an opening tag and keeps consuming the rest of the page looking for a closing tag.

After saving the file and hitting refresh, nothing looks different. Because the new JavaScript file doesn’t yet exist. I brought up the Chrome Developer Tools by hitting Ctrl-Shift-J, and saw this error message in the console:

chrome-extension://jpnlfejeoenacfaonfmmdiofnheemppo/main.js Failed to load resource

By the way, from this I see that the browser refers to my app with a URL starting with chrome-extension:// followed by an apparently randomly assigned string. I don’t know how that will be useful, but it’s interesting.

I need to create a main.js file in the same folder as the main.html file, and put code in it to:

  • Attach an event handler to the button, so that when a user clicks it my code will run.
  • Have that code read the ISBN from the input text box.
  • Call the AWS service to look up the information for that ISBN.
  • If the call works, pull the necessary data out of the response and display it in the results div.
  • If the call fails, either put an error in the div or pop up an alert box.

I’m going to use jQuery to help with this work. It’s a library that adds a lot of useful features to JavaScript and smooths over subtle variations in how different browsers behave. That second benefit is less important with HTML5, which causes browsers to behave much more consistently than ever before, but I’m used to jQuery and want to use it. I have to download it (either the compressed or uncompressed version will work) and put a copy in the same directory as the main web page. I’ll call that downloaded file jquery.js and I’ll add a script tag for it just before the main.js script tag:

   <script src="jquery.js"></script>

Now, what goes in the main.js file? The first thing to do is to attach an event handler to the button’s click event. That’s easy with jQuery:

   $("#lookup").click(lookupIsbn);

The $ is actually a jQuery JavaScript function name (it’s an alias for a function named jQuery). If you give it a string with a CSS selector (which #lookup is, referring to the element with the id lookup) it will return a jQuery object referring to that element, which has added to it a lot of useful methods. One of the methods it adds is click, which takes a function as a parameter. In this case the code is passing a function called lookupIsbn, which means that function should be invoked whenever anybody clicks that button.

There are two problems with this line. The first is pretty obvious: it says to run a function called lookupIsbn but there is no such function. Not yet. I’ll write it soon. The second is more subtle. The browser will execute the JavaScript as soon as possible, which may be before the web page has been fully read and processed. So there may not be an element with id lookup when this code runs and nothing will happen. Or maybe the timing will work out okay and this will do what I want. That would actually be worse because then the code would randomly succeed or fail. I’d rather have consistent behavior, even if that’s consistent failure.

The browser builds a data structure for each page as it reads it, starting with the document element that contains everything else. When it finishes building the page it triggers an event handler on that document element. So I can set up that event to run this code, making it run once the page is ready. jQuery makes that easy by adding a ready method to the document element when we wrap it. So the code should be:

$(document).ready(attachClickHandlerToButton);

function attachClickHandlerToButton(){
   $("#lookup").click(lookupIsbn);
}

In fact, though, people rarely define a named function (like attachClickHandlerToButton) to deal with an action that will happen only once. Instead, they define an anonymous function in place, as follows:

$(document).ready(function(){
   $("#lookup").click(lookupIsbn);
});

I could use the same trick in place of lookupIsbn, but I get uncomfortable when I nest anonymous functions too deeply. The browser’s fine with it, but I’m not. So I have to write lookupIsbn now. That can be defined after the JavaScript above, but I’d rather define it inside the anonymous function, like so:

$(document).ready(function(){
   $("#lookup").click(lookupIsbn);

   function lookupIsbn(){
   // put the code here
   }
});

This prevents any JavaScript code outside of the anonymous function from seeing or using the lookupIsbn function. I don’t much care if they could use it, but if some other code (perhaps in an included third-party library) used the same function name things would get troublesome. This keeps my function definition private and avoids interference with other code.
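A small sketch of this scoping effect, using throwaway names of my own: two anonymous wrapper functions can each define their own lookupIsbn without either definition clashing with the other.

```javascript
var firstResult, secondResult;

// Each immediately-invoked anonymous function gets its own private
// lookupIsbn; neither definition is visible outside its wrapper.
(function(){
   function lookupIsbn(){ return "from the first script"; }
   firstResult = lookupIsbn();
})();

(function(){
   function lookupIsbn(){ return "from the second script"; }
   secondResult = lookupIsbn();
})();
```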

What goes in there seems pretty straightforward. I have to read the ISBN from the input box, ask AWS for information about that ISBN, parse the result into a readable form, and then display it. That would be something like:

      var isbn = $("#isbn").attr("value");
      var message = askAwsAboutIsbn(isbn);
      $("#results").append(message);

The first line finds the element with id isbn, which is the input element, gets the contents of the value attribute (which is what the user entered), and saves it in a new variable named isbn. The second line magically asks AWS for information, presumably getting a nicely formatted chunk of text back. The last line finds the element with id results and puts an element built from the message text inside of it.

There are some things wrong here. If the message coming back from AWS has HTML inside of it this code will insert it directly into the page, which might end up even running code. I’ve got to fix that. But a bigger problem is the magic askAwsAboutIsbn function call. My code calls the function, waits for a response, then uses the result. But that’s going to involve talking to a remote web site, which is relatively slow. My web page is going to be frozen while waiting for that answer.

The way to handle this freeze is to make the request asynchronous. That is, call askAwsAboutIsbn to get the answer, and give it a function to call when it’s done. Then immediately return instead of waiting for the answer. To do that, the magic askAwsAboutIsbn has to be given not only the ISBN to look up, but also a function to execute when it’s done. So the code should look something like this:

      var isbn = $("#isbn").attr("value");
      askAwsAboutIsbn(isbn, function(message){
         $("#results").append(message);
      });

This magic box will get the answer without the main program waiting for it. When it has it, it will call the anonymous function, passing the message it got to it, and that function will put the message in the right place on our page. So all that’s left is to write askAwsAboutIsbn. But that’s a pretty tall order, so I’m going to leave it for next time. For now, I’ll just write a stub that returns a canned response:

      function askAwsAboutIsbn(isbn, handleResult){
         handleResult("I don't know the info for "+isbn+" yet.");
      }

This stub just immediately calls the function it was given with a canned response as the parameter. Its purpose is just to see if everything is wired together right.

Putting everything together, there is now a folder called bookshelf that contains six files: main.html, main.js, jquery.js, icon_128.png, icon_16.png, and manifest.json. I haven’t changed the last three of those files and I just downloaded jquery.js. The other two files are the main.html file:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8" />
   <title>Books to Buy</title>
   <script src="jquery.js"></script>
   <script src="main.js"></script>
</head>
<body>
   <h1>Books to Buy</h1>
   <div id="dataentry">
      <input type="text" id="isbn" />
      <button id="lookup">Look Up!</button>
   </div>
   <div id="results">
   </div>
</body>
</html>

and the main.js file:

$(document).ready(function(){
   $("#lookup").click(lookupIsbn);

   function lookupIsbn(){
      var isbn = $("#isbn").attr("value");
      askAwsAboutIsbn(isbn, function(message){
         $("#results").append(message);
      });
   }

   function askAwsAboutIsbn(isbn, handleResult){
      handleResult("I don't know the info for "+isbn+" yet.");
   }
});

If all the files are right the application should work when I enter an ISBN and click the button. And it does:

First version of page showing the result

That’s enough for this post. Next time I’ll actually use the AWS web service to look the information up for the given ISBN, parse the result, and display it. There will be plenty to do after that, though: saving data persistently, improving the display, adding a settings page, and packaging the app. So there’s a lot more to come.

December 5, 2011

Chrome Web App Bookshelf – Part 1

Filed under: Uncategorized — Charles Engelke @ 10:44 am

Note: this is part of the Bookshelf Project I’m working on.

Before I can do anything useful in a Chrome Web Application, I’ve got to figure out the very basics. This post is going to cover my “Hello, World” equivalent start, maybe going a bit deeper than that.

At a minimum, my Chrome web app needs four things:

  1. A web page to display and run.
  2. An icon to show on the Chrome applications page.
  3. A favicon.
  4. A manifest describing where the above pieces are, plus anything else I end up needing.

I started by creating a folder to put all these pieces in. I named it bookshelf, but it could have been called anything.

My first web page is going to be just about the bare minimum for HTML5:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8" />
   <title>A sample page</title>
</head>
<body>
   <p>This is the sample page.</p>
</body>
</html>

I put that text into a file called main.html in my folder, then went searching for icons. I found a very nice one, in a variety of sizes, at designmoo.com. It’s called Book_icon_by_@akterpnd, and is licensed under a Creative Commons 3.0 Attribution license, so there should be no problems with my using it here. I need a 128 by 128 icon for the application page, and 16 by 16 for the favicon. The set didn’t have a 16 by 16 icon in it, so I resized the smallest one for that. I called the two icons I ended up with icon_128.png and icon_16.png, and put them in my folder, too. They look pretty good, don’t they?

128 by 128 icon for project
16 by 16 icon for project

Now I have to write the manifest file. It’s in JSON format, which is just text in a specific syntax. With a text editor I created manifest.json in my folder, with the following content:

{
   "name": "Books to Buy",
   "description": "Keep a list of books to buy on Amazon, with their price and availability",
   "version": "1",
   "app": {
      "launch": {
         "local_path": "main.html"
      }
   },
   "icons": {
      "16": "icon_16.png",
      "128": "icon_128.png"
   }
}

You can see that this file points to my three other files, and also gives the app a name, description, and version. I don’t know what the best practices are for the version numbering, so for now I’ll just keep it at 1.

I think I have a complete Chrome app now. You can create a folder with these files to see for yourself. Once you have these four files in a folder, open Chrome and click the wrench icon in the upper right to get the menu. Select Tools, then Extensions. Check the “Developer Mode” box on the resulting page if you haven’t already, then click the Load Unpacked Extension button. Select the folder with the manifest in it, and you should be good to go. It should look like this:

Chrome extensions page showing the new app

Pretty nice, I think. Go ahead and close this tab. To run the app (with the version of Chrome current as I write this), open a new tab and click Apps on the bar at the bottom of the resulting page. The new app should show up at the end of the page:

New tab page showing new app

Click on the application icon, and the page should open, and even have the right favicon:

The web app, opened

Okay, that’s not much, but this post is the “Hello, World” equivalent. Next time we will add a skeleton for a minimal application, one where you can enter an ISBN and have the page look it up and display the result in the page.

December 4, 2011

The Bookshelf Project – Using Amazon Web Services from JavaScript

Filed under: Uncategorized — Charles Engelke @ 8:08 pm
Tags: , , ,

Many years ago, I got frustrated with using Amazon’s “save for later” shopping cart function to keep track of books I probably wanted to buy someday. The problem I was trying to solve was that I’d find out about an upcoming book by one of my favorite authors months before publication and I didn’t want to forget about it. I could have just preordered the book, but back then there was no Amazon Prime so I always preferred to bundle my book orders to save on shipping. So I’d add the book to the shopping cart and tell it to save it for later. But (at least back then) Amazon was willing to save things in your cart for only so long, and my books would often disappear from the cart before they were published.

I’m a programmer, and Amazon had an API (application program interface), so I did the obvious thing: wrote a program to solve my problem. It was just for me, so I wrote the simplest thing that could possibly work, figuring I’d improve it some day. It was a simple Perl CGI script that I ran under Apache on my personal PC. It used the (then very primitive) Amazon Web Service to look up the book’s information given an ISBN, and saved its data in a tab delimited text file.

That was a long time ago, probably very soon after Amazon introduced its first web services. And I’m still using it today with almost no changes. But I’m no longer happy with it, for several reasons:

  • It only recognizes the old 10 digit ISBN format, not the newer 13 digit one.
  • It can’t find Kindle books at all.
  • It runs only on a PC running an Apache webserver.
  • The data is available on only that device.

The cloud has spoiled me. I want this program to run on any of my web-connected devices, and I want them all to share a common data store. Hence this project.

“Run on any of my web-connected devices” pretty much means running in a browser, so I’ll have to write it in HTML and JavaScript. I’ll use HTML5 and related modern technologies to store data in the browser, so I can see my saved book list even when off-line.

I know HTML and JavaScript but I’m no expert, so I’m going to build this incrementally, learning as I go. Step 1 will be to get a web page that just uses Amazon Web Services (AWS) to look up the relevant information given an ISBN. And right away, that’s going to require a detour. As a security measure, web browsers won’t let a web page connect (in a useful enough way) to any address but the one hosting the web page itself. My web page isn’t going to be at the same address as AWS, so it seems this is a hopeless task.

There is a way out, called Cross Origin Resource Sharing (CORS). The target web site can tell the web browser that it’s okay, it’s safe to let a “foreign” web page access it. Modern browsers support CORS, so I should be okay. Unfortunately, AWS doesn’t (yet) support CORS, so that’s out. Foiled again!
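
The mechanism is worth a one-line sketch. A CORS-enabled service includes a response header naming the origins allowed to read its responses (the origin below is hypothetical); a value of * would open it to any origin:

```http
Access-Control-Allow-Origin: https://mybookshelf.example.com
```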

But there is a stopgap. I can create a Chrome Web Application. That’s pretty much just a normal web page, except that it can tell the web browser to allow access to foreign services. And that’s just what I will do, starting in my next blog post. That will take a while, but after that’s done, I can explore various directions to take it:

  • Maybe AWS will support CORS soon, in which case I’ll be able to use almost the exact same solution on any modern web browser, even on tablets and phones.
  • I can always write server-side code to “tunnel” the web service requests through my server on the way to AWS. That works, but I think it’s inelegant.
  • I might try creating an HP TouchPad application, which uses the same kinds of technologies as the web, but to create native apps. I find that approach very appealing, even though the TouchPad is more-or-less an orphan device now. I’ve got one, and this would be an excuse to develop for it.
  • Tools like PhoneGap let you wrap a web application in a shell to allow it to run as a native app on various mobile platforms. I think they allow operations that normal browsers block, such as CORS. I could find out, anyway.

So I’ve got a lot of potential things to learn and try. First up: creating a Chrome web application, in many steps. If it comes out nice, I’ll even try publishing it in the Chrome Web Store.

November 28, 2011

HTTP Strict-Transport-Security

Filed under: Uncategorized — Charles Engelke @ 3:28 pm
Tags: ,

I figured it would be easy to add HTTP Strict-Transport-Security to a web application, so I gave it a try. It was easy to add it. But not that easy to get it to work.

The purpose of Strict-Transport-Security is for a web site to tell browsers to only connect to the site via a secure connection, even if the user just enters a normal http:// URL. The site I was working on already redirected all http requests to a secure connection, so what’s the point of this here? In this case, just to avoid a potential (but for this site, unlikely) man-in-the-middle attack when the user first connects. That non-secure initial request could be intercepted and the user redirected to some other site that looks right, but is an imposter. With Strict-Transport-Security, the browser will never even make the initial non-secure request, avoiding this possibility.

Adding this to a web site is trivial: just return a special header with the secure web pages. For example, I returned:

Strict-Transport-Security: max-age=7776000
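
On the Apache web server, one way to emit that header is a single mod_headers directive (a sketch; it assumes mod_headers is loaded):

```apache
# Requires mod_headers; place inside the SSL virtual host configuration
Header always set Strict-Transport-Security "max-age=7776000"
```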

Once the browser sees this header from any secure page on your website, it is supposed to remember (for the next 7,776,000 seconds, or 90 days) to never try to connect to the site other than by a secure connection. It’s also supposed to prevent the user from overriding any SSL certificate warnings, so if somebody does spoof your website with a bad certificate it won’t even give the user a chance to override the warnings and connect to the site.

Only it didn’t do that. It didn’t do anything at all that I could tell. I had a self-signed certificate, and the browser let me override the warning. I had a certificate with a name not matching the URL, and the browser let me override the warning. I tried to connect via http instead of https, and the browser went ahead and did it. (By the way, by “browser” I mean Chrome, but Firefox and Opera behaved the same way.)

It turns out that the browser will only obey this header if it is sent from a secure web page (as clearly stated in the documentation) that has no certificate warnings or errors (something I didn’t realize). Once I set up my test site to appear to be at the production URL, the Strict-Transport-Security header started working as expected.

I wasn’t expecting it to work this way, but it turns out to be really useful behavior. Now I can deploy test and development sites with wrong certificates (self-signed, or for the production instead of test URL) and not have Strict-Transport-Security lock me out. It only activates the lock in the first place if you’ve already shown you can open it, by first connecting to the secure site with no certificate problems. Now I just have to be careful to check that it does work when put into production.

November 17, 2011

Kindle Fire out of box experience #kindle

Filed under: Uncategorized — Charles Engelke @ 8:37 am
Tags: , ,

My new Kindle Fire was waiting for me when I got home last night. So far, I’m very impressed.

It was packaged in a custom cardboard shipping box, opened by peeling off a well-marked strip. Once opened, there were only three things in the box: the Kindle Fire itself, a micro-USB power supply, and a small card welcoming the user and telling how to turn it on. The Kindle Fire was in a plastic wrapper that was a bit hard to slide off, though I could have just torn it off if I’d been in a hurry.

I guess I hit the power button while I was removing the plastic wrapper, because once I got the Fire out it was already turned on. I had to drag a ribbon (from right to left for a change) to get the first welcome screen to show.

What a contrast to when I turned on an iPad for the first time! The iPad just showed me an icon ordering me to connect it to a PC (which also required downloading and installing iTunes). With the Kindle Fire, I was just taken through a short dialog. I was first prompted to connect to a Wi-Fi network. The unit showed me available ones, I picked mine, and entered the password when prompted. Then I went to a registration screen, which in my case didn’t require any effort at all because Amazon had already set it up. It then started downloading a software update and suggested I plug it in to get a full charge. I don’t know why a brand new unit should need a software update, but this was only a minor annoyance.

It was already about 90% charged, but I plugged it in anyway and waited the couple of minutes the download required, and then got back to the unit. And that was it, the Kindle Fire was ready to use, and registered with Amazon. All my books and music were available immediately (when I clicked the Cloud button instead of the Device button), as well as several apps.

I opened my current book and it took only a few seconds to download it to the device and open it to the current page I was reading. Amazon Prime video played back perfectly, as did my music I already had in Amazon’s cloud. I installed Netflix and entered my credentials, and it played back great as well.

Oh, I also installed the Barnes and Noble Nook application. That wasn’t in Amazon’s app store (go figure) but was easy to get using GetJar. It works great, too. Though I’m unlikely to actually buy any non-free books with it, because why would I? Thanks to the apparent collusion between Apple and the major book publishers, most book prices are fixed and cost the same regardless of seller. I like Amazon’s ecosystem, so there’s no reason to deal with anybody else.

How do I like the Kindle Fire as a tablet? It’s too early to tell much, but the smaller form factor is definitely better for me than the iPad or Galaxy Tab 10.1. It’s easy to hold it in one hand while using it, and the display is plenty large enough to use well. I think this form factor is going to become much more common than the larger ones.

August 23, 2011

SSL Client Authentication with Apache and Subversion

Filed under: Uncategorized — Charles Engelke @ 3:26 pm

Setting up Subversion with the Apache web server is pretty easy. Setting it up with SSL is still not too difficult. But I’ve been trying to do that plus require the Subversion client to authenticate to the server using an SSL client certificate. That’s not so easy. I’m not quite there yet, but I’m close and don’t want to forget what I’ve done so far, so I’m documenting it here. When I’m completely done I’ll update this post with the final steps.

First I needed a server.

I chose a Linux server on Amazon Web Services Elastic Compute Cloud. The one I chose is currently the second one in the list, “Basic 64-bit Amazon Linux AMI”, and I launched it as a micro instance since Subversion shouldn’t require much in the way of computing power. This kind of server costs only two cents an hour to run, or less than $15 per month.

Next, I logged on to the server, became root, and used yum to install the basic software needed:

   yum -y install httpd subversion mod_ssl mod_dav_svn

And started the web server:

   service httpd start

Now I had a working web server, both over HTTP and HTTPS (SSL). When I tried to connect to the SSL server my browser warned me that I shouldn’t trust it; that’s because it has a self-signed SSL certificate that mod_ssl installed by default. I may fix that later by buying a commercial certificate, but maybe not. I know the certificate is okay because I installed it myself.

Step 3 is to get Subversion working.

That’s a two-part operation: set up an area for the repositories (and set up the repositories in it), and then configure Apache to know about that area.

Part 1: set up the area and the first repository (still as root):

   mkdir /var/www/svn
   svnadmin create /var/www/svn/project1
   chown -R apache:apache /var/www/svn/project1

I’m not a Linux sysadmin so I don’t know whether /var/www/svn is a good place for this, but it works. The chown command is important because the Subversion operations will be performed by the Apache web server, so it needs permission to operate on the repository. By default, these repositories are readable by anybody on the server machine but can only be written to by the Apache web server.

Part 2: configure Apache:

For the Linux flavor I’m using, the web server configuration files are in /etc/httpd. The main configuration file is /etc/httpd/conf/httpd.conf and the SSL server configuration is /etc/httpd/conf.d/ssl.conf. There’s also a Subversion configuration file in /etc/httpd/conf.d/subversion.conf but it doesn’t seem to be up to date with changes in Apache web server version 2.2.

In any case, I thought I needed to add (or uncomment) a line in the LoadModule section of the main configuration file, but it was already included automatically as part of the subversion.conf file:

   LoadModule dav_svn_module modules/mod_dav_svn.so

That line loads the module that actually provides Subversion functionality to the Apache web server. There will be other lines needed for user authentication which will have to be added once I’m done getting client certificate authentication working right.

Next, I had to add a Location section to the SSL server configuration file. Let’s say I want my Subversion repositories to be accessed via https://my.host.name/svn/repositoryname. I added the following section the SSL configuration file:

   <Location /svn>
      DAV svn
      SVNParentPath /var/www/svn
   </Location>

When I restarted the web server (with service httpd restart) I had a working Subversion server over SSL. I tried it from my PC:

   svn checkout https://my.host.name/svn/project1 project1

As with the web browser, I was warned about the server’s self-signed certificate, but could choose to connect anyway. Since I hadn’t set up any authentication yet, the checkout didn’t ask who I was. I even added a file and checked it in for practice, then pointed my web browser to https://my.host.name/svn/project1 to see the file there. It worked fine. Now to mess that all up!

Step 4: Configure the server to demand a client certificate.

First I just want to get the web server to demand a good certificate from the client. Once that works I’ll look into using the identification in that certificate to control access to Subversion.

Inside the <Location> section for Subversion, I added three lines as follows:

   <Location /svn>
      DAV svn
      SVNParentPath /var/www/svn
      SSLVerifyClient require
      SSLVerifyDepth 10
      SSLCACertificateFile /etc/httpd/conf/myca.crt
   </Location>

The SSLVerifyClient line tells Apache to require the connecting client to authenticate the SSL connection using an acceptable certificate; the other two lines tell Apache how to make sure the certificate is acceptable. The SSLVerifyDepth directive says that it will follow a chain of certificates up to 10 deep if needed, and the SSLCACertificateFile is the location of the Certificate Authority root file you want the certificates to have been issued by. Following a chain 10 deep is probably overkill, but won’t hurt anything.

I’ve specified that my CA root certificate is in /etc/httpd/conf/myca.crt. That’s not a standard location for certificate authorities, but eventually I’ll want to use a private certificate authority just for this purpose, and not rely on public ones.

When I tried to restart the web server with service httpd restart, it failed because /etc/httpd/conf/myca.crt didn’t exist. So, just to get past this step for the moment, I copied my server’s certificate there:

   cp /etc/pki/tls/certs/localhost.crt /etc/httpd/conf/myca.crt

Now when I tried to start the server, it worked. But when I went to my working copy on my PC and tried to do an svn update, I couldn’t. I was prompted for a client certificate, which I didn’t have yet.

I tried viewing the repository with a web browser, and that didn’t work, either. But at least Chrome gave me a hint:

Text of Chrome error message

So it was time to get a client certificate. Eventually I want my own private certificate authority, but first I just want to get this working in a minimal way today. I went to VeriSign and got a free trial “Digital ID for Secure Email” at www.verisign.com/digital-id/index.html. (I used Internet Explorer to request and fetch the certificate.)  Then I exported it (including the private key) to the file testcert.pfx. I had to make up a password for it to export it.

I tried an svn update in my working copy again, and this time when I was prompted for a certificate I gave it that file. And got a new error message:

   C:\temp\project1> svn up
   Authentication realm: https://my.host.name:443
   Client certificate filename: testcert.pfx
   Passphrase for 'C:/Temp/project1/testcert.pfx': ********
   svn: OPTIONS of 'https://my.host.name/svn/project1': Could not
   read status line: SSL error: tlsv1 alert unknown ca (https://my.host.name)

Since the Subversion client is pretty complex in its own right, I decided to start debugging with a browser first. But it showed the same error message as before. I think I’m going to need to set up a correct CA root certificate.

I got the root certificate out of the Windows certificate store. When I examined the digital ID in that store via Control Panel / Internet Options / Content / Certificates / Personal, the certification path shows it is signed by “VeriSign Class 1 Individual Subscriber CA – G3”, which in turn is signed by “VeriSign Class 1 Public Primary Certification Authority – G3”. I looked for the first one in the Intermediate Certification Authorities store and exported it (in the base 64 format) to C:\Temp\intermediateca.cer, uploaded it to the server and put it in /etc/httpd/conf/myca.crt. After restarting the server, I tried again in the browser.

Success! It asked me for a certificate, and showed me the one I had installed from VeriSign. I clicked okay, and… Failure! Same error message as before.

So I repeated the steps, but with the second, full root certificate from the Trusted Root Certification Authorities Windows certificate store. And this time, it worked! In Chrome. Not in Internet Explorer at first; I had to close all of its windows and start over. But the Subversion client still didn’t work.

Step 5: Configure the Subversion client

The problem turns out to be that the Subversion client doesn’t trust the testcert.pfx certificate I provide. This doesn’t make any sense to me; why would I provide the certificate if the client shouldn’t trust it? It’s the server that needs to know it’s trustworthy, not the client.

Well, it doesn’t matter what I think: the Subversion client isn’t going to authenticate with the server unless it decides that the certificate was issued by a trusted certificate authority. So I had to provide a root certificate and tell the Subversion client where to find it.

I’ll cut to the chase. The configuration file that needs to be edited is C:\Users\myname\AppData\Roaming\Subversion\servers. There’s a commented-out line in it like the following:

   # ssl-authority-files = /path/to/CAcert.pem;/path/to/CAcert2.pem

I uncommented it and changed it to:

   ssl-authority-files = /Temp/intermediateca.cer

(Recall that I exported that file from the Windows certificate store in the previous step.) Then I tried using the Subversion client to do an update, and it finally worked. Unlike the server, it wanted the immediate parent certificate of my client certificate, not the ultimate root.

I still think requiring the root certificate on the client software is odd, but it turns out that Firefox works the same way. If I want to browse my Subversion repository I need to import both the client certificate and the root certificate to Firefox first.

What’s next?

The client is authenticating with the certificate, but it will accept any certificate from VeriSign right now, not just the ones I specifically want. And the repository doesn’t know who is authenticating; the Subversion “blame” listing leaves the user blank. I’ll look into dealing with both those issues in a future post.
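
One possible direction for the identity problem, sketched here but not something I’ve verified yet: mod_ssl’s SSLUsername directive can copy a field from the verified client certificate into the request’s user name, which the Apache-hosted Subversion should then record as the author:

```apache
<Location /svn>
   DAV svn
   SVNParentPath /var/www/svn
   SSLVerifyClient require
   SSLVerifyDepth 10
   SSLCACertificateFile /etc/httpd/conf/myca.crt
   # Use the certificate's Common Name as the authenticated user
   SSLUsername SSL_CLIENT_S_DN_CN
</Location>
```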

July 23, 2011

Pictures from our Arctic Circle Cruise

Filed under: Uncategorized — Charles Engelke @ 12:02 pm
View of Longyearbyen, Norway from above

Overlooking Longyearbyen

Now that I’m home and have decent Internet access, I’ve posted pictures from our Celebrity Constellation cruise to the Arctic Circle. We visited three Norwegian ports north of the Arctic Circle: Leknes (in the Lofoten archipelago), Honningsvåg (near the north cape and the Gjesværstappan bird sanctuary), and Longyearbyen (above, on Spitsbergen in the Svalbard archipelago).

We also visited two Norwegian ports south of the Arctic Circle: Bergen and Ålesund (where we took a bus tour inland along the Path of the Trolls). And we spent several days before the cruise in Amsterdam, from which we took a day trip to the Hague.

It was all beautiful, and now I have an idea of the difference in degree between different arctic areas. All our visits were to seaside areas, but Leknes (at 68° 08′ N) is coastal and lush…

Leknes

Leknes

… while the North Cape area around Honningsvåg (at 70° 58′ N) is much more sparsely vegetated…

House near Honningsvåg

House near Honningsvåg

… and Longyearbyen (all the way up at 78° 13′ N) is very severe, not only lacking trees but really without any plants more than a few inches tall…

Guide at Longyearbyen

Guide at Longyearbyen

… and with polar bears, which is why our guide was armed with a rifle. Unfortunately, we didn’t get to see a polar bear; fortunately, we didn’t get attacked by one, either.

July 5, 2011

Amsterdam Art in the Street

Filed under: Uncategorized — Charles Engelke @ 3:53 pm

Our hotel here in Amsterdam is on Apollolaan, which is part of Artzuid – International Sculpture Route Amsterdam. We were surprised when we got off the tram on the way in from the airport to encounter a man riding a giant golden armored turtle:

Sculpture

Searching for Utopia - Jan Fabre

As we continued the one-block walk, we passed by what seems to be an elephant on stilts:

Sculpture

Space Elephant - Salvador Dali

I really liked this kinetic sculpture. It only runs for five minutes each hour, but we just happened by as it was going:

Sculpture

Heureka - Jean Tinguely

I’ve got a one-minute movie of that sculpture, too. Maybe I’ll post it, if I ever figure out YouTube.

This afternoon we tried a different kind of Amsterdam art: microbrewed beer from a 300 year old windmill, Brouwerij ‘t IJ:

Windmill

Brouwerij 't IJ at Windmill

I can confirm that at least three of their beers, especially the Columbus, are also works of art.

July 1, 2011

Relative Performance in Amazon’s EC2

Filed under: Uncategorized — Charles Engelke @ 11:58 am
Tags: , ,

I’ve been using Amazon’s Elastic Compute Cloud for several years now, but a lot more lately. And one thing that has always confused me is the relative benefits of using Elastic Block Store (EBS) versus instance store.  I’ve seen some posts on this, but they all set up sophisticated RAID configurations. What about some simpler guidance for a regular developer like me?

Well, I don’t have the answers, but I have a little bit of new data. I’m updating a site that starts by loading a 1.25GB flat file into MySQL, then creating three indexes, then traversing that table to create a second, much smaller table.  Dealing with those 10 million rows is pretty slow, so I decided to see what difference it made using EBS or the instance store. While I was at it I tried different machine sizes. The results, shown in minutes to complete the task, are summarized in the table below:

Size         EBS  Instance
t1.micro     635     —
m1.large      56     66
m2.xlarge     47     49
m2.4xlarge    42     40
c1.xlarge     49     49

The t1.micro machine size is only available in EBS, and it got about 90% of the way through (finished creating all three indexes) then died.

This seems to show that (for this kind of operation) EBS performed noticeably but not enormously better than the instance store, but the difference shrank as available memory increased. Also, “larger” machines didn’t help much once there was enough memory available. Not surprising, since this is a single-threaded operation.

June 7, 2011

Web Resilience with Round Robin DNS

Filed under: Uncategorized — Charles Engelke @ 11:42 am
Tags: , , ,

I haven’t been able to find a lot of hard information on whether using round robin DNS will help make a web site more available in the face of server failures.  Suppose I have two web servers in geographically separate data centers for maximum robustness in the face of problems.  If one of those servers (or its entire data center) is down, I want users to automatically be directed to the other one.

At a low level, it’s clear that round robin DNS wouldn’t help with this.  In round robin DNS you advertise multiple IP addresses for a single name.  For example, you might say that “mysite.example.com” is at both addresses 192.168.2.3 (the server in one data center) and 192.168.3.2 (the server in the other data center).  But when a program tries to connect to mysite.example.com it first asks the network API to give it the IP address for that name; this returns just one of those IP addresses, and the program uses that to connect to it.  If that address happens to be for an unavailable server, the program’s request to the server will fail, even if there is a healthy server at one of the other addresses.

Of course, if you write the client program, you can make it work in this situation.  You’d have your program ask the network API for all of the IP addresses associated with a name, and then your program can try them one at a time until it gets something to connect.  But in the case of a web site, you haven’t written the client program.  Microsoft or Mozilla or Google or Apple or Opera or some other group wrote it.
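
If you did control the client, the retry logic is simple. Here is a Python sketch of such a client (hypothetical code, not what any browser actually does): it asks the resolver for every address behind the name and tries each in turn.

```python
import socket
import http.client

def fetch_with_failover(hostname, port=80, path="/", timeout=5):
    """Fetch path from hostname, trying every advertised address
    until one server accepts the connection."""
    # Ask the resolver for ALL the addresses behind the name,
    # not just the first one.
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    last_error = None
    for family, socktype, proto, canonname, sockaddr in infos:
        ip = sockaddr[0]
        try:
            # Connect to this specific address, keeping the original
            # name in the Host header so virtual hosting still works.
            conn = http.client.HTTPConnection(ip, port, timeout=timeout)
            conn.request("GET", path, headers={"Host": hostname})
            return conn.getresponse().read()
        except OSError as err:
            last_error = err  # dead server; fall through to the next address
    raise last_error
```

A browser that behaves the way I’m hoping for is doing essentially this loop internally.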

Wouldn’t it be great if those web browsers all worked that way?  Wouldn’t it be great if you could find clear indications that they worked that way?

As it happens, it appears that they all do work that way, even though I have found it very hard to get clear confirmation that they are supposed to.  I’ve found a few web pages that talk about browsers performing “client retry”, but not any kind of specification or promise.  I’ve found many more pages saying to forget about using round robin DNS for this, and to use a load balancer or some other kind of proxy to distribute web requests to available servers.  The problem with that is that you now have a new single point of failure (the load balancer) at a single location.  It can be made very reliable, but can still fail and leave your users unable to connect.  You can change your DNS entry to point to a new location, but that takes time to propagate (even longer than your DNS server says it should in the case of internet service providers who cache those addresses more aggressively than they should).  There are routing protocols to force traffic for a specific IP address to a different location, but they’re too complicated for me and require a lot of low level routing privileges that we can’t expect to have.  No, round robin DNS with clients smart enough to try each address if they need to would be a real help here.

Since I couldn’t get clear indications that this would work where I need it to, I set up a simple experiment to see how web browsers respond in this situation.  I created web servers in Amazon’s Virginia and California regions, each returning a single web page.  The one in Virginia returns a page saying “Eastern Server”, and the one in California returns a page saying “Western Server”.  I then set up a round robin DNS entry pointing to those two IP addresses.

I opened the web page for the round robin name in the Chrome web browser, and got the page saying “Eastern Server”.  I then shut down the web server that hosts that page, and refreshed the page.  It instantly changed to a page showing “Western Server”.  Which is exactly what I want!  So I checked other web browsers, and every one I could easily check worked the same way:

  • Chrome 11 on Windows 7
  • Firefox 4.0 on Windows 7
  • Internet Explorer 8 on Windows 7
  • Opera 11 on Windows 7
  • Safari 5 on Windows 7
  • Internet Explorer 7 on Windows XP (after noticeable delay)
  • Firefox 4.0 on Windows XP (after noticeable delay)
  • Android native browser on Android 2.3.3
  • iPhone native browser on iOS 4.3.3

Except on Windows XP, the refreshes after the server was shut down were apparently instantaneous.  Buoyed by this success, I tried lower-level clients.  They worked, too!

  • curl on Windows 7
  • curl on Linux
  • wget on Linux
  • Python with urllib on Windows 7

Wow.  Maybe the operating systems were doing this, not the clients?  No.  wget was talkative, and reported that the connection attempt failed on one IP address and that it retried on another.  And Chrome’s developer tools Network tab showed the same thing: a request failing, and then being repeated with success the second time.  Also, I was able to find an HTTP-aware client that did not work this way: Perl with LWP::Simple on Windows 7.

So my conclusion: round robin DNS is not certain to always cause a web browser to fail over successfully when one of the servers is down, but it is very likely to work.  If you want reliability over geographically separate server locations it seems like a good way to go.  When you discover a server is down you should fix it immediately or update your DNS to no longer point to it, but until that happens, most of your users will continue to be able to connect and use your site via one of the other servers.

[Update] When I got home from the office, I tested a few more web clients:

  • Logitech Revue Google TV
  • Chromebook CR-48
  • Samsung Galaxy Tab 10.1 running Honeycomb
  • Nintendo Wii
  • Amazon Kindle

It worked in every case but two.  Make that every case but one and a half.  The Wii browser reported an unavailable page when refreshed with the previously displayed server down.  A second click on the refresh button did cause it to switch to the live server, which I call at least a partial success.  But the Kindle failed completely.  Turning off the server it had connected to and then refreshing the browser got a message about an unreachable page, no matter how many times I clicked the reload button.

So if you’ve got a mission critical web application that you offer through round robin DNS, be sure to tell your Wii users to hit refresh a second time if there’s a page failure.  And warn them to not rely on the Kindle’s web browser (which, to be fair, is still marked as an “Experimental” feature).

May 9, 2011

Handling Large Data

Filed under: Uncategorized — Charles Engelke @ 8:48 pm

This is my final session at Google IO Bootcamp this year, and the one I know least about going in.  We’re working with larger datasets and trying to get meaning from them, so there’s a lot of potential for us here.

This is the only session I’ve been in that wasn’t packed.  There are plenty of folks here, but there are empty seats and nobody sitting on the floors.  I’m sure people don’t find this as sexy as Android, Chrome, or App Engine.

We’re starting with the setup, which is pretty complicated.  We have to sign in as a special Google user, then join a group, then download and unpack some files, then go to some web pages…  And I’ve done all that.  Now I guess I’m ready.

We start with Google Storage for Developers, which is geared toward developers, not end users.  You can store unlimited objects within a whole lot of buckets, via a RESTful API.

We do an exercise where we create a text file, upload it, and make it readable to the world.  Then we fetch our neighbor’s file.
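That exercise boils down to a single authenticated PUT with an ACL header.  Here’s a rough Python sketch that just builds the request; the bucket and object names are made up, and the Authorization header a real call needs is omitted, so this doesn’t actually send anything:

```python
import urllib.request

def build_upload_request(bucket, name, body):
    """Build (but don't send) a Google Storage upload request."""
    req = urllib.request.Request(
        "https://commondatastorage.googleapis.com/%s/%s" % (bucket, name),
        data=body,
        method="PUT",
    )
    # This header is what "make it readable to the world" amounts to
    req.add_header("x-goog-acl", "public-read")
    req.add_header("Content-Type", "text/plain")
    return req

req = build_upload_request("example-bucket", "hello.txt", b"Hello, neighbor!")
```

Once the object is public, fetching your neighbor’s file is just an ordinary unauthenticated GET of the same style of URL.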

Next on to Big Query.  Which, for me, is a disaster.  Getting the tools set up and working is a mess under Windows, even with instructions.  And the meaning of the data we’re querying isn’t clear, making the exercise difficult.  But I got a few things to work.

Finally, we’ll use the Prediction API.  As for the exercises, I’ll try each one once then give up if it doesn’t work.  Messing with the installation and configuration takes my attention away from the actual tools.  Well, I think I’ve set it all up; it says it’s running.  I learned a lot of mechanics here, but don’t really understand what’s going on.  It should take about 5 minutes to do the prediction run I’m trying, and then I’ll see if I can make sense of the result.

Well, the result was “error” after 10 minutes of crunching.  I guess I’ll try it again, perhaps from a Linux box, someday.

That concludes IO Bootcamp this year.  All in all, it was well worth attending, even though I already knew some of the material.

