Design Notes


Notes and thoughts on designing and building this little browser for a big world wide web. This is a working document -- design is fluid, and I reserve the right to change my mind, or at least think about it.
 
   


 
 

small sacrifice

It kinda goes without saying, of course, that a browser that fits on your phone has to be small. Shrunq is small in terms of file size and tries to be as efficient as possible with limited memory resources and slower network speeds.

Two things compete for memory while you're looking at a given page: 1) the page (text and images) that you're currently viewing, and 2) the storage of pages that you've previously viewed (cached history). A little background on both, and then a discussion about my "solution":

"chunk" blows

One thing I hate about browsers on handhelds and phones is that they often break up the page into numerous "chunks" so that you must constantly scroll and click to load the next part, next paragraph, or sometimes the next sentence. I must be some kind of neurotic prissy if clicking "more" 25 times in the course of reading a single 350-word article is annoying, right?

So to address my chunkaphobia, I've added a few features. First of all, when loading the page I try to load as much as possible, within the available memory constraints of the device. This means the page is as complete and long as it can be for your phone's available memory. I've also added an "auto-scroll" feature that will automatically start rolling the page content up or down with one key press. And I've added a key that will take you down, screen by screen, to pass over the titles, navigation links, and other stuff on a page that you often have to pass through to get to the stuff you really want to read or see. This can happen while the page is still loading, which makes the experience of a slower network and download speeds a bit easier to bear.
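The "load as much as memory allows" idea might look roughly like the sketch below. It is plain Java written for illustration, not Shrunq's actual code: the class name BoundedPageLoader, the 512-byte chunk size, and the "half of free heap" policy are all assumptions. Only Runtime.freeMemory() is a real call, and it reports the free heap as the JVM sees it at that moment.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch: read a page until the stream ends or a byte budget runs out. */
class BoundedPageLoader {

    /** Reads from the stream until it ends or budgetBytes have been read. */
    static byte[] load(InputStream in, int budgetBytes) throws IOException {
        ByteArrayOutputStream page = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];            // assumed chunk size
        int remaining = budgetBytes;
        while (remaining > 0) {
            // Never ask for more than the budget still allows.
            int n = in.read(chunk, 0, Math.min(chunk.length, remaining));
            if (n < 0) {
                break;                           // end of page
            }
            page.write(chunk, 0, n);
            remaining -= n;
        }
        return page.toByteArray();
    }

    /** Assumed policy: spend at most half of the currently free heap on one page. */
    static int defaultBudget() {
        long free = Runtime.getRuntime().freeMemory();
        return (int) Math.min(Integer.MAX_VALUE, free / 2);
    }
}
```

Reading in small chunks is also what would let scrolling start before the download finishes, since each chunk could be handed to the display as soon as it arrives.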

Another way I quiet my chunkaphobia is a feature I've awkwardly called "Skip To Content" -- a simple action that jumps you down the page to "meaningful" blocks of text. What do I mean by "meaningful"? Often at the start of a web page there are many short links and blurbs for navigation, so this feature analyzes content as it streams in, and identifies longer lines of text. Any click on "Skip To Content" jumps you down to the next spot with a longer line of text -- usually more "meaningful" content.
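Here is a minimal sketch of how a "longer line" heuristic like this could work. Everything in it is hypothetical: the class and method names, the length threshold passed to the constructor, and the choice to record only the first line of each run of long lines (so one click skips a whole paragraph rather than one line). It is plain Java; on a real MIDP phone you would use Vector instead of ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: remembers where runs of long text lines begin. */
class SkipToContent {
    private final int minLength;                 // assumed threshold for a "meaningful" line
    private final List<Integer> contentStarts = new ArrayList<>();
    private boolean previousWasLong = false;
    private int lineCount = 0;

    SkipToContent(int minLength) {
        this.minLength = minLength;
    }

    /** Call once per line of visible text as the page streams in. */
    void onLine(String text) {
        boolean isLong = text.trim().length() >= minLength;
        if (isLong && !previousWasLong) {
            contentStarts.add(lineCount);        // first long line after short ones
        }
        previousWasLong = isLong;
        lineCount++;
    }

    /** Returns the next content start after the given line, or -1 if none is known yet. */
    int nextContentLine(int currentLine) {
        for (int start : contentStarts) {
            if (start > currentLine) {
                return start;
            }
        }
        return -1;
    }
}
```

Because onLine only looks at the current line and one flag, it can run while the page is still streaming in, and a -1 result simply means "nothing further down has arrived yet".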

In trying to read as much of a page as possible in one download, there has to be some kind of sacrifice in the amount of space available for caching, or temporarily storing, pages you've already seen. My approach is akin to my own short term memory -- try to hold on to the last page you saw, but clear off anything older than that, if necessary, for the sake of the page you're currently loading. Consider it a zen-like focus on the here-and-now. This feature is configurable, so you can adjust for a small, medium, or large cache, or no cache at all. Shrunq does keep a "history", a list of all the URLs you've visited in this session, and you can always work your way back through this list and pages will show up immediately if the content is still cached.
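The cache policy described above can be sketched as a size-bounded cache that throws out the oldest pages first and never evicts the page that was just loaded. This is an illustration only: PageCache, visit, cached, and the two-bytes-per-character size estimate are assumptions, and the code is plain Java (a MIDP build has no LinkedHashMap and would need Vector and Hashtable instead).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: history of every URL, plus a cache that forgets old pages first. */
class PageCache {
    private final int capacityBytes;             // 0 means "no cache at all"
    private int usedBytes = 0;
    // LinkedHashMap iterates in insertion order, so the oldest page comes first.
    private final Map<String, String> pages = new LinkedHashMap<>();
    private final List<String> history = new ArrayList<>();

    PageCache(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Records a visit and caches the page, evicting older pages if space is short. */
    void visit(String url, String content) {
        history.add(url);                        // the history always remembers the URL
        if (capacityBytes <= 0) {
            return;
        }
        String previous = pages.remove(url);     // re-visits move the page to "newest"
        if (previous != null) {
            usedBytes -= sizeOf(previous);
        }
        pages.put(url, content);
        usedBytes += sizeOf(content);
        // Drop the oldest pages until we fit, but always keep the current page.
        Iterator<Map.Entry<String, String>> oldestFirst = pages.entrySet().iterator();
        while (usedBytes > capacityBytes && pages.size() > 1) {
            usedBytes -= sizeOf(oldestFirst.next().getValue());
            oldestFirst.remove();
        }
    }

    /** Returns the cached page, or null if it was never cached or has been evicted. */
    String cached(String url) {
        return pages.get(url);
    }

    List<String> history() {
        return history;
    }

    /** Rough estimate: Java chars take two bytes each. */
    private static int sizeOf(String content) {
        return content.length() * 2;
    }
}
```

A "small", "medium", or "large" setting would then just be three different values for capacityBytes. Walking back through history() and calling cached(url) either returns the page immediately or returns null, in which case the page has to be fetched again.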

the universe at hand


Another aspect of "small" is in terms of screen size, which raises all those concerns about how to squeeze a regular web page, often over 700 pixels wide, into a space between 85 and 130 pixels wide. Let's envision that math -- kind of like driving your Chevy Suburban through the supermarket aisle. Is it possible? Sure. Running Shrunq is not the same as your desktop running Firefox, nor can it be, but it's not as far-fetched as you might think, either.

All of which makes for numerous challenging questions, and finding the balance between what works within my "shrunq'd" screen and what a user needs or expects to see and do is the number one riddle to solve.

Perhaps the most important consideration, of course, is that HTML is not actually well suited for presenting content on a wide variety of devices. In fact any given site may run into issues in simply supporting numerous computers, operating systems or browsers. HTML is a mish-mash of content and code for layout, functionality and miscellaneous display. And while CSS and XHTML can address some of it, and WML (Wireless Markup Language) is meant to be specifically for handhelds and phones, the vast majority of web sites, both big and small, are built, supported, and maintained in HTML.

For many sites WML or mobile-specific content goes unattended or abandoned for long periods of time, or is simply unpleasantly unusable, or is so stripped down for the lowest common denominator that the very reason for your visit is removed. I have little faith that large, commercial media sites will convert content to mobile-focused output, even with a body like the W3C pushing for it. It's too much of a burden for the many personal pages, blogs, and grass-roots media sites to maintain multiple outputs (although the adoption of RSS by commercial media as a result of the use of RSS by Blogger or Movable Type surely shows a possible path).

Where does that leave us? We are stuck with a universe where HTML is king, and the tether to your desktop is a virtual stranglehold on the content we've grown accustomed to. While we pack more and more computing power in our pocket and our world zooms forward with more solutions for faster wireless networks, from EV-DO to WiFi, location based services, and beyond, too much of that world of content lies chained to your desktop.

Therein lies the logic for building as full a browser as possible and practical for the phone: because that's the content I want to read and see. The content that is as much a part of my life as the physical and real, and necessarily as immediate as a phone call, information closer than the fingertips and more like memory.

wringing it out


Which brings us to an architectural question, which turns out to actually be <sarcasm>some kind of hidden political agenda</sarcasm>.

First the question: Since so much of the code in a regular HTML page ends up not being so useful or relevant for displaying content on the phone (horizontal layout, JavaScript, etc.), why not strip all that stuff on the server and just send down what matters -- text, the occasional image, and very basic layout? This would allow for better network performance and an easier task for coding on the phone, since the page would be pre-processed, with all the surprises and problems solved before it even hits the phone.

It's not a new idea -- in fact, that's what all the other browsers, listed on the left, do for their content. I'm not against doing whatever possible to support better network performance. But I do hesitate at the thought of a proxy server for content, as that adds another critical dependency to the experience, which would be (gulp) my server. This raises my role in supporting the browser from developing the app to providing an ongoing service -- which is a whole different kettle of fish in terms of cost structure (read: more expensive). I don't want the hassle of constantly stoking the coal under the server to keep it up and healthy. As the gateway for the entire internet, the proxy server plays a critical role for any usage of the browser at all -- if the proxy is slow or goes down, down goes the browser. Yikes.

Another aspect of this design is privacy. Lots of stuff on the web depends on cookies, both permanent cookies and session cookies. And while every web developer in the world should know that cookies are not very secure nor very private, there's stuff in there about who you are, where you've been, and what you've done. Using a content proxy, your cookies would also have to be sent there, then forwarded on to the right domain.

Uhh, hold on there. That means, by design, that cookies would be sent to a server that didn't issue them (the content proxy). While the responsibility for implementing that properly falls squarely upon the developer's shoulders (who, me?), as a user I want complete confidence in how cookies are being managed, sent and received. Sure, there are "regular" proxy servers all over the place, and the real risk of losing anything meaningful from a cookie through a poor implementation is limited -- and I'm not risk averse: I eat meat, I stick pen caps in my ear, just for starters. But for some reason, I get a little nervous about what kind of crap those cookies are shooting all around the web about me, and if there's a chance that they go to the wrong place by design, or out of necessity, it's enough to make me whisper "Mad Cow!".
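To make the point concrete, here is a small sketch of a client-side cookie jar that only ever hands a cookie back to the exact host that issued it, which is the guarantee a content proxy would break. It is a deliberately simplified illustration, not Shrunq's code: HostCookieJar and its methods are hypothetical names, and it ignores the Domain, Path, and expiry attributes that real cookies carry.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: cookies are stored per issuing host and returned only to that host. */
class HostCookieJar {
    // host -> (cookie name -> cookie value), with names kept in insertion order
    private final Map<String, Map<String, String>> cookiesByHost = new HashMap<>();

    /** Remembers a cookie under the host that set it; a repeated name overwrites the old value. */
    void store(String issuingHost, String name, String value) {
        String host = issuingHost.toLowerCase(Locale.ROOT);
        Map<String, String> cookies = cookiesByHost.get(host);
        if (cookies == null) {
            cookies = new LinkedHashMap<>();
            cookiesByHost.put(host, cookies);
        }
        cookies.put(name, value);
    }

    /** Builds the Cookie header value for a request, or returns null if this host set no cookies. */
    String cookieHeaderFor(String requestHost) {
        Map<String, String> cookies = cookiesByHost.get(requestHost.toLowerCase(Locale.ROOT));
        if (cookies == null) {
            return null;
        }
        StringBuilder header = new StringBuilder();
        for (Map.Entry<String, String> cookie : cookies.entrySet()) {
            if (header.length() > 0) {
                header.append("; ");
            }
            header.append(cookie.getKey()).append('=').append(cookie.getValue());
        }
        return header.toString();
    }
}
```

With a jar like this on the phone, a request to any other host, a proxy included, gets no cookies at all. Routing every page through a content proxy would mean either shipping the whole jar to the proxy or having the proxy keep it for you.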

Let's add a few more points about privacy. The content proxy I described will first load the page you request, parse out what it deems unimportant, then send it on down to you. As an end-user I'm uncomfortable with this at both points: I don't want the content proxy having to know about, or possibly track, the pages I see, and I don't want someone or something cutting out stuff that it deems unnecessary. I understand this may seem a bit paranoid, but I feel ill at ease about the state of information security and privacy on the desktop -- why introduce, at the network level, the opportunity to make things less secure or less private? And, of course, the manner in which I have chosen to implement client-side parsing and document construction is a de-facto implementation of "cutting out stuff" from the source document. But I'd rather live with that than alter the architecture, at the expense of security and privacy, introducing higher costs, for what feels like a programming shortcut (i.e. server-side pre-parsing).

compromise or hypocrisy?


So, enough of the quasi-political agenda. I'm not some crazy libertarian who is building a browser while the smell of burning hair under my tin-foil hat makes me squint to see "the man" intercepting my browser history. It's actually simpler than that: I'm cheap -- I don't want to pay for a browser, and I want to build one that I don't have to charge for (as I would if I had to host a proxy). And, I enjoy a programming challenge: I like the task of building a full-strength, streaming HTML parser for the phone.

And, as if to prove it's not a political issue, I'll introduce some hypocrisy. The images that you see in Shrunq: through a proxy server. "What?" you might ask. "After all of that crap about privacy?"

Well, it turns out that Java on the phone only supports one kind of image -- the PNG (portable network graphics format). It's a fine format, but a newer format, and unfortunately most pages are filled with GIFs and JPEGs. Instead of only loading PNGs, which would pretty much eliminate just about all web graphics in Shrunq, I have to convert these images to PNG format so you can see them, and alas, that's just too math and processor intensive to do on the phone. A web browser without images is, well, pretty lame, as WML browsers can attest, and still not fast or geeky enough like Lynx to warrant consideration.

Now, if I scared you enough with my earlier discussion about privacy or keeping costs down, rest assured I'm not caching any images on my server -- it processes each one when you request it. It might take a little longer, sure, but there's no "trail" of the images you're viewing anywhere. In terms of cost or network reliability, if there is any added dependency it won't ever keep you from getting to the sites you want to see. If I reboot my server I won't lock you out of the internet, but you may not load the next graphic.
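The conversion step of an image proxy like this can be quite small. The sketch below assumes the proxy runs on a regular desktop or server Java runtime, where the standard javax.imageio package can decode GIF and JPEG and encode PNG. The source doesn't say what the real proxy is written in, so treat PngConverter and toPng as hypothetical names for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical sketch: convert one image to PNG in memory, keeping nothing afterwards. */
class PngConverter {

    /** Decodes GIF, JPEG, or any other format ImageIO can read and re-encodes it as PNG. */
    static byte[] toPng(byte[] sourceBytes) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(sourceBytes));
        if (image == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("unsupported or corrupt image data");
        }
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", png)) {
            throw new IOException("no PNG writer available");
        }
        return png.toByteArray();
    }
}
```

Everything happens in byte arrays that go away once the response is sent, which matches the "no cache, no trail" behavior described above. It also costs a decode and re-encode on every request, which is the "might take a little longer" part.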

Still nervous? Okay, well, you can turn off the images and not load any at all. That'll show 'em. You might want to do that if your network speed sucks, anyway. You can also set color preference from full color to black and white. And, if there's anyone willing, I'd be happy to share the image proxy source code and provide users the ability to switch as needed, or even host your own image proxy. Does that take care of it? I hope so. This tinfoil on my forehead is getting sweaty.

a game of tags


The laundry list of features I've included is a practical one: things that support human comprehension of HTML-based content, like images, blockquotes, simple formatting with bold or italic, and support for basic characters like ©, ™ and the like. I've also implemented navigational tags like links, and handling of server- and client-side redirects. I have yet to implement the blink tag, but maybe, dammit, I will. It's my browser.
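Characters like © and ™ usually arrive in HTML as entities such as &copy; and &trade;, so supporting them comes down to a lookup while the text is parsed. The sketch below shows one way that could be done. The class name, the short table of names, and the choice to turn a non-breaking space into a plain space are all assumptions, and it only covers a handful of the named entities HTML defines.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: decode a few named and numeric HTML character entities. */
class EntityDecoder {
    private static final Map<String, Character> NAMED = new HashMap<>();
    static {
        NAMED.put("copy", '\u00A9');   // copyright sign
        NAMED.put("trade", '\u2122');  // trademark sign
        NAMED.put("reg", '\u00AE');    // registered sign
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
        NAMED.put("nbsp", ' ');        // assumed: render as an ordinary space
    }

    /** Replaces every recognized entity; anything unrecognized is copied through unchanged. */
    static String decode(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int end = (c == '&') ? text.indexOf(';', i) : -1;
            Character decoded = (end > i + 1) ? lookup(text.substring(i + 1, end)) : null;
            if (decoded != null) {
                out.append(decoded.charValue());
                i = end + 1;           // skip past the whole entity
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Resolves "copy", "#169", or "#xA9" style bodies; returns null if unknown. */
    private static Character lookup(String body) {
        if (body.startsWith("#")) {
            try {
                boolean hex = body.startsWith("#x") || body.startsWith("#X");
                int code = hex ? Integer.parseInt(body.substring(2), 16)
                               : Integer.parseInt(body.substring(1));
                return (code > 0 && code <= 0xFFFF) ? Character.valueOf((char) code) : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return NAMED.get(body);
    }
}
```

Leaving unknown entities untouched is the forgiving choice for messy real-world HTML: a stray ampersand in "AT&T" passes straight through instead of eating the text after it.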

Functional matters like form support are harder to implement for a variety of reasons, not the least of which is no clear standard for display. Forms were a required feature for Shrunq, in my mind, so that you can log on to sites that require registration. There are many web forms Shrunq plainly barfs on, due to complexity or inconsistency in the HTML or my inability to deal with said inconsistencies. But it does fine on the important ones: Google and Flickr. I would not recommend trying to order stuff on Amazon with it, using web-based email (like HotMail or Yahoo!), or filling out sign-up forms for newspaper access (although bugmenot seems pretty solid), but Shrunq should be able to take care of your basic site login and search query needs.

Did I mention it works well on blogs? Probably partly because your average blog is a clean three column layout, and partly because the blog engines provide and enforce some pretty clean code. Shrunq has some basic RSS support, so you can even give that a try (although I haven't solved the question of whether to display description or content:encoded, or the inherent messiness of CDATA). Whatever -- it's there if you feel the need. I've never been that much of a fan of the RSS feed -- the rest of the site context, the images, not to mention the full entry are things I prefer to view on a given site, instead of within a feed reader.