Urbantastic Blog: Tech Tuesday: The Fiddly Bits

  • mattrepl · 10 months ago
    If you haven't yet, you may want to check out Tokyo Cabinet. Of the promising open source BigTable clones (Cassandra, Hypertable, Tokyo Cabinet), it appears to lead on the sort of performance you'd want to back a web app, and it is the most mature.
    _
    Tokyo Cabinet - http://tokyocabinet.sourceforge.net/index.html
  • Me · 10 months ago
    How is Tokyo Cabinet in any way a "BigTable" clone? It's a basic file-based key/value store, and with Tyrant you can access it remotely. It was the follow-on to GDBM. It could be used as the back end for a distributed system, but it is a lower-level system than BigTable.
  • mattrepl · 10 months ago
    "BigTable clone" is too fuzzy a term. I meant a distributed, scalable storage database for structured data [1].

    Tokyo Cabinet is certainly lower level than BigTable, but I don't know how else to refer to the Tokyo Cabinet stack that mixi.jp (essentially the Japanese version of Facebook) uses. It includes Tokyo Cabinet, Tokyo Tyrant, and Tokyo Dystopia.

    The exposed scaffolding and simple design make managing a TC-based database less scary than other more opaque designs.

    If you haven't checked it out yet, their presentation slides [2] might inspire you to see how TC could be used as a BigTable stand-in for web applications.

    _
    1 - To make things even clearer I'll add: the database is not limited to row-oriented storage, lacks a fixed schema, and supports column indexes.

    2 - http://tokyocabinet.sourceforge.net/tokyoproduc...
  • heathjohns · 10 months ago
    Thanks for the link... I've been meaning to check it out - it means a lot to hear that there's a high-volume site that uses it.
  • Paul · 10 months ago
    Don't you find that as you migrate business logic to the client side to reduce latency, you end up duplicating code because the server must validate the client's assertions and requests for security reasons? I keep running into this and thinking that we need better ways of running the same code on client and server.
  • heathjohns · 10 months ago
    Yes, it's definitely a factor. I typically put all of the checks on the back end first, and then duplicate some of them on the client. The ones I duplicate are usually just for user experience's sake, and so end up being slightly different anyway.

    There's an effort called ClojureScript that might help with the code duplication. As it stands, though, I actually like having two separate checkpoints in place. I've caught some of my own mistakes this way, and in the worst case the checks are far more likely to give a false negative than a false positive, which is a better failure state than an inconsistent db.
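A minimal sketch of the two-checkpoint idea discussed above, written in Python only for brevity (in practice the client half would be JavaScript). The form fields, the length limit, and all function names are hypothetical, not taken from the Urbantastic code. The back end holds the full set of checks; the client repeats only a subset for the user's benefit.

```python
# Hypothetical illustration: field names and limits are assumptions.

def server_checks(form):
    """Authoritative checks. These always run, whatever the client did."""
    errors = []
    name = form.get("name", "")
    if not name.strip():
        errors.append("name is required")
    if len(name) > 50:
        errors.append("name is too long")
    if form.get("user_id") != form.get("session_user_id"):
        errors.append("not authorised")
    return errors


def client_checks(form):
    """Subset duplicated on the client, purely for faster feedback."""
    errors = []
    if not form.get("name", "").strip():
        errors.append("Please enter a name")
    return errors


def submit(form):
    """Simulate a round trip: client checkpoint first, then the server."""
    early = client_checks(form)
    if early:
        return {"saved": False, "stage": "client", "errors": early}
    errors = server_checks(form)
    if errors:
        return {"saved": False, "stage": "server", "errors": errors}
    # A real handler would write to the database here.
    return {"saved": True, "stage": "server", "errors": []}
```

If the client subset misses something, the request is still stopped at the server checkpoint, so the worst outcome is a rejected request rather than bad data in the database.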
  • granz · 10 months ago
    I love this direction and have much in common. I've been using non-relational databases for a decade (Lotus Notes) and I really can't stand being stuck in a framework for web design (most of the time). I've also just started getting into Lisp for the same reason... Paul Graham's essays!
  • pvaillant · 10 months ago
    I have to say that I like hearing people describe architectures similar to the ones I've been considering. The move to greater client-side computation and the scalability that static HTML combined with JSON data provides are awesome. The one problem that I've never been able to overcome is the loss of search engine exposure and SEO. I'd love to hear your thoughts on SEO with this style of architecture.
  • heathjohns · 10 months ago
    It's still being written, but there's a front end which will cover all the minority users: IE6, mobile, screen readers, web spiders etc. It will be very simple, but fully functional.

    In our particular case we're not interested in having Google index the dynamic content of our site - we have much more in common with Gmail than, say, a CMS system.
  • Anoush · 10 months ago
    Heath,
    I love what you are doing. I don't know if Ben told ya, but I work in a data center. The job you are doing is great because running data centers and having powerful servers is expensive. Costs are going up in terms of colocation because of power, so pushing some of the work to the client side will fix a lot of the problems.

    Anoush
  • heathjohns · 10 months ago
    Hey Anoush!
  • Alexis Smirnov · 10 months ago
    Thanks for the write-up. The principles you describe in "How things fit together" are completely applicable to other choices of back-end technology.

    "The site originally ran on Google App Engine, which in theory should have been perfect - why we left them is a whole other blog post."

    Subscribing and looking forward to it.
  • essiene · 10 months ago
    Interestingly, I have been going down a similar path, only using Erlang/Mnesia/Mochiweb/JSON/Python.

    In my case, I differentiate b/w the ApplicationServer (Erlang/Mochiweb) that handles the actual logic of my applications. I call to the appserver via REST and get JSON replies.

    I then have a UI Server, which is built with a Python framework, currently Pylons/Simplweb. This actually just serves up what I call UI Apps, which are basically orthogonal applications very heavy in Javascript.

    So in a typical scenario, when you hit the home page for my application, you'll hit the UI server which will return the Login page to you. This page is actually fully Javascripted and when you hit submit button, the Javascript will make a REST call to the appserver to authenticate you. If the JSON sent back indicates success, we use document.location to redirect you to the next UI app in the sequence. Each UI app is almost independent of the other, and so forth.

    I've been distilling this architecture slowly over the past 4 months, and I have some unanswered questions in my head, but I also have some very cool gems, like the way I use contracts b/w the appserver and the UI apps. Each JSON response is actually a structure containing version info, type info, and some other parameters, so my Javascript processing them begins to closely resemble pattern matching in Erlang.

    All in all, like you say... it's interesting, it's new, but there are lots of kinks to work out along the way. I like the way you make all your HTML static, while I still have some bits supplied by templates. I think your approach is superior in this regard, as it has fewer moving parts.

    Someone should give this method a name and actually start studying it.
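The contract idea described above (every JSON response carries version and type info, and the consumer dispatches on them much like Erlang pattern matching) might look roughly like this. It is a sketch in Python rather than the Javascript the commenter uses, and the envelope keys (`version`, `type`, `payload`), the message types, and the handler names are all assumptions for illustration.

```python
import json

# Hypothetical handlers; the return strings stand in for UI actions.
def on_auth_result(payload):
    return "redirect:/dashboard" if payload.get("success") else "show:login-error"


def on_error(payload):
    return "show:" + payload.get("message", "unknown error")


# Dispatch table keyed on (version, type), playing the role of
# pattern-matching clauses.
HANDLERS = {
    (1, "auth_result"): on_auth_result,
    (1, "error"): on_error,
}


def dispatch(raw):
    """Decode a JSON envelope and route it by its version and type."""
    envelope = json.loads(raw)
    key = (envelope.get("version"), envelope.get("type"))
    handler = HANDLERS.get(key)
    if handler is None:
        # An unknown version or type fails loudly instead of being guessed at.
        raise ValueError(f"no handler for version/type {key!r}")
    return handler(envelope.get("payload", {}))
```

For example, `dispatch('{"version": 1, "type": "auth_result", "payload": {"success": true}}')` selects `on_auth_result`, while an envelope with an unrecognised version is rejected before any handler runs.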
  • Drekar · 10 months ago
    Your observation that the fundamental context has changed is dead-on. I am doing something much akin myself, and have luckily come to the same conclusions. That's a good sign! Keep up posting, I wish I took the time to make a blog...
  • Patrick S · 10 months ago
    I find your choice of Clojure for the web side interesting, especially w/o using compojure. Partly because I went away from Clojure for the presentation layer for that very reason. Instead I'm looking at doing my back end processing using it (for the increased speed) and my front end using Python w/Django. Will be very interesting to see how your method works by comparison.
  • Andres · 10 months ago
    And how exactly is Clojure used? Are you using any particular server or lib?
  • heathjohns · 10 months ago
    I'm running Nginx which serves up all the static content, and also forwards certain requests (based on the url) to a different port. On that port I'm running Clojure, which uses the Jetty library to present an HTTP interface.

    That Clojure server is where the "business" logic resides, and it, in turn, sits on top of the CouchDB database (again, connected via HTTP).
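The routing decision described above (static files served directly, certain URLs forwarded to an application server on another port) can be sketched as a small function. This is Python standing in for what Nginx does in its own configuration; the `/api/` prefix, port 8080, and the static root are assumptions, since the comment does not say which URLs are forwarded or where.

```python
from urllib.parse import urlsplit

STATIC_ROOT = "/var/www/site"           # assumed location of the static files
APP_UPSTREAM = "http://127.0.0.1:8080"  # assumed port of the application server
DYNAMIC_PREFIXES = ("/api/",)           # assumed rule for "certain requests"


def route(url):
    """Decide, from the URL path alone, who should answer a request.

    Returns ("proxy", upstream_url) for dynamic requests and
    ("static", file_path) for everything else. Query strings are
    ignored in this sketch, and no path sanitising is attempted.
    """
    path = urlsplit(url).path or "/"
    if path.startswith(DYNAMIC_PREFIXES):
        return ("proxy", APP_UPSTREAM + path)
    if path.endswith("/"):
        path += "index.html"
    return ("static", STATIC_ROOT + path)
```

Only the proxied requests ever reach the application code, which in turn would talk to the database over HTTP; everything else is answered from disk.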
  • Al · 10 months ago
    Your idea shows a consistent approach, both microscopic (choice of language) and macroscopic (client-server architecture).
    Does it carry some risk? You bet! But I think placing yourself ahead of the curve will pay handsomely soon.
  • Mark Watson · 10 months ago
    I browsed through Urbantastic, and I must say I'm quite impressed; the pages load in a snap. You have some very interesting ideas with LISP, CouchDB, etc.

    One thing I noticed, however, and I think it's kind of a downside to separating out the dynamic and static sections, is that when one turns off Javascript, all the content goes completely away. My impression is that it's cool if an email client requires Javascript to work, but having an informational site do this completely breaks the linked document metaphor of the web. I would be really interested to know how it's even possible for search engines to index this... But maybe I'm just being too closed-minded about the whole thing. :)

    *Deep breath* You really should blog about why Google App Engine didn't work out for you, I'd love to hear about it. The reason I ask is because I've been getting into using it, and it seems almost too perfect and awesome to be true...
  • heathjohns · 10 months ago
    Right now there's no Javascript-less option. I'm working on a front end to fix this, however. This will cover IE6, mobile, screen readers, web spiders, etc. It's going to be very simple but fully functional. Think the HTML-only option for Gmail.

    However, you are right about the document model. Urbantastic is very much an application, not a CMS system. Just as it doesn't make sense to index Gmail, it wouldn't make sense to index us.
  • nimai · 10 months ago
    I'm curious to hear how your stack is working out for you. It's definitely interesting technology. I guess in a year's time, when you've got a whole heap of code to maintain, it'd be really good if you wrote up an "in hindsight" blog post where you talk about the things that have worked and haven't.
  • heathjohns · 10 months ago
    Yes. I've talked about what I've implemented, but I won't specifically advocate it until it's seen some road-miles.

    The site's only a little over a month old at this point...
  • Giles Bowkett · 10 months ago
    Can you release parts of this system as open source? I'd like to take a deeper look at it, and experiment with it.
  • heathjohns · 10 months ago
    The nice thing about being a non-profit is that there's no argument against open sourcing. I intend to release the whole site under an OSS license when things settle out a bit.
  • Jason Dusek · 10 months ago
    I would be very interested to know more about why you left GAE.
  • Ryan Christensen · 10 months ago
    Would love to hear more about the Google App Engine change and more on your work with CouchDB and why. I have apps running there and have probably run into some of the same limitations that take some thought (i.e. searchable text, deep results paging etc). I have been working on a framework very similar to what you mention, with REST services as the database for most elements, that can be easily scaled on any platform (or combination). The one thing that pushed me this way was economics as well: the more requests are made from client machines to other endpoints, the more you can horizontally scale and the less inter-request connectivity you need with HTTP-based services/data.
  • Ben · 10 months ago
    Hi Heath - greetings from Mount Pleasant!

    Your stack sounds good. I use a similar JSON->JS approach where it makes sense. I'm using GAE a lot and have been very happy with it - 100k req/sec at my busiest time daily. I'm curious to know what problems you encountered.

    Ben
  • Bob Follek · 10 months ago
    I'm curious - did you try compojure and run into problems? Or did you just get a sense of "not ready for prime time"?
  • John · 10 months ago
    So, are you not interested in being indexed by Google? Or is the solution you are working on for mobile browsers etc. going to provide SE-friendly content as well? If so, any plans for avoiding getting flagged for cloaking?
  • digash · 10 months ago
    Since your highest risk is CouchDB, have you looked at the competitor MongoDB (http://www.mongodb.org/)?
    It is also quite new, but it already has some applications using it in production and is very simple to use.
  • Vincent Murphy · 9 months ago
  • chlai88 · 8 months ago
    You know what, I've come to exactly the same conclusions as you have as a soloist. Absolutely agree that frameworks give you a breezy 90% with the remaining 10% horrifying. And yes, it's also very important to like what we are using; even if it sounds like a lot of work, the liking part sort of makes up for it. I believe eventual simplification in web development will lie more & more in new DSLs tuned for the web rather than in following the framework path to death. Will take a look at Compojure/Clojure when I find time :)
  • Jack · 10 months ago
    Utter nonsense. You'll likely fail.
  • DZ · 10 months ago
    But even in failing, he'll likely have done more than you'll ever do.
  • Patrick S · 10 months ago
    If you don't have failures along the way you likely won't ever succeed. You have to take risks to pull off anything really great and/or interesting.
  • Box · 10 months ago
    Did you get hit by a bus? Awesome ad campaign by the way.