Contact: (801) 853-8339 or jesse@staynalive.com

Automatic Data Compression With DBIx::Class::CompressColumns

UPDATE: You can now get the DBIx::Class::CompressColumns module on CPAN here or via the CPAN command-line shell.

I’m going to get geeky on you for a minute, but you should find this interesting.  One of the challenges I’ve had with SocialToo recently has been the massive amount of Social Graph data we’ve had to store, process, and track. We cache a lot of the data so we don’t have to hit Twitter’s servers as often, and also so we can regularly track new follows and unfollows on behalf of our users.

If you are a SocialToo user you may have noticed that your data hasn’t been as accurate lately as it should be.  The reason is that we have a) 20,000+ users who all want to auto-follow or have their follower base tracked, and b) each of those users has anywhere from 100 to nearly 1 million followers that we have to store and process.  It’s not an easy task!  And our database, set up in a relational manner of followers to users, just wasn’t cutting it when it came to retrieving and processing so many followers at a time.

So I took a cue from Bret Taylor and FriendFeed, who talk about how they denormalized their database and now reference “bags” of data that they can then process in their code.  I went for a hybrid model: each user entry now has a single column on that table, in BLOB format, which contains all the social graph data for that user.  In Perl, I simply create a hash structure of the data, freeze it, and then store it in the database in our social graph column.  To retrieve it, we pull it from the database, thaw it, and we have an entire social graph we can play around with and do with as we please.
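As a rough illustration of the freeze and thaw step, here is a minimal sketch using Perl’s core Storable module. The hash keys and follower IDs are made up for the example, and this is not SocialToo’s actual code, just the general shape of the idea.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical social graph for one user; the keys are invented for illustration.
my %graph = (
    followers => [ 101, 102, 103 ],
    following => [ 102, 205 ],
);

# freeze() serializes the structure into a byte string that fits a BLOB column.
my $blob = freeze( \%graph );

# ... store $blob in the database, then read it back later ...

# thaw() rebuilds the original structure from the stored bytes.
my $restored = thaw($blob);

printf "followers: %d\n", scalar @{ $restored->{followers} };
```

The whole graph comes back as one ordinary Perl hash reference, so there are no joins to run per follower.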

The issue I was running into, however, is that plain text stored in a single column for a user with 1 million followers gets to be quite a large amount of data to pull through the pipes.  I needed an easy way to compress the data before inserting it into the database, store it in binary format, and decompress it again on the way out.  I also wanted it to be automatic, so no coder would ever have to worry about this extra step - it would just happen magically.

So today I’m releasing DBIx::Class::CompressColumns for all you Perl coders out there.  This module sits on top of Perl’s DBIx::Class database abstraction libraries and allows you to monitor a single column.  Any inserts or updates into that column get compressed in Zlib format, and any selects/get_column calls to that data (you must use get_column) get decompressed. That means you don’t have to worry at all about that extra step, the data has a significantly smaller footprint, and far less data moves in and out of the database, which puts much less load on it.  For one million followers, I measured just 4 megabytes that have to go in and out of MySQL.
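To give a feel for what happens on the way in and out of the column, here is a minimal sketch using the core Storable and Compress::Zlib modules. The helper names deflate_graph and inflate_graph are hypothetical and the follower list is dummy data. This is not the module’s own code, only an assumption about the general technique it automates.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Compress::Zlib qw(compress uncompress);

# Hypothetical helper: what a column wrapper might do before an insert or update.
sub deflate_graph {
    my ($graph) = @_;
    return compress( freeze($graph) );
}

# Hypothetical helper: what a column wrapper might do after reading the column.
sub inflate_graph {
    my ($blob) = @_;
    return thaw( uncompress($blob) );
}

# Dummy data standing in for a large follower list.
my $graph  = { followers => [ 1 .. 100_000 ] };
my $raw    = freeze($graph);
my $packed = deflate_graph($graph);

printf "frozen: %d bytes, compressed: %d bytes\n", length $raw, length $packed;

my $restored = inflate_graph($packed);
```

The appeal of wrapping this in a DBIx::Class component is that callers never see these two helpers; the compression simply happens whenever the column is written or read.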

Approaching Graph optimization in this manner has significantly sped up our processes, and I’m already seeing huge benefits from it.  There is much less load on the database, it’s much faster to retrieve and process the data, and we’re getting through our users’ followers much faster now.

I am currently applying for the module namespace on CPAN, and I’ll post a link there as soon as it is approved, but for now you can download the Makefile-compatible gzipped library here.  I hope some of you find this useful, and please feel free to modify it or send me any updates or bugs you think I missed!

The link for the download is http://socialtoo.com/DBIx-Class-CompressColumns-0.01000.tar.gz

Oh, and TMTOWTDI so please if you have better ways of approaching this I’d love to hear your ideas!

Photo courtesy rp72

Facebook Announces F8 In the Middle of OSCON, Coincidence?

Just yesterday, Facebook announced their second F8 conference, to occur July 23, 2008. This Developer-targeted event is said to possibly include some major announcements, including the new Profile redesign, more information about the fbOpen platform, and most significantly, possibly the launch of their E-Commerce platform. What hasn’t been announced or shared however is the odd timing of the event.

The event occurs right smack dab in the middle of O’Reilly’s Open Source Convention, which has been scheduled for about a year now to run from July 21 through July 25. This conference is known as an essential “Mecca” for Open Source developers around the globe, and has presentations from such players as Google, MySQL, Sun, Meebo, and even SixApart. Everyone who is a developer (unless you solely develop for Microsoft) or Sysadmin will be at this conference.

As a developer, it is tough to hear that Facebook will make me choose between OSCON and them. Frankly, like any smart developer I would choose OSCON by default, as I would get more out of it. So why isn’t Facebook just joining OSCON and doing an “F8” track there? Do they really want to tick off Open Source developers? You better bet that OpenSocial will have a presence there. If Facebook really wanted to target the Open Source crowd, as they have “claimed” to do with their fbOpen Platform and a few other contributions back to the community, they would try to have a presence at this conference and not interrupt it as they are currently doing. I was actually going to go to OSCON to promote my FBML Essentials book to potential Facebook developers for O’Reilly. Now I’m faced with a decision. I’ve contacted Facebook with no response, and I’m getting a little frustrated as a Social Media developer. Which conference will you choose?

New Series: Social Coding

I’ve been contemplating for a while now a good way to share what I know about Social Software Development and to help business owners, marketers, and developers learn how to set up their own social apps. Especially for developers, I know there are many out there looking for howtos and ways to learn more about starting their own App, promoting it, and getting it off the ground. As the author of FBML Essentials, I feel I am well suited for the task, so in the next few days I’m going to start doing howtos and overviews on how you can get your own Apps together. If you’re “the business type”, I may get a little technical on you, but I do recommend you keep watching and forward these on to your IT personnel - your CIO, CTO, and the like should read these so they can learn what’s possible to integrate into your existing environments. I’ll also try to throw in a little goodie here and there for “the business type”.

So, I’ve created a new category to the right, “Social Coding” - if you want to track just that, click on the category name and add it to your RSS. I’ve also started a new FriendFeed Room where those involved or that want to get involved in Social Coding can discuss, learn, and talk with each other. You can subscribe to that here.

Let’s start by going over the types of sites I could cover. Here are just a few - let me know if you have a particular interest in learning about how to code for any one in particular:

  • Facebook
  • OpenSocial
  • Google Friend Connect
  • Twitter
  • FriendFeed
  • Pligg
  • Digg
  • LinkedIn
  • MySpace
  • WordPress
  • Movable Type
  • Google App Engine
  • Bungee Connect

Stay tuned! I’ll keep posting news and other rants as we go forward - I’ll just be adding in some good howtos at the same time. Oh, and if you’re a developer and would like to do a howto in your preferred language for us, contact me - I’d love to let you do a guest post.

Utah Startup Series: Bungee Labs

(Sorry it’s been a while since my last blog - it took me several days to figure out how to get my Flip video imported and exported to and from iMovie. To make a long story short, if you want to export from iMovie and have both picture and sound, you must import your source as something other than MP4 or AVI.)

This is the first article in my “Utah Startup Series“. Starting today I will be circling Utah to find the best and most innovative startups in Utah, and featuring them here on Stay N’ Alive. If you have a hot startup (early to even late stage) and would like to demo for me what your product can do, please contact me - if I have the time and like your idea I’d love to come out and take a look at it!

While at Web 2.0 Expo I had the opportunity to meet with Bungee Labs, a local, well-funded Utah company that had “Platform as a Service” down before Google even started thinking about their App Engine. In our meeting they demoed their Bungee Connect “IDE” (written entirely on the web). You can see the video below.

My thoughts - you have to see this stuff in person to understand the full ramifications of what they’re doing. One of the cool things about their service vs. Google’s is they actually integrate with Amazon’s EC2 service (which was announced during Web 2.0 Expo), so you can actually host your other stuff on Amazon’s EC2 platform with the same licensing as your Bungee Connect account. Their licensing structure is very appealing as well - Bungee only charges based on the number of registered user sessions using their platform, not traffic, not bandwidth. If I understand correctly, it’s all based on the number of users actively using your application on their platform. For Facebook and Social Media developers this is appealing, as most Applications are rated based on Application use, not number of users or traffic. With Bungee you only pay for the users that actively use your system.

Overall, the guys at Bungee were Rockstars at Web 2.0 Expo. With their announcements about EC2 integration, flexible licensing terms, and features in TechCrunch, eWeek magazine, and a dozen other publications, you can bet Google has a watchful eye on them. Interestingly, I saw Kevin Marks, head guy over Google’s OpenSocial (and other) efforts, at their party on Thursday evening.

Bungee will be presenting at our Social Media Developers meeting this coming Tuesday, showing us a simple “Hello World” example of how to build a Facebook App using their platform. Follow me on Twitter, and if we can stream it live you can watch it via my Ustream channel. After the demo I may just write my own Facebook App to try out their system - it should be interesting.


Bungee Connect Demo - Web 2.0 Expo from Jesse Stay on Vimeo.

Why I Hate the Twitter Syntax

I have disliked the Twitter syntax since I’ve been on it (you can find me via @JesseStay on Twitter - go ahead and follow me!). As a long-time IRC user, everything seems backwards! I have often referred to Twitter as “IRC 2.0”. I’m not sure I can fully embrace that concept though.

For those unfamiliar with IRC, it predates even instant messaging. It brought out the original concept of a “chatroom”, and exists even today on various servers throughout the world. Ustream.tv currently uses it for its users’ channel chatrooms. It is the home for almost any “live” activity of any open source project (log into irc.freenode.net to see - I’m often in #utah there, as well as recently #codeaway). Traditions have been established, and virtual friendships have been bonded. In many ways it could have been the original concept of a “social network”, the first concept of linking friends together in a single place on the internet.

I was at a Perl conference just last year, and was happy to see the #YAPC chatroom in irc.perl.org open during the banquet. We had a ton of fun with that! Now, just this year, when I go to conferences, I see speakers leaving up Twitter, and answering questions via Twitter. The two seem to be serving similar purposes, in different ways.

That’s why I was astonished when I got on Twitter for the first time, and started seeing public messages directed to individuals with “@” signs in front of them! Is there a source for that that I’m not aware of? I know of no documentation from Twitter themselves that established that tradition. In IRC you simply type “username:”, and then your message, and it gets highlighted in that user’s chat window in most IRC clients. Better yet, I can start typing the username and it tab-completes. You can’t do that in Twitter. That tradition and method has been around for years, yet Twitter seems to break the mold for some reason.

IRC also supports commands - I can type “/nick newnickname”, and it switches my username, automatically! It’s a basic standard that all clients support, open, and available for all to use. With Twitter I have to go to their website to do almost anything, and what you can do there is extremely limited. To direct message someone on Twitter, I have to type, “dm username message”. In IRC it’s just a simple command, like all other commands, and I can always type, “/help” if I don’t know what commands are available. I simply type, “/msg username message”, and it messages the user, and again, it tab-completes the username!

Why couldn’t Twitter just use the IRC standard in their platform, and then expand upon it to improve the IRC standard and bring it to a mobile world? Quite possibly many of their scalability issues would have been taken care of had they done so. Not just that, but they would now be able to support groups, and less development would be needed to manage their platform. Twitter says they have an open API - I question that openness. It’s not based on much of an open standard, and IMO, it’s causing them problems now because of it.

Looking to start a project? Always look at the open solutions that are out there first, then build upon them - you’ll have far fewer headaches if you do.

(Photo courtesy GapingVoid.com)

Using Perl/Catalyst and Want to Use Sometrics? Try This.

I’ve been analyzing various Social Applications Analytics tools lately, and have recently stumbled upon Sometrics. Sometrics handles full Analytics for your Facebook, Bebo, and MySpace applications, and will actually utilize the Facebook API to retrieve demographic info about those visiting your Application. As I examine the other Analytics solutions for Facebook and other Social Network Applications, I’ll try to post my findings of their strengths and weaknesses here, OpenSocialNow, and FacebookAdvice.com. If you’re not a techie, you may want to skip the next part, or forward it on to your IT department.

One thing I noticed about Sometrics is it seems to only provide code to paste on your Application pages for PHP, Ruby, and ASP.net. The code they provide is relatively simple, but in case you’re wondering how to do it in Perl, here is how I did it in Template Toolkit under Catalyst on Perl:

Enter this on all Application pages (I do it in my “footer” file):


[% IF Catalyst.request.param("installed") %]
<fb:iframe width='1' height='0' frameborder='0' src="http://halo.sometrics.com/fb_tracer.html?src=fb&installed=1&session=%7B%22session_key%22%3A%22[% Catalyst.request.param("fb_sig_session_key") %]%22%2C%22uid%22%3A[% Catalyst.request.param("fb_sig_user") %]%2C%22expires%22%3A0%2C%22secret%22%3A%22%22%7D&t=[% date.now %]"></fb:iframe>
[% ELSE %]
<fb:iframe width='1' height='0' frameborder='0' src="http://halo.sometrics.com/fb_tracer.html?src=fb&session=%7B%22session_key%22%3A%22[% Catalyst.request.param("fb_sig_session_key") %]%22%2C%22uid%22%3A[% Catalyst.request.param("fb_sig_user") %]%2C%22expires%22%3A0%2C%22secret%22%3A%22%22%7D&t=[% date.now %]"></fb:iframe>
[% END %]
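In case the long session value in those URLs looks opaque, it is just a small percent-encoded JSON object with the session key and user ID dropped in. Here is a minimal sketch in plain Perl that builds the same string. The helper name uri_escape_basic and the sample values are made up for illustration, and a real app would likely use a CPAN module such as URI::Escape instead of a hand-rolled regex.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything outside the unreserved set.
sub uri_escape_basic {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Sample values standing in for fb_sig_session_key and fb_sig_user.
my $session_key = 'abc123';
my $uid         = 42;

# The JSON object that the template spells out in already-encoded form.
my $json = sprintf '{"session_key":"%s","uid":%d,"expires":0,"secret":""}',
    $session_key, $uid;

my $session = uri_escape_basic($json);
print "$session\n";
```

Reading the template with this in mind, %7B and %7D are the braces, %22 is a double quote, %3A is a colon, and %2C is a comma.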

Then add this in the “post-remove url” subroutine for your Application (or create one and add the URL in your App’s config):

=head2 remove

  Page that handles App removal

=cut

sub remove : Local {
  my ( $self, $c ) = @_;

  # Facebook sends fb_sig_uninstall when a user removes the App
  if ( $c->req->param("fb_sig_uninstall") ) {
    # Report the uninstall by redirecting to the Sometrics tracking image
    $c->res->redirect(
        qq{http://halo.sometrics.com/met.gif?a=u&app=}
      . $c->req->param("fb_sig_api_key")
      . qq{&uid=}
      . $c->req->param("fb_sig_user")
      . qq{&age=&sex=&city=&state=&country=&friend=&src=fb}
    );
    $c->detach();
  }

  return;
}

Who Said Perl is Dead?

I’ve been following the issue list for Google App Engine (just realized it doesn’t have an “s” in the official name), and the two top issues, the requests for Perl support and for Ruby support, are in a dead heat. Ruby, as of this writing, is at 361 votes, and Perl is right on its tail at 347 votes. Perl until a few hours ago was pretty far ahead of Ruby. PHP is only at 70 votes, and Java is at 247 votes.

Does this mean Perl is making a comeback? Did we ever really leave Perl? As an avid Perl developer this makes me happy, as Perl can do anything Ruby or even Rails can do, and even more (Perl XS and tie-ins to C are very powerful!). All of my current Facebook Apps and OpenSocial Apps I do in Perl on an MVC Framework called Catalyst - it’s very scalable! It never made sense to me when people said that “Perl was Dead”. Is this just a reflection of the type of Audience Google supports, or is it reflective of what new media developers are actually developing in?

I’m hesitant in posting this, as it could bring more Ruby voters to the mix, but hey, let’s keep it fair. If you want to vote for Perl, click on the star here. If you want to vote for Ruby, click on the star here. Not a developer of either? Then you’re on your own. :-P

I wonder how Python would fare if it got equal treatment.

UPDATE: Within just a day after this post things have gone back to how I would expect them to be. Java has a strong lead over all the others, followed by PHP, then Ruby, then Perl. Perhaps the issues just needed a little exposure. Based on the interest, Perl is still far from dead though.

Google Announces “Google Apps Engine”

Okay, so I was wrong - it was worth a try. I do still expect more large announcements related to Social Media from Google. Just recently, Google announced their “Google Apps Engine” (will it be nicknamed, “GAE”?). It is essentially a competitor with Amazon’s EC2, S3, and SimpleDB, but at a much higher level. You’ll be required to interface with the service via the Python Programming language at first, but it is intended to make scalability and server setup much easier. Google does say that the underlying infrastructure is entirely language neutral, so we should expect more languages in the future. The advantage over Amazon is Google takes care of all the server setup for you - this is essential for a small business that can’t afford to hire an expensive Linux Admin as Amazon requires.

The Service is only available to the first 10,000 developers that apply at http://code.google.com/appengine/, and will be available starting at 9pm PST tonight. You can read more at Venturebeat and TechCrunch here and here.

Well Done Guy! Chris DeVore is a Cheapskate

I just caught this article from Mashable and I just had to pipe in. In the article, Mashable’s Kristen Nicole claims Guy Kawasaki paid too much for the development of AllTop, at $10,000. They compare it to Askablogr.com, claiming Chris DeVore only paid $7500 for the development of Askablogr, with its rich feature-set.

I was blown away by this! Not that Guy Kawasaki paid $10,000, but that Chris DeVore only paid $7500 for Askablogr. Now, I don’t know Chris, so take this with a grain of salt, but some call it a deal. I say he’s a cheapskate! For something that will be your primary revenue source and your main line of business, $10,000 for something like Alltop.com is a steal! The fact that Chris DeVore only paid $7500 for his development means he’s either hiring offshore, doing the development himself (in which case those costs are way understated), or he’s very much underpaying a bunch of gullible developers that probably don’t believe much in the product they’re working on.

As a business owner supporting a technology-based business, it is of utmost importance that you make your developers and IT staff your first priority. They are your bottom line, and should be the superstars of your business. You have to keep in mind that for top-notch developers and technology, you’re competing with the likes of Google, Facebook, Yahoo, and others to get the best talent. By not paying your developers well, you will either a) lose your developers very quickly, b) have a revolution at some point in your future, with your developers all backing out on you in rapid succession, or c) not get the best work and skills you could be getting, and you’ll definitely run into scalability issues as your site grows in the future.

I recently finished the book, “My Startup Life“, by Ben Casnocha. I bet Guy’s read it and Chris hasn’t. In it, Casnocha talks about the lessons he learned by not paying his lead developer well. He quickly faced threats from the staff to leave, and they quickly ran into scalability issues due to the inexperienced offshore developers they were hiring overseas. In building a technology-based business it is of utmost importance that you pay and treat your IT staff well or it will come back to bite you in the future.

So, Kristen, I say Guy is the smart one in this case. I am willing to bet his site scales better, his developers are happier, and more likely to work with him in the future. Guy’s likely to get millions for Alltop.com in the future, should it succeed, so $10,000 is a very small price to pay to get good developers on staff.

UPDATE: See Chris’s comment here: http://staynalive.com/articles/2008/03/21/well-done-guy-chris-devore-is-a-cheapskate/#comment-2126. I probably inappropriately labeled Chris a cheapskate while trying to defend Guy. It turns out (and I should point out, unless I read it wrong, that the Mashable article did not make this very clear either) that Chris’s project was a project built simply to point out how cheap something could be developed. In that case it would make him an intentional cheapskate, not that there’s anything wrong with that. As I mentioned, I’m a cheapskate too - I just don’t see the reason to short projects in development costs when it is the core to the business. It is an interesting experiment regardless. Thanks for visiting Chris!

Twitter Opens Their Messaging Platform

Today, in the first post on the new Twitter Technology Blog, Alex Payne announced that Twitter is releasing their underlying messaging platform, which they call, “Starling”, to the community. From the announcement it appears Starling is the basis for handling all communication underneath Twitter, speaks memcached, and reminds me in some ways of Perl POE, for Ruby. This is the development baby of Twitter, a great move by the new head of Engineering for Twitter, and a great benefit to the development community! Twitter is starting to remind me very much of Google in its philosophies, starting with a core technology, focusing on that, then figuring out monetization after the fact, all while giving back to the community. Way to go Twitter!