The Null Terminator
Ethan Ram’s geeky blog on the seam of technology and product management.
Monthly Archives: August 2011
Web analytics and BI reporting services review: Google Analytics and SiSense Prism
Posted on 2011 Aug 30
A Review of Services I’ve used in GameGround – Part III – Marketing tools
This is the 3rd part in a series of blog posts reviewing several 3rd-party products and services I’ve used in GameGround and my take on them. The basic approach I’m taking here is the applicability of the product for a lean startup that wants to move fast. In the last post I wrote about community engagement tools for the marketing team: sending emails and engaging customers in a conversation. This post is about analytics and BI reporting. Next up – OPS tools and, of course, development infrastructure.
Google Analytics
This extremely popular free SAAS by Google has become the de-facto standard in website traffic analysis. Ten years ago I used to download my HTTP server logs and run a simple analysis tool that gave me most of the basic analysis features I needed, but this SAAS has some excellent analytics features like measuring page view time, campaign origin tracking, goal tracking, integration with AdSense etc. There are a few BUTs here, which would make me think twice before choosing this option again:
- You want analytics about more than page views. You want to know about the downloads people have made, about clicks on the cancel button on the registration page etc. None of that is well supported by the Google Analytics APIs. The Events API does not work with the goals feature, and the workaround of calling the APIs on page events (e.g. clicks) to report fake URLs skews everything, so the numbers just don’t add up.
- The tool is mostly good for product managers who want to see how the crowds are using the product/website. A few other use cases are not well served: your sales manager would like to get information on sales progress from analytics on individual potential customers – integrating the analytics goals with SalesForce and alerting sales staff to contact potential customers. Engineering would like to know of faulty pages (500s), broken links (404s) etc. Security officers would like to know of traffic spikes and login errors to track potential invaders and hackers. You’ll need other tools for those tasks. p.s. check out the friends at Totango for an excellent analytics tool specific for SAAS sales managers.
- Much of the analytics data is delayed. Most of your statistical views are updated only daily. So you upload a change close to the end of your working day. You come in the next morning – still no significant results. You have to wait another day before you get some results. And that is IF you’ve managed to get the fake URLs thing right. In many cases you’ll need to fix it a bit and repeat the test. This is way too slow for a startup…
- GA is a basic website traffic analysis tool. Traffic alone is good for SEO tasks, understanding traffic sources, goal achievement etc. For Business Intelligence you’ll need something much stronger with access to the business data stored in your database (see below).
In short – for modern websites and apps GA is almost useless – it will only give you the big picture. Forget about the details… or check out a better service that was designed for them.
Two insights on the development management side of things: First, plan the analytics of every feature as part of the design of the feature itself. A feature whose user interaction you cannot analyze and understand is usually worthless. Second, plan to spend more time than you initially thought on supporting Google Analytics efforts (probably true of any analytics).
SiSense is a startup developing a very interesting reporting product based on a unique columnar data storage technology (as opposed to the “regular” OLAP cubes or other in-memory solutions) that enables analysis of large-scale data sets. The product has an easy-to-use interface for creating beautiful web-based reports for business intelligence, website analysis, or anywhere managers need a dashboard with stats. It can connect to multiple data sources, including most common DBs and even cloud services like Google AdWords, Google Analytics and Amazon S3 logs. This means that the cost of creating and operating excellent reports is much lower than with some other popular products by IBM, Omniture, Microsoft, Oracle and so many others.
I liked using their product a lot. In GameGround the product was mostly operated by one of our QA guys (in addition to his QA roles), who had some basic knowledge of databases and SQL and was occasionally assisted by our DBA.
A few notes for everyone thinking of building a BI suite using SiSense and the like:
- Lean start-ups – abort here! Establishing a BI product is lengthy and expensive and has a steep learning curve. In most cases it would involve bringing in an expensive contractor just to help you boot-start the thing. If your data includes a few thousand records you’re much better off with Excel. Excel can connect to most databases, and you can create filters and graphs and send them by email to marketing/sales daily. It may sound “ugly”, but the time/cost it would take is a fraction of what it takes to build a proper BI suite. BI reporting suites are not meant for lean start-ups! Start thinking about a BI suite when you have some real customers and Excel’s data-crunching abilities are no longer enough (beyond a couple of hundred data rows Excel starts slowing down to a crawl!)
- In many (read: most) cases the information you want to investigate does not exist in the database. To create a report that shows you how many people clicked on ‘like’ and how many people uploaded a picture every day you’ll need to add code to collect the data. In many cases this involves code on the website and in the back-end services, as well as changes to the database. No magic here – BI needs come with development costs even if the BI person is part of the marketing team.
- The most important thing with BI is knowing the right questions to ask. In most cases the basic question of “if I give you the data you requested, what would you do with it?” is never asked. Ask questions that get you actionable data. The harsh reality is that in too many cases reports were requested and then never utilized. Still, producing those reports took a lot of effort.
- As a manager, if you ask for very detailed reports you’ll find that you drown in details and cannot get the whole picture. The whole point of having BI dashboards is that you can get the interesting point in 5 seconds. So, start by asking for basic stats that can be visualized over time. Then ask for details on the specific things you actually see.
- Building a good BI dashboard is, thus, an evolving process, not a project. The project would be to get to the point where you have the first 3 reports in place. Then you’ll want to continue developing more reports and improving the existing ones.
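To make the "you’ll need to add code to collect the data" note concrete, here is a minimal sketch in Python using an in-memory SQLite database. The `events` table, its columns and the function names are all hypothetical, made up for illustration; this is not how GameGround stored its data. It shows the two halves of the job: recording an event when the user acts, and aggregating per day for a report.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT, day TEXT)")
    return conn


def record(conn: sqlite3.Connection, user_id: int, action: str, day: str) -> None:
    """Called by the website/back-end whenever a user does something we care about."""
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (user_id, action, day))


def daily_counts(conn: sqlite3.Connection, action: str) -> list:
    """Return (day, count) pairs for one action, oldest day first."""
    cursor = conn.execute(
        "SELECT day, COUNT(*) FROM events WHERE action = ? GROUP BY day ORDER BY day",
        (action,),
    )
    return cursor.fetchall()
```

A report such as "likes per day" is then just `daily_counts(conn, "like")`, which a dashboard tool (or Excel) can chart over time.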
A few notes specific to SiSense Prism:
- They are still a startup themselves. Investing in a BI suite that may not be there in a couple of years is risky. Still, I give the team at SiSense a strong plus. The work I’ve seen them do with Wix is pretty impressive.
- They prefer that you take a monthly paid subscription instead of paying a one-time fee. This is cool: it lets you pay as you go and pay for what you consume. It also reduces the cost of boot-starting a BI solution, which mostly involves buying a strong server, paying a contractor to help you get the thing to work etc.
- Their pricing plans are a bit problematic. Their most basic feature – viewing the reports in a browser – is only available in the priciest plan.
- Their customer support had some serious issues at times. We got no response about a bug we had with their report viewer software (we did not use the web-based version), and resorted to installing their 500MB server software on each of the managers’ laptops…
Community Engagement services review: SendGrid, GetSatisfaction
Posted on 2011 Aug 3
A Review of Services I’ve used in GameGround – Part II – Marketing tools
This is the 2nd part in a series of blog posts reviewing several 3rd-party products and services I’ve used in GameGround and my take on them. The basic approach I’m taking here is the applicability of the product for a lean startup that wants to move fast. In the last post I wrote about A/B/Split testing tools for the marketing team. This post is about community mgmt. Next up – web analytics and BI reporting, OPS tools and, of course, development infrastructure.
One of the first features every service has is sending email to customers. There are 2 basic types of emails to send: transactional and mass-mailing. Transactional emails are those produced as a result of a user action, like registration, friend invites etc. Mass-mailings are those where you invite your registered users to an event, a sale etc. So why not use your own corporate SMTP server for those emails? Because you are likely to find yourself on one of the many blacklists of spam servers at some point. If spam filters on several servers worldwide find your emails to be spam, or if 2-3% of your users mark your email as spam, you’ll be blacklisted and will not be able to send emails from your company at all… bad idea. Other things you’ll have to manage yourself if you don’t use a SAAS for this are unsubscribe lists (<1% of users on social networks unsubscribe on average) and email bounce lists (~12% of the email addresses users give on social networks are mistyped or bogus). Managing those lists is mandatory if you don’t want to get blacklisted.
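For a feel of what "managing those lists" means if you do it yourself, here is a minimal Python sketch. The function name and the in-memory lists are assumptions for illustration; a real system would keep the unsubscribe and bounce lists in a database and update them from feedback reports.

```python
def deliverable(recipients, unsubscribed, bounced):
    """Return the recipients we may still mail, in their original order.

    Addresses are compared case-insensitively; anyone who unsubscribed or
    whose address bounced is dropped, and duplicates are mailed only once.
    """
    suppressed = {a.strip().lower() for a in unsubscribed}
    suppressed |= {a.strip().lower() for a in bounced}

    seen = set()
    result = []
    for address in recipients:
        key = address.strip().lower()
        if key in suppressed or key in seen:
            continue  # never mail suppressed addresses, never mail twice
        seen.add(key)
        result.append(address.strip())
    return result
```

Every mass-mailing would pass its recipient list through a filter like this before anything is handed to the SMTP server.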
We started out using MailChimp, probably the largest of several competing services, but quickly found that they would not send our mass-mails, as they were afraid their servers would get blacklisted. We then had the same issue with Constant Contact and Campaign Monitor. It seems that most EMS vendors send all email from a set of about a dozen shared IP addresses. Thus, they have to minimize complaints across their entire portfolio. Most EMS vendors require that you give your users either opt-in (an “I’m willing to get marketing materials” checkbox on registration) or double opt-in (+email verification). And if the complaint rate resulting from your service is above a very low threshold they kick you out. On our first campaign to just 1200 registered users we had a complaint rate of 1.1% and their acceptable limit was 0.2%… For a young company with little history that is running its first campaigns the demanded ratios were not acceptable. And we wanted to have an opt-out on sign-up, not an opt-in. We got stuck for a few days till we managed to resolve the mess.
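To see how tight that limit is, here is the arithmetic as a small Python sketch. The function names are mine; the numbers are the ones quoted above (1200 recipients, 1.1% observed, 0.2% allowed).

```python
import math


def complaint_rate(complaints: int, recipients: int) -> float:
    """Share of recipients who marked the mail as spam."""
    return complaints / recipients


def max_complaints(recipients: int, limit: float) -> int:
    """Largest whole number of complaints that stays within the limit."""
    # The tiny epsilon guards against floating-point results like 2.9999999.
    return math.floor(recipients * limit + 1e-9)


recipients = 1200
observed = round(recipients * 0.011)          # about 13 complaints at 1.1%
allowed = max_complaints(recipients, 0.002)   # only 2 complaints fit under 0.2%
```

In other words, on a list this small roughly a dozen annoyed users is enough to get you kicked out, while the vendor’s limit leaves room for only a couple.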
Then entered SendGrid! SendGrid is a cloud-based SAAS with a technology that seems to be far more resilient to blacklisting. Their white-label feature allows you to bind your domain’s MX records to one of their servers with an IP address in the cloud. This means you do not share IPs with others and do not need to comply with such low complaint rates. If you get blacklisted you can change the IP address and/or domain name and get back in business in a matter of minutes. So we set up 2 accounts – one for transactional emails, which are less likely to cause blacklisting, bound to the company’s domain name. Then we bought another domain, ‘mailer1-mycompany.com’, and bound it to the second account. The SendGrid system appends an ‘unsubscribe’ link to your emails if you don’t do it yourself, and they manage the lists for you – they won’t send an email to someone who unsubscribed, even if your service did try to send it. You get a dashboard where you can see stats of your sent mails, bounces, spam reports etc. and fix your email templates as needed.
The integration with SendGrid’s basic SMTP service took us 15 minutes. They also give you APIs to sync user lists, send using predefined templates etc., but we haven’t gotten around to using those. Pricing is low for what you get. It’s highly recommended to work with them and utilize their APIs, which save you the need to write email templates in code and change them every other day according to product needs. Let the product guys edit the email templates on the SendGrid control panel. No code changes are involved unless a radical change is made and different parameters are needed to fill in the template. It makes this feature so much simpler to operate too. Our email system is working fine with a delivery rate of ~95% on the transactional emails, which is excellent.
Now, how about some tips on how to avoid getting your emails marked as spam? This is a bit out of scope here – maybe I’ll do another post on the quests I’ve been on to work around the spam-filter minefields. Meanwhile, you may want to read here.
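The reason a basic SMTP integration is so quick is that, from the code’s point of view, you only swap the SMTP host and credentials. A minimal Python sketch, under stated assumptions: the host name and port below are what I believe SendGrid’s SMTP relay uses (verify against their documentation), and the addresses and credentials are placeholders. This is not the code we ran at GameGround.

```python
import smtplib
from email.message import EmailMessage

# Assumed relay settings; check the provider's docs before relying on them.
SMTP_HOST = "smtp.sendgrid.net"
SMTP_PORT = 587


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text email; no network access happens here."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, username: str, password: str,
         host: str = SMTP_HOST, port: int = SMTP_PORT) -> None:
    """Relay the message through the provider's SMTP server over TLS."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Usage would look like `send(build_message("noreply@example.com", "user@example.com", "Welcome", "Hi there"), "my-user", "my-password")`, with real account credentials in place of the placeholders.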
GetSatisfaction
- The product is a hybrid between an online feedback survey and a forums product. It has the disadvantages of both: you mostly get to hear only the people complaining about your product, and irrelevant questions and remarks bloat the service with historical and irrelevant data.
- Their widget loads slowly and causes long delays in page loading time. On some pages the feedback widget was the first thing to show up on the page (WTF???). We ended up writing a script that delay-loads their script to bypass their mentality of “We’re THE product and our users are the websites using it”…
- The suggestion engine, meant to prevent users from entering the same feedback over and over again, is weak.
- You already have a Facebook page, a blog and the product pages with comments. Why do you need another place where people would talk about your product?! I would give this service a pass next time around. Instead just open a feedback page, place a Facebook comments widget in it and you’re done.
Marketing tools review: Google Web optimizer, Visual Web optimizer and Unbounce
Posted on 2011 Aug 1
A Review of Services I’ve used in GameGround – Part I – A/B/Split testing and landing pages services
GameGround.com is a service I built during 2010 that was alive till mid-2011. I managed the startup’s dev teams, developing a consumer-facing social meta-game. This is a short review of several 3rd-party products and services I’ve used and my take on them. The basic approach I’m taking here is the applicability of the product to a lean startup that wants to move fast. I started writing it and quickly found out that it’s actually too long for one post. So I’m going to make it a series of posts covering marketing tools, community mgmt. tools, OPS tools and, of course, development infrastructure.
Google Website Optimizer
Visual Web Optimizer
Unbounce is a landing pages SAAS. “… a self-serve hosted service that provides marketers doing paid search, banner ads, email or social media marketing, the easiest way to create, publish & test promotion specific landing pages without the need for IT or developers.” Yes! Landing pages for specific audiences and campaigns are an excellent way to drive traffic to your site. And Unbounce’s platform, with its WYSIWYG HTML editor, simplifies the process even further, allowing the marketing team to create those pages and manage them as part of their campaigns without needing development involvement. They even give you multiple pages per landing page (i.e. a small website), a lead generation module, A/B/Split testing tools and other goodies. So far so good.
BUT! There’s a major but here: the SEO marks for those pages on Unbounce are extremely low. Search engines don’t like websites and landing pages that have only static content. They also don’t like it that the landing page is not under your own domain, but rather on Unbounce’s, and so they incorrectly see the landing page as a spam blog. This (among other things, I’m sure) led to very few displays of our ads on Google AdWords and very few clicks coming from this major traffic source.
We ended up using a desktop HTML editor to create a single-page site for each landing page. It was then uploaded to our live production servers, under the ‘/play’ folder, using an FTP account we opened for it. This way the marketing team could create their landing pages according to the running campaigns and upload them to production with little or no dev/OPS involvement. This is lean thinking at its best – have as few people involved in each task as possible. People should mostly be able to complete their tasks end-to-end without needing to interface with others.
Node.JS – I Love this Technology!
Posted on 2011 Aug 1
Beware! A game-changing technology has entered the arena. The internet as we (developers) know it is about to change soon.
I rarely get to see a new technology that sparks my mind and keeps me up late at night, trying to utilize it and do something with it. It happened to me a couple of months ago when I first played the PC version of Angry Birds (2 sleepless nights…!) and lately again with Node.JS. But this one is no game! To explain the thing I have to take you back in time to the year 1999…
It all started when I wrote my first server for Exent’s Games on Demand platform. It was a large-volume file data server designed to respond to requesting clients very fast and serve thousands of concurrent clients. We wrote the server as a kernel module, and accordingly it was written in a fully asynchronous fashion. This project, led by my first team leader, Amnon Romm, was certainly the most beautiful piece of code I have seen to date. [We developers can see beauty and ugliness in simple code. It’s a special gift we have that a non-developer will never understand… :) Code is actually mostly old and ugly. If you write something and a colleague comes to you and tells you your code is beautiful – this would be the BEST compliment you can ever get. Really]
Since then I’ve seen so many other servers. Some of them, like Check Point’s Firewall, definitely have a good asynchronous architecture (even if the code is somewhat ugly…) But one thing I could never figure out is how come the whole WWW (the browser part of the internet) is running on top of badly designed synchronous servers. Maybe it’s the basic design of the [wonderful] HTTP protocol, which is request-response based. Maybe it’s us developers, who find it harder to design and code asynchronously. Maybe it’s because in the early ‘90s, when internet standards were written, running a CGI process on UNIX was the main way to handle HTTP server requests, and we just never wanted to stop supporting those standards… Anyway, I figured out that all the most common servers – Apache, IIS, Java-based web services, the standard .NET stack, Django/Python, PHP, Ruby and almost every other piece of HTTP server out there – are written to run in a synchronous environment. Every request is served either by a new thread or by a thread from a big thread pool. Such a thread executes the request and response stacks, waiting for resources from the DB, from the disk drive, from a memcached service it calls etc. Each time, it goes to sleep waiting for the response from the device to arrive and is context-switched out to give another thread some CPU time. This means that heavily loaded servers spend MUCH of their CPU and memory-bus time switching threads. The simplest server written with (the newly designed) .NET/WCF can have up to 100 threads running on a dual-core processor. The result is that a strong server can serve only a few thousand clients/browsers concurrently. So much CPU time and money is wasted.
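For contrast with the thread-per-request model described above, here is a minimal sketch of the asynchronous alternative. It uses Python’s asyncio as a stand-in for an event loop (it is not Node.JS code), and the 50 ms sleep is an assumed placeholder for a DB, disk or memcached wait. The point it tries to show: one thread can keep many requests in flight, because waiting does not occupy a thread.

```python
import asyncio
import threading
import time


async def handle_request(request_id: int, io_delay: float = 0.05) -> int:
    """Simulate one request that spends its time waiting on I/O."""
    await asyncio.sleep(io_delay)  # yields to the event loop while "waiting"
    return threading.get_ident()   # report which thread served the request


async def serve(n: int) -> list:
    """Run n simulated requests concurrently on the event loop."""
    return await asyncio.gather(*(handle_request(i) for i in range(n)))


def run(n: int = 100):
    """Return the set of thread ids used and the total wall-clock time."""
    start = time.perf_counter()
    thread_ids = asyncio.run(serve(n))
    elapsed = time.perf_counter() - start
    return set(thread_ids), elapsed
```

Calling `run(100)` should report a single thread id and a total time close to one request’s wait, where handling the same 100 requests one after another would take about 100 times as long, and a thread-per-request server would need 100 threads.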
Another issue is the usage of high-level interpreted languages to write the internet. From JSP to PHP to Python, most of the internet is written in scripting languages because they are easy to write and easy to deploy. But everyone knows they run slow. It’s a trade-off development managers make – code fast and get it out the door. They say “We’ll have other ways to speed up the beast after it’s already out” – clustering, stronger hardware, another caching layer etc. Anyway – once the version is out it’s now the problem of the IT guys to meet the SLA. WTF??? Some real efforts were recently made by Facebook to speed up PHP with their release of HipHop – a server add-on that transforms PHP scripts into highly optimized C++ code and then uses g++ to compile it to machine code before it’s run. They say that on average it reduced CPU usage at Facebook by about 50% (!!!) and that WordPress 3.0 runs 2.7x faster under HipHop. Wow! Impressive!
But what if we could solve the synchronous design in a similar way? The problem is we cannot – we’d have to throw away all the code that was ever written and start fresh. That is because asynchronous code cannot call code that blocks, and all that code out there is blocking.
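Why can’t asynchronous code call blocking code? A small experiment makes it visible. This is again a sketch in Python’s asyncio rather than Node.JS, with made-up handler names: the "polite" handler yields to the event loop while it waits, while the "blocking" one calls an ordinary blocking sleep, standing in for any old synchronous library call.

```python
import asyncio
import time


async def polite(delay: float) -> None:
    """Waits without holding the thread; other handlers run meanwhile."""
    await asyncio.sleep(delay)


async def blocking(delay: float) -> None:
    """Calls a blocking function; the whole event loop stalls until it returns."""
    time.sleep(delay)


async def timed(handler, n: int, delay: float) -> float:
    """Run n handlers 'concurrently' and return the wall-clock time taken."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n: int = 10, delay: float = 0.05):
    """Return (time with polite handlers, time with blocking handlers)."""
    fast = asyncio.run(timed(polite, n, delay))
    slow = asyncio.run(timed(blocking, n, delay))
    return fast, slow
```

With the defaults, the polite run takes roughly one delay in total, while the blocking run takes about ten, since each blocking call freezes the single thread that everyone shares. One blocking library call is enough to turn an asynchronous server back into a serial one.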
Enter Node.JS (Start fresh!)
So what can you do with it? Well, basically everything you can do with Python, Perl or Java – client code, server code. But the goal is definitely server code. Blazing fast web servers that can handle 10x more traffic and do it much faster. This can serve not only “regular” browser-based traffic, but can also be utilized to stream music and video, or be used for sharing applications, reverse proxies etc.
There are a couple of things you need if you want a team of developers to start working on your next big thing. First you want a proper development environment (use Eclipse with Google’s V8 plugin); then you need a unit testing framework (use Expresso); an application server with MVC and templating support (use Express); ORM/Hibernate-like tools to ease coding against the DB (see MongooseJS for MongoDB and SequelizeJS for MySQL); a library of utility modules to copy-paste from for almost every basic need (see the NPM Registry with almost 3000 entries << try searching “facebook”); and a cloud-based app engine to deploy your application on, preferably for free (I found 11 such services, but Nodejitsu and Cloud Foundry seem to be the most advanced). Let’s not forget that a strong development community is also very important (NodeJS main newsgroup: ~50 threads/day; ExpressJS group: ~10 threads/day; LinkedIn NodeJS group has 832 members; StackOverflow NodeJS tag: ~13 tagged questions/day). I think we are good to go!
The Node.JS thing is catching fire these days – this Google Trends view clearly shows how fast Node.JS is soaring and that it’s now almost as big as Ruby on Rails. Many new startups are seeing this as a great opportunity and are developing on Node.JS. It fits so well with the lean-startup concept. One coding language for both front-end and back-end >> one developer can write the whole feature, end to end. And no more translations between XML and JSON. Now everything is JSON in all application layers. No more
Some giants have already decided they are joining the party. Microsoft has recently announced it’s going to support Node.JS on Azure (and Visual Studio, for sure). VMware is already supporting Node.JS deployments on their cloud service – Cloud Foundry.
Interesting blog posts if you want to further read –
- 6 months with node (a thank you note)
- Why did NodeJS become popular faster than its peers?
- What it’s like building a real website in Node.js?
- What are the benefits of developing in Node.js versus Python/Django?
p.s. I guess some of the readers of this post are saying “this guy is crazy! He’s taking an immature technology and arguing it should be used in production today. The risk is too high yada yada yada…” I agree. The risk is high. If you don’t have strong devs who can master a new technology and face some difficulties then you should stick with the usual Django/Rails/GWT. If you have strong devs the upside of this technology is great and I think it’s mature enough for most tasks. Especially if you start something new.