Ethan Ram’s geeky blog on the seam of technology and product management.
The last week hasn’t been an easy one… I’ve got a brand new Lenovo X220 – and it’s giving me a hard time. For over 10 years now that I’ve always had a ThinkPad laptop (except for a couple of years with a MacBook – but this story is for another post) and I was always very happy with it. But this time…
Boot is stuck for over a minute between login and getting to see the desktop!
My laptop gets stuck forever 2-3 seconds after I go wireless on my workplace Wi-Fi network!!
Changing writing language too often doesn’t work – I’m getting stuck in Hebrew forever!!!
Fingerprint software is dead…
A PCI port is missing a driver and Windows keeps complaining about it…!
The Lenovo System Updater gets stuck forever when I run it to check maybe I’m missing some updates… errr.
In general – I get to the point that I have to reboot the computer 2-3 times a day… err err errrrr!
…And I’m a software guy, expert with Windows Internals – right? So I must be able to find the cause of these. But hell – it’s a new computer running latest software – I don’t feel like spending day resolving these. Or – maybe return the thing to the IT department and let them break their heads on this (still I will have to take a day off or work on a temp PC that does not have my configuration… bad idea!)
WOW – after a couple of days I hit some major frustration loosing over ½ an hour of work. ((Remember this was actually common in the ’90… but things surly have changed since… or maybe this is only me??))
Then I remembered this slogan from a young and apparently very successful startup from Tel Aviv called Soluto – “Soluto is bringing an end to PC user frustration …killer technology”. They have millions of installations and got so many positive reviews and prizes. It seems that Yishay Green, Roy Karthy and the other guys there are really managing to create a buzz and get some heavy funding. Actually, I remember I even tried one of their early betas after hearing one of their founders talk about his former startup successes… I have to give it a try. Maybe 15 minutes with this and I can save the hours of digging into finding what’s wrong with my PC.
I’m not going to write a full review about my experience. Just a few bits. They have a smooth installer and the UI looks great although it is a bit slow to come up with answers. (I even took some time to help them defining what some of the more obscure apps I’m running on my laptop are – after all it’s much of a community project.)
But then I hit this…
So they found that Google Chrome is running in the boot taking 29 seconds of my boot time. But they cannot do anything about it. WTF?? This is simply wrong. I actually ran a boot profiler myself (SysInternals ProcMon) and Chrome does not run on boot at all… And why do they say they cannot do anything about it? Can’t they help me uninstall it?…
They are offering me to remove some 10 pieces of software from the boot process, most of them with no clear explanation on what they do, and each taking a full 0.1 seconds of my life… Who cares?! But check out this suggestion – “pause it unless you connect to a network on the internet using an Intel wireless network adapter” – Even a pro like me got confused for a minute. This is cryptic Chinese for 99% of the world population. I’m sure that those poor 10% of people who choose to follow their advice and disabled their PC wireless have really got no frustrations now…
So it’s offering me to remove 8 of the 18 plugins I have in my Chrome browser. They say I have 2 Chrome toolbars and six more plugins that I can safely be removed… WHAT??? Toolbars in Chrome?! In Chrome there are no toolbars, like they have in (poor) Internet Explorer; and those 6 plugins I use are very useful and consume somewhere between ZERO to NOTHING processing time. Why do they suggest I remove them? Why do they tell me 26% of users actually disabled their Multimedia Plugin (and forgot about having media playback capabilities in the process)!?
I have a few more examples but I think the point I’m making here was well understood. So I’ll shortly conclude with this strange behavior: it takes Soluto 6 seconds to open its own About dialog… At least in the first time after every boot.
As you can understand it did not find my issues, or help me fix them, so we parted like friends (somewhat of a frustrated PC user friend, though).
OK – OK. I know I’m probably not the average guy and probably I’m not their target customer and/or my new PC is not their target PC because it’s new. I can even think of some cases when I was called-in to help fix a dysfunctional computer where this utility application could actually do some good. Still I felt I’ve wasted time playing with it.
So here is advice for what could be some excellent features for the next Soluto version that may actually make a difference. And if they don’t add them to Soluto, still, my dear readers, you can follow and fix your frustrating computer yourself.
Enough said. I still have a couple of issues on my new laptop to resolve today, although most issues I’ve already found and fixed 🙂
One twists I really liked in Soluto: They have this little tray-icon menu where one can click “My PC Just Frustrated Me”. I’ve clicked it a couple of times not knowing what will happen. It seems to be doing nothing – no dialog opened, no thank-you message. Nothing. Maybe they are just collecting data for their next version or something. Maybe a bug? Strangely, after clicking it I felt somewhat relieved. Like a small steam release.
This is the 4th part in a series of blog posts reviewing several 3rd pty products and services I’ve used in GameGround and my take on them. The basic approach I’m taking here is the applicability of the product for a lean-startup that wants to move fast. In the last post I wrote about Analytics and BI Reporting tools for the marketing team. This post is about Monitoring the health of the system server – for the OPS team. Next in the series – development infrastructure.
Nagios probably is “The Industry Standard in IT Infrastructure Monitoring” as their slogan says. It’s very popular among IT stuff and can be configured to monitor and alerts about up to 40-50 servers. So even a medium size company can use it. It’s free server software – basically a scheduler that executes service checks against installed agents and tests against network devices, reports back the results and raises alerts above predefined thresholds. There’s also a comprehensive list of extensions or plugins written by the community that can be utilized to monitor about anything you’ll ever want.
It’s easy to setup Nagios to watch for server disk-space, CPU and the existence of certain services. The difficult part is to create checks that would alert you if internal parts of the software behave irrational and users are not seeing what they should. E.g. certain transactions do not end in time, server response time for certain requests is going up, users suddenly cannot see their friend’s list etc. These are much harder to watch. To monitor these you’ll need to write code both on your back-end servers – special functions (REST/WSDL) that would do some internal testing and return true/false accordingly. Nagios is able to call such functions periodically and alert if they failed. It’s an evolving process: You’ll see your systm fail without Nagios alerting about it and then add more of those checks till it functions well.
So- It’s wiser to add some testing functionality on design time: plan your server modules to have Nagios testing APIs. You’ll also need to watch that some of your 3rd pty providers are working right: If the A/B testing API you are using is down then your site is probably down too. If your Content Delivery Network is down ppl are not getting to see your website, although everything is functioning on your side.
Nagios – the CONS:
I don’t know of a good alternative. But I would like to see something that combines system health alerts with Syslog analysis and a real-time configurable dashboard. Any ideas?
If you want to have a good insight into what’s actually happening in your servers you must check the different servers’ logs. Getting all the logs from all the servers into one place and automating the search for errors, exceptions and irregularities is key to having a healthy working production environment. First product we checked following warm recommendations from friends was Splunk. It has excellent easy-to-use web-interface and the setup is very easy (assuming that your servers are written and configured to upload syslog/log4 to a central server…). But Splunk is VERY expensive, even for a small server setup like ours they asked for something like $6000/year. The free version is only good for internal testing and running on-top of QA systems. For production you’ll need the enterprise version. It does not make sense to pay that much in a startup… So we checked Kiwi Syslog.
Kiwi Syslog is a relatively small piece of software made by a NZ company. Their main interface is based on a Windows installed client. But they now also have a web-based dashboard that gives you the most important features. It’s easy to setup and work with. It’s cool. And it costs like 2% of Splunk’s cost. Go Kiwi Syslog! Go!
Working with a Content Delivery Network is an important factor in speeding your pages loading time. When we tested before-and-after we saw a dramatic decrease of first-time page load from 3-4 seconds to 2-2.5 seconds for US-based users. With later widgets and pages the load time was about 30% faster. This is a lot! The other reason you’d like to have a CDN is that it’s going to take a large percentage of the traffic from your servers – so you’ll end up having less servers and pay less on traffic.
The basic service a CDN offers is the speeding up of static content (Imgs, CSS, JS files) delivery. The advanced services CDNs offer are media streaming and something called Whole Site Delivery – out of scope for this blog post. For the small site/service you’re going to pay $1000-$2000/month for the basic CDN – it may not be too bad considering the reduced costs on servers and traffic.
If you know you’re going to use a CDN you can write your code and delivery procedures in a way that starting to use a CDN would just be a flip of a config file entry. If you already have a website/service functioning without a CDN you’ll probably need to do some work to separate and version the static files correctly and add proper configuration everywhere. So, with the right design you should be able to integrate with a CDN, change CDN or stop working with a CDN in a matter of minutes.
So the story goes like this: We decided we had to have a CDN because every millisecond of page load time is critical. This was before launching our initial service. We went shopping and were surprised – it seems that most of the bigger CDNs were not willing to work with us at this stage at all. Even the local rep of the local Cotendo (a startup sharing a VC with GameGround) never returned a phone call… Luckily the local rep of Limelight was willing to take the deal and after a couple of weeks on negotiations we switched NO the config and it was working well (we did have a couple of config issues – minor faults on our side)
Q: Should a small lean-startup deploy a CDN as part of their initial release?
A: NO NO NO. It’s expensive and the signing up with the local representative of a CDN will consume too much of your time.
Q: Should a lean-startup write their code with a CDN in mind?
A: Yes! Sure! This will allow you to speed up your site and offload traffic if and when your site/service is showing some signs of success. Coding with a CDN in mind won’t make it slower anyway.
Q: Can you give some hints on how to design it right to work with a CDN?
A: I promise to have a post about it later on… << but if you have a specific Q – ask it in a comment below
Q: Are there no free/cheap alternatives?
A: There are! Check out this post about using Google Apps Engine as a free static data CDN. Also – this post about using DropBox as a free CDN solution. Note that if the delivery of the resources from those unofficial-CDNs is not faster than delivering them from your own site then adding a CDN configuration might actually slow down your site. Be ware!