Monday, 29 September 2008

The Missing Link in Business Evolution

At a company gathering in the beautiful highlands we were discussing the role data analysis plays in a company's value proposition.  So many companies fail to understand the value of their networking database, their email trail and other important client interactions.  Making better use of that data would allow them to give better value to the client, potentially market more efficiently and, on the bottom line, make more money.

Not only is it important to be able to spot trends in the data, it is also important to build a process around that information which will augment the business processes already in place.  Making efficient use of your network and company information is what separates an OK business from a fantastic one, as you will be able to work more efficiently and measure your success - a very important attribute in any size of business.

The big question for you all today is - what data is lying dormant in your company?

Thursday, 25 September 2008

Knowing the odds

Today The Register ran a great article on cheating in online poker, which was only discovered through some very interesting data mining.  It highlights the importance of analysing data not only for trends but also for events which are so unlikely that they may point to something wrong in your business.  In this day and age everything is monitored, from sales to emails, so there is little excuse for not doing some simple analytics on these figures.
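As a rough illustration of hunting for "so unlikely" events, here is a minimal Python sketch.  It is not the method used in the poker case; it is one common approach (a modified z-score built on the median), and the function name, the 3.5 threshold and the daily sales figures are all assumptions made up for the example.

```python
from statistics import median

def unlikely_events(values, threshold=3.5):
    """Return (index, value) pairs that sit far from the rest of the data.

    Uses the modified z-score, which is based on the median and the median
    absolute deviation (MAD), so one extreme value cannot hide itself by
    inflating the spread. The 3.5 threshold is a conventional rule of thumb.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return [(i, v) for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical daily sales counts with one suspicious day.
daily_sales = [102, 98, 105, 97, 101, 99, 103, 100, 250, 96]
print(unlikely_events(daily_sales))
```

A flagged value is only a prompt to go and look; whether it is fraud, a data entry slip or a very good day is down to context.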

Let us take sales as an example.  In this digital age it is getting harder just to pick up the phone and speak to someone, but it is still important to focus on your key performance indicators.  The thing to do is to write down all the information that is available to you, for instance the time an email was sent, the time the reply arrived, who it was to, and so on.  You can then write down the questions you want to answer, for instance: who is the slowest customer to reply?  Who never replies?  If I send an email to a customer at 5PM on a Friday, what are the chances I will get a reply?  These are important questions because the answers let you hone your sales process so that contact is made at the right point, and you know when you should be chasing and when you should be waiting.
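To make the email questions concrete, here is a small Python sketch.  The log format, the customer names and both function names are hypothetical; in practice the rows would come from your mail system or CRM export.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

FMT = "%Y-%m-%d %H:%M"

# Hypothetical log: (customer, time sent, time reply arrived or None).
EMAIL_LOG = [
    ("Acme", "2008-09-19 17:00", "2008-09-22 09:30"),
    ("Acme", "2008-09-23 10:00", "2008-09-23 11:00"),
    ("Bolt", "2008-09-22 09:00", "2008-09-22 09:45"),
    ("Bolt", "2008-09-19 17:00", None),
    ("Cord", "2008-09-24 14:00", None),
]

def reply_stats(log):
    """Per customer: emails sent, replies received, median hours to reply."""
    sent = defaultdict(int)
    hours = defaultdict(list)
    for customer, sent_at, replied_at in log:
        sent[customer] += 1
        if replied_at is not None:
            delta = datetime.strptime(replied_at, FMT) - datetime.strptime(sent_at, FMT)
            hours[customer].append(delta.total_seconds() / 3600)
    return {
        c: {
            "sent": sent[c],
            "replied": len(hours[c]),
            "median_hours": median(hours[c]) if hours[c] else None,
        }
        for c in sent
    }

def late_friday_reply_rate(log):
    """Share of emails sent on a Friday at 5PM or later that got a reply."""
    outcomes = []
    for _, sent_at, replied_at in log:
        when = datetime.strptime(sent_at, FMT)
        if when.weekday() == 4 and when.hour >= 17:  # Monday is 0, Friday is 4
            outcomes.append(replied_at is not None)
    return sum(outcomes) / len(outcomes) if outcomes else None

for customer, stats in reply_stats(EMAIL_LOG).items():
    print(customer, stats)
print("Late Friday reply rate:", late_friday_reply_rate(EMAIL_LOG))
```

A customer with "replied" at zero answers the "who never replies?" question, and the largest "median_hours" points at the slowest responder.  With only a handful of emails per customer these figures are hints rather than proof.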

In this brave new world information is king, and whether you are a large or small company it is important that you use every aspect of the information you have to your advantage.  Your task for today is to write down all those processes in your business which you could measure or have questions about.  Link the questions with the measurements and then find a way to get those figures.  Then, with a little intuition, set the context for those figures - context is KING in statistics.

Good luck!

Wednesday, 24 September 2008

The wonder of you!

I think it was the late Elvis Presley who sang "The Wonder Of You" (it may of course have been just a line), but I came across a fantastic technology last night which made me think - that is a great piece of work.

Many of the interfaces we build are web-based, as we like to combine the best of breed of all languages for whatever purpose we are putting them to.  I had my CMS in PHP and my R server both happily chugging away, but I was not satisfied.  They needed to talk to one another.  Initially I picked up the client (written in Java) and started the lengthy translation to PHP, but after about an hour I thought - why?  There has to be a better way.

This is where a little Googling can get you a long way, and I came across the PHP-Java-Bridge software - fantastic!  A quick compilation, a scratch of the head over conflicting php.ini files and Andrew's my uncle - there it was, a beautiful piece of PHP running through a JVM, answering the calls of a wild R server.

All I can say is my faith in code reuse is restored and I went to bed a happy man.  It brought home to me a very serious point, though: in business you always have legacy code which most developers will tell you to "rewrite", "re-engineer" or some other "re-" word which crosses their mind at the time.  The real answer is that, in this day and age, if you have been careful about your previous technology choices (i.e. open standards, open formats and mature languages), you can normally find a way to continue to use old software behind a new facade, saving you time and, as a value proposition, saving you $$$.

Please write and tell us about your code rewriting experience for your business - was it expensive, do you regret doing it, what would you do differently?

Tuesday, 23 September 2008

The malaise of Excel

When our team first chose to use the Open Source product R for the majority of our work we did not realise the hornet's nest of problems it might create.  This was not due to the application but to the assumptions we brought to it.  Even though we had many years of experience working in C, Java and numerous other languages, the first problem that knocked us over was floating point arithmetic.

For those of you who are not sure what I mean, take the following question: is SQRT(2)*SQRT(2) == 2.0?  All the Excel users out there will cry "Yes", but those of you used to seeing the post "See FAQ 7.31" will know that applications such as R and S-Plus do not see the world in the same way and thus return FALSE.

The reason is that inside your computer you have a limited number of bits with which to represent a floating point number, so a value such as SQRT(2) can only be stored approximately and the small error survives the multiplication.  You can find more on floating point representations in the paper David Goldberg (1991), “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, ACM Computing Surveys, 23/1, 5–48 (courtesy again to the R FAQ).

So what is the (decimal) point of it all?  Well, when dealing with applications that know about IEEE 754-1985 you can normally set the desired tolerance for a comparison (in R this is normally an argument called tolerance, for example in all.equal).  My personal favourite is to check the difference against a threshold figure such as 0.00001.  But you should always be on the lookout for such precision problems.  A note on the difference between accuracy and precision I will leave for another day, but you get the idea.
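For anyone who wants to try this at home, here is a short sketch of the same comparison in Python rather than R.  math.isclose is the standard library's tolerance-based comparison; nearly_equal is a hypothetical helper written for this post to show the fixed threshold idea.

```python
import math

product = math.sqrt(2) * math.sqrt(2)

def nearly_equal(a, b, threshold=0.00001):
    """Fixed-threshold comparison: true when a and b differ by less than threshold."""
    return abs(a - b) < threshold

print(product == 2.0)              # exact comparison: False
print(math.isclose(product, 2.0))  # relative tolerance, default 1e-09: True
print(nearly_equal(product, 2.0))  # fixed threshold of 0.00001: True
```

A fixed threshold suits figures of a known, modest size; for very large or very small numbers a relative tolerance such as the one in math.isclose is usually the safer choice.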

So finally, don't complain when applications are built to properly represent your floating point numbers; recognise that they are doing their job and praise the writers for sticking to standards (again, a topic for another time!).

Happy number crunching!

Monday, 22 September 2008

Welcome

Welcome everyone to the new Gulfstream Software Blog.  The goal of this blog is to communicate some of the exciting events and tools available in the area of data research.  Our own groove is using a mixture of Proprietary and Open Source software to help make your everyday business processes more efficient.

Our simple motto is "use what does the job best": simple and effective.  Over the last year and a half we have developed a number of tools in interesting areas such as deterministic clustering, Psycho Search Tools and other exotic areas, which we hope our customers will find useful, but this blog is more about helping you with fantastic hints and tips to aid your everyday work.

So I hope you will subscribe to this blog; you can expect a new article once a week.

Thanks for reading,

The Gulfstream Team