2015-02-22

The Data Warehouse, a work in progress

Last Friday Petter left the company and the Data Warehouse. He is one in a line of very talented youngsters who over the years have worked with and developed the Data Warehouse. When Ulf Davidsson and I created the Data Warehouse in 2001 I knew it was going to make an impact on the company, but I did not anticipate the success it has become. This is much due to the great guys who replaced me and Ulf. We did a decent job too, but that is another post/story.


Camilla was the first one to join us. She claims that at the job interview I said ‘If you do not know SQL well, you will be smoked. By the way we have a little test for you, you have these tables, now a production planner wants to see the weekly consumption of components from warehouse XYZ. Write an SQL query that shows us this.’ This was not nice, Camilla was not prepared and it was not a trivial task, but she nailed it. Today Camilla is Team Lead for Business Intelligence at the company.
When Ulf left the company, Johan was hired. Johan was a young mathematician with no special IT skills apart from playing computer games and some Matlab knowledge. Johan was a nice and intelligent guy, so we hired him. Johan joined us in the summer and worked a week before two weeks of vacation. I told him he had to learn SQL during the leave, otherwise he had to go. Johan came back not only knowing SQL inside out, but understanding SQL better than anyone I know, after two weeks! He must have been working very hard that vacation. When Johan left, Camilla and I demanded that Petter be transferred from the logistics department. Petter is a mathematician and a civil engineer, he also has his own football team where he is a bone crusher in the defense (Johan is a forward in the team).
When I left, Feven replaced me. She has a solid IT education and is very sharp. You only explain once and she gets it and makes something better of it. After a week she managed the Data Warehouse operations, and soon started to develop BI applications. She was project leader and developer of a new Data Warehouse for a factory in Milan, Italy. Apart from that she did a few PHP hacks in the ETL engine, fixing some old bugs of mine.
When Feven left, Henrik was recruited from another company in the Atlas Copco group. Henrik is a civil engineer and yet another brilliant co-worker of the Data Warehouse. Apart from the normal Data Warehouse operations and development, Henrik is doing a lot of Qlikview development, and some PHP hacks fixing old bugs of mine.
I feel privileged to have worked with these five brilliant guys (Camilla, Johan, Petter, Feven and Henrik). It is a joy and a pleasure to see new people taking something you have created to new levels in ways you never anticipated yourself. I am sure Ulf Davidsson is proud of this progression of the Data Warehouse too.
It might look suspicious that none of these guys stayed long. Camilla is still in the company, but is now responsible for BI, and that is different from being a BI developer. They are all young and talented, so it is just natural that they move around. All of them have made significant contributions to the Data Warehouse. I’m sure there will be a new brilliant member of the BI team replacing Petter, working together with Camilla and Henrik to make the Data Warehouse better.

2015-02-01

The graph now with Qlikview activities


The Data Warehouse monthly Twitter graph is now enhanced with Qlikview user activities.

As you probably know by now, the Data Warehouse is a Business Intelligence data storage system. It is GUI or viewer agnostic, viewers of the users’ choice are welcome. MS Excel is the most popular one, but more and more users prefer Qlikview apps.
This monthly Twitter graph is an attempt by me to show usage of the Data Warehouse, and most usage is covered by the mQuery figure (millions of SQL queries, the lime-green line). But you do not see the Qlikview user activities, since Qlikview has its own proprietary storage. Data Warehouse information is exported to Qlikview, and from there you have to measure activities via Qlikview. This is maybe the biggest drawback of Qlikview: it is a closed environment, once the data is imported into Qlikview you can only access the data via Qlikview. You should not import the data directly from source systems into Qlikview, but use an open storage in between if you would like to use other viewers together with Qlikview.
I have used the calls figures from the Qlikview session logs to illustrate user activities. The dotted dark yellow line represents Qlikview calls in the tens of thousands. Whether this is a good figure to show Qlikview user activities I really do not know, if you have opinions on this please tell me.
You can follow the Data Warehouse tweets here.
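
The calls line described above could be derived along these lines. This is only a minimal Python sketch, not the code behind the graph: the function name `calls_per_month` and the sample sessions are made up for illustration, and it assumes each session log entry gives a session start time and a calls count.

```python
from collections import defaultdict

def calls_per_month(sessions):
    """Sum session calls per month and scale to tens of thousands.

    `sessions` is an iterable of (session_start, calls) pairs, where
    session_start is a 'YYYY-MM-DD hh:mm:ss' string (an assumption).
    """
    totals = defaultdict(int)
    for session_start, calls in sessions:
        month = session_start[:7]          # 'YYYY-MM'
        totals[month] += calls
    # Divide by 10,000 so the line fits the same scale as the other graph lines.
    return {month: total / 10_000 for month, total in sorted(totals.items())}

# Hypothetical sample data, not real log figures.
sessions = [
    ("2014-12-30 08:15:00", 12000),
    ("2014-12-31 09:00:00", 8000),
    ("2015-01-02 10:30:00", 5000),
]
monthly = calls_per_month(sessions)
```

Whether calls is the right measure is still an open question; the same aggregation works for any other numeric column in the session log.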


2015-01-28

Alan Turing the movie

Yesterday I saw The Imitation Game. I liked the movie although it wasn't historically very correct. It was nice to learn about Joan Clarke (played by the lovely Miss Keira Knightley), I had never heard about Miss Clarke and the part she played in breaking the Enigma code before.
It’s a pity there is not very much about the other things Turing did apart from code breaking. He was a phenomenally clever computer builder, I would have loved to see the mercury delay lines used as acoustic high-speed memory in some of the very first computers, and something about his work on morphogenesis. I have read that he pestered people with discussions about the color patterns of cowhide, that could have been illustrated in a movie. It is probably harder to illustrate the fact that he was a darn good programmer.
All in all it’s a good movie that I can recommend.

Not to diminish the work done at Bletchley Park, but Arne Beurling cracked the harder ‘G machine’ (Geheimfernschreiber) with just pen and paper in two weeks, giving the Swedes good access to German secret communications going through Sweden until 1942, when the Finnish military revealed this to the German Wehrmacht.

2015-01-25

Sunday morning

Stockholm 2015-01-25. Cautious swan on thin ice, the ice squeaked with every step.

This morning I replaced the internal Data Warehouse server switch. The old one started to misbehave last week, so I replaced it with one from my private stash. While I was alone in the network I upgraded the Japanese Data Warehouse server to Ubuntu 12.04.5. I still find it cool to administer a server on the other side of the globe.

2015-01-18

PHP 7



Stockholm January 2015, view from the office.


The other day I found and read the PHP RFCs, which include changes already made for PHP 7, the next major version of PHP. There are no really exciting things for me. My use of PHP is very unusual, I use PHP mostly for shell scripting and as an interpreter for my Integration Tag Language, the latter is probably the most awkward use of PHP imaginable. My biggest concern with every new major version of PHP is whether the SAP RFC extensions, SAPNWRFC and SAPRFC, will work. For PHP 7 I’m sure some tinkering will be needed.
My own wishes for PHP are better ways to parse and execute PHP code dynamically. eval is what I use and I just can’t get my head around that instruction, it is trial and error sessions each time I use it. Better parallel execution would be nice, i.e. simpler parallel invocation, as today it is fairly complex to fork children or sub-tasks in your PHP code and communicate between them. Tail call optimization is a feature I would like to see in PHP, since it allows for more efficient recursive code and I like recursion.


PHP Deprecated:  iconv_set_encoding(): Use of iconv.input_encoding is deprecated in scriptC1.php on line 49

If all deprecated code is removed from PHP 7 I have some work to do, especially with UTF-8 related code. I’m in favour of removing deprecated code, let it live over two major versions then remove it. I chose PHP for my Data Warehouse project not because it was the best or most stable language, but because it was new and there was a vibrant community pushing the language forward, PHP looked fun. If removal of deprecated code means simpler maintenance and a smaller footprint for PHP, I do not mind doing some extra work, which will make my code better in the end. You cannot just add new stuff, obsolete code should go away.

2015-01-09

Happy birthday!

On New Year’s Eve I wrote that it seems to be a new year every year. The same is true for birthdays, I grow one year older every year. At my age birthdays are not happy anymore. A quiet dinner with my sons maybe, I do not know yet, we’ll see.
This week religion showed its ugly face again, and once again we see the profiteers of evil. I can watch the hunt for the terrorists more or less live on TV. Why, I ask myself. No matter the reason, it gives the media (people) a hell of a lot more work and (extra) income, self-proclaimed spokesmen lead demonstrations for free speech etc., xenophobes can spread their venom, all profiteers of evil. The other day there was a lady on the radio live from Paris: ‘It is time to ban intolerant opinions’, yes she was serious. Today we have profiteers of evil speculating together with live pictures from Paris on zillions of TV channels. I do not need this.


I usually talk about profiteers of tragedies, you know, all the priests, therapists et al. who show up as soon as there is a tragedy to offer professional help, not so much for those directly affected by the tragedy, but the surrounding masses of vultures. Now we see a more sinister next of kin, the profiteers of evil are coming of age.
My heart does not bleed for those merely remotely affected by calamities, or for the profiteers. My thoughts go to those really affected and those who try to protect us and catch the bad guys.
What is next? Journalists embedded with terrorists?

I had in mind writing something completely different, this is far from what I usually write. But it is my birthday and my blog so what the...

2015-01-07

Importing Qlikview logs into MySQL

I have for a very long time tried to come up with ways to measure and compare Business Intelligence activities. So far I have not come up with anything that holds water. It is complicated just to measure the activity in one BI system alone, comparing two different systems is even more complex. What I have in mind is to capture some figure for all activities in the Data Warehouse, this will at least give an indication of the use of the Data Warehouse.
The stats I capture today are batch jobs and MySQL queries. This is just an indication of the activities, it does not give any hint of the quality or the value of the system.
Qlikview is becoming more and more popular among the Data Warehouse users as a viewer, so I felt it was appropriate to include Qlikview in the overall activities of the Data Warehouse. And this is what this post is about.

When I created The Data Warehouse Movie I had a hard time parsing the QV log. The columns in the log did not seem to be separated, there was just space in between, column entries could include spaces, and missing entries were just missing. This time I hex displayed the log and found that the columns are separated by hex ‘09’, the tab character. It is much simpler to parse the logs with that knowledge. The next hurdle: the Data Warehouse runs on Linux and Qlikview runs on Windows. I did not want to set up any procedures on the Qlikview server, so I decided to grab the log files via a CIFS mount. I created this ITL procedure:
I’m very happy with this procedure. When I started I thought it would be very hard to import the logs, but this is a walk in the park, child’s play!
The second <action> tag in the <init> section specifies what logs should be imported, by specifying:
<action sync='yes' cmd='ls @WINMNT/Sessions_SSCSSEQVS002_2014-*.log > @J_DIR/logs.txt' dir='@J_DIR'/>
I downloaded all Qlikview session logs from 2014 in one go, it took some 140 seconds.
As it is set up now I will schedule this job at 01:00:00 and import yesterday’s logs.
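
For readers who do not use ITL, the tab-separated parsing described above can be sketched in Python. This is an illustration only, not the ITL procedure: the function name `parse_session_log` and the sample log line are made up, and it assumes the first line of each log file is a header row with the column names.

```python
import csv
import io

def parse_session_log(text):
    """Parse a tab-separated session log into a list of dicts.

    Splitting on the tab character (hex 09) keeps spaces inside an entry
    intact, and an empty entry becomes an empty string, so the columns
    never shift. Assumes the first line names the columns.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [fields for fields in reader if fields]   # skip blank lines
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, fields)) for fields in data]

# Hypothetical sample: a document name with a space and a missing ExitReason.
sample = (
    "Document\tAuthenticatedUser\tCalls\tExitReason\n"
    "Sales Overview.qvw\tDOMAIN\\user1\t42\t\n"
)
records = parse_session_log(sample)
```

Each dict can then be mapped to a row in a MySQL table; type conversion (timestamps, integers) is left out of the sketch.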

I still like the Integration Tag Language, it’s simple, succinct and does the job. It should not be hard to read and understand the procedure.
The task that took the longest time was defining the MySQL table:

CREATE  TABLE IF NOT EXISTS qvlog
 (`ExeType` char(5),
 `ExeVersion` char(20),
 `ServerStarted` timestamp,
 `Timestamp` timestamp,
 `Document` varchar(200),
 `DocumentTimestamp` timestamp,
 `QlikViewUser` char(12),
 `ExitReason` varchar(64),
 `SessionStart` timestamp,
 `SessionDuration` time,
 `CPU` int unsigned,
 `BytesReceived` int unsigned,
 `BytesSent` int unsigned,
 `Calls` int unsigned,
 `Selections` int unsigned,
 `AuthenticatedUser` varchar(30),
 `IdentifyingUser` varchar(30),
 `ClientMachine` char(56),
 `SerialNumber` varchar(32),
 `ClientType` varchar(64),
 `ClientVersion` char(10),
 `SecureProtocol` char(3),
 `TunnelProtocol` char(3),
 `ServerPort` int unsigned,
 `ClientAddress` int unsigned,
 `ClientPort` int unsigned,
 `CalType` char(16),
 `CalUsageCount` varchar(25),
 Primary key (`SessionStart` , `AuthenticatedUser`)
 );
I went through the table definition by trial and error a few times until loading was OK. If you happen to know the proper table definition please drop me a line.

Now you may say ‘This seems to be a bit awkward, why download Qlikview logs to MySQL, why not use Qlikview?’. That is a good question. We already have a Qlikview app for the logs, but I have the Data Warehouse twittering app written in ITL and all the other stats in MySQL, so I thought it would be nice to have the Qlikview stats in MySQL too.
If and when I have verified the downloaded data and implemented it in some app I will probably write a second post.