In principle identical, fundamentally different

The data warehouse is a very robust piece of software. It has its rough spots and edges, but it hardly ever fails. So it's not surprising that Henrik, the newest in the line of bright youngsters working in the Data Warehouse, has not been exposed to bugs before. But now he has found a gronk: a huge SAP data extraction fails silently, leaving no traces and no data. I looked at the problem and it looked like a no-brainer, so I just added a few tracing statements in my development environment, and everything worked well: I got about seven million rows from SAP. So I assumed I had fixed the bug in the past but never moved the module to the production environment. I moved the development module into production but forgot to test, which of course had consequences: some hundred extraction jobs failed that night. We had to roll back to the original version of the script and Henrik had to rerun the failed jobs. Later, when Henrik discussed the bug with Camilla (the BI team lead), he said "It's a bit puzzling, it works in the development environment and Lars told me it is in principle identical with the production environment." Camilla burst into laughter: "When Lars says 'in principle identical', he means fundamentally different".
I have had another look into this bug and it looks like we have a problem with the str_replace function of PHP. With a bit of luck I will hunt this bug down during the weekend. I have promised not to touch the production environment though.

Many years ago, some years after I left the company the first time, I had a chat with the IT manager and he said "You are great and you have built us the best production environment for our ERP systems in the world, but the environment has stabilized a lot since you left".
It is very hard to combine innovative, stable and low cost. You can pick any two, but you cannot pick all three. When I started the Data Warehouse project I decided to go for innovative and low cost. The first because that’s what I am, the second out of necessity: I didn’t have any funds. But still the Data Warehouse is remarkably stable as long as I do not interfere.


The log comes in colors like a rainbow

The log of the data warehouse ETL engine has been black and white up until now. Some years ago I added colors to the log, but the same text string printed to the console was also written to the archive log, where the color attributes looked awful. A few weeks ago, while jogging early one morning, I got the idea of filtering out all attributes with a regular expression.
This is the original code writing the console log message to the archive log:
fwrite($this->logHandle, "$lmsg");
This regular expression sifts out all color attributes:
fwrite($this->logHandle, preg_replace('/\033\[[\d;]+m/', '',"$lmsg"));
Simple as that:)
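To see the filter at work outside the ETL engine, here is a minimal standalone sketch. The function name stripAnsiColors and the sample message are made up for illustration and are not part of the real engine; only the regular expression is taken from the line above.

```php
<?php
// Hypothetical helper, not from the ETL engine: removes ANSI color attributes
// of the form ESC [ <digits and semicolons> m from a log message.
function stripAnsiColors(string $text): string
{
    // Same pattern as in the fwrite() call above.
    return preg_replace('/\033\[[\d;]+m/', '', $text);
}

// Made-up sample message: bold green text followed by a reset attribute.
$lmsg = "\033[1;32mJob finished\033[0m in 12 s\n";

echo $lmsg;                  // console version, with color attributes
echo stripAnsiColors($lmsg); // archive version, plain text
```

Note that the pattern requires at least one digit or semicolon, so a bare reset written as ESC [ m would pass through unfiltered. That should only matter if the log ever emits that short form.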

This is a black and white example of the console log:

And this is the color version:

This is not the most useful hack I have done and it isn't exactly a rainbow, but the console log is easier to read.


Can you explain me

As I have written before, the 'company language' is English. This means the most used language is funny English, since only a few of us have English as our mother tongue. I both write and speak funny English, as I'm doing right now. Funny English is a language with many dialects. One grammar rule I think stems from central Europe (German, Flemish, Dutch) is to omit the introductory 'to' when you have a transitive verb with an indirect object. E.g. the proper English 'can you explain to me ...' becomes 'can you explain me ...'.
To me as a Swede, omitting 'to' sounds weird, but it has become so common at the company that I have heard native English speakers yield to this funny English grammar rule. I suppose this is one way languages evolve. I'll bet my 5 cents the introductory 'to' will disappear from English in the next 100 years.


Do not edit!

Sometimes I feel things get overworked.

The other day I made a big mistake when upgrading an Ubuntu server running PhpMyAdmin from version 12.04 to 14.04. I got the question ‘Do you want to keep the PhpMyAdmin config or replace it with a new config?’. I replied ‘replace with new’, and that was almost kissing PhpMyAdmin goodbye.
As I recalled it, the PhpMyAdmin config was a simple file that took a few minutes to fill in, and then you just restarted Apache. But not so anymore; in Ubuntu it seems PhpMyAdmin is scattered all over. I found several config.inc.php files and tried to configure PhpMyAdmin by updating them one at a time, but none of these updates worked. Finally I gave up and edited the file /usr/share/phpmyadmin/libraries/config.default.php. You see the header of the config.default.php file above.
I do not see the problem with having one configuration file. In fact that is how it’s supposed to be: one simple text file where you enter simple configuration directives to adapt the software to your environment and liking. Splitting a simple configuration file into many just makes the configuration process harder, not simpler. To me it seems someone forgot what configuration is about and went over the top to create the perfect configuration process. Often I find less is more.


The Data Warehouse, a work in progress

Last Friday Petter left the company and the Data Warehouse. He is one in a line of very talented youngsters who over the years have worked with and developed the Data Warehouse. When Ulf Davidsson and I created the Data Warehouse in 2001, I knew it was going to make an impact on the company, but I did not anticipate the success it has become. And this is much due to the great guys who replaced me and Ulf. We did a decent job too, but that is another post/story.

Camilla was the first one to join us. She claims that at the job interview I said ‘If you do not know SQL well, you will be smoked. By the way we have a little test for you: you have these tables, now a production planner wants to see the weekly consumption of components from warehouse XYZ. Write an SQL query that shows us this.’ This was not nice. Camilla was not prepared and it was not a trivial task, but she nailed it. Today Camilla is Team Lead for Business Intelligence at the company.
When Ulf left the company, Johan was hired. Johan was a young mathematician with no special IT skills apart from playing computer games and some Matlab knowledge. Johan was a nice and intelligent guy, so we hired him. Johan joined us in the summer and worked a week before two weeks of vacation. I told him he had to learn SQL during the leave, otherwise he had to go. Johan came back not only knowing SQL inside out, but understanding SQL better than anyone I know, after two weeks! He must have been working very hard that vacation. When Johan left, Camilla and I demanded that Petter be transferred from the logistics department. Petter is a mathematician and a civil engineer. He also has his own football team where he is a bone crusher in the defense (Johan is a forward in the team).
When I left, Feven replaced me. She has a solid IT education and is very sharp. You only explain once and she gets it and makes something better of it. After a week she managed the Data Warehouse operations, and soon she started to develop BI applications. She was project leader and developer of a new Data Warehouse for a factory in Milan, Italy. Apart from that she did a few PHP hacks in the ETL engine, fixing some old bugs of mine.
When Feven left, Henrik was recruited from another company in the Atlas Copco group. Henrik is a civil engineer and yet another brilliant co-worker of the Data Warehouse. Apart from the normal Data Warehouse operations and development Henrik is doing a lot of Qlikview development, and some PHP hacks fixing old bugs of mine.
I feel privileged to have worked with these five brilliant guys (Camilla, Johan, Petter, Feven and Henrik). It is a joy and pleasure to see new people taking on something you have created to new levels in ways you never anticipated yourself. I am sure Ulf Davidsson is proud of this progression of the Data Warehouse too.
It might look suspicious that none of these guys stayed long. Camilla is still in the company, but she is now responsible for BI, and that is different from being a BI developer. Since they are all young and talented, it is just natural that they move around. All of them have made significant contributions to the Data Warehouse. I’m sure there will be a new brilliant member of the BI team replacing Petter, working together with Camilla and Henrik to make the Data Warehouse better.


The graph now with Qlikview activities

The Data Warehouse monthly twitter graph is now enhanced with Qlikview user activities.

As you probably know by now, the Data Warehouse is a Business Intelligence data storage system. It is GUI or viewer agnostic; viewers of the user's choice are welcome. MS Excel is the most popular one, but more and more users prefer Qlikview apps.
This monthly twitter graph is my attempt to show usage of the Data Warehouse, and most usage is covered by the mQuery figure (millions of SQL queries, the lime-green line). But you do not see the Qlikview user activities there, since Qlikview has its own proprietary storage. Data Warehouse information is exported to Qlikview, and from there you have to measure activities via Qlikview. This is maybe the biggest drawback of Qlikview: it is a closed environment, and once the data is imported to Qlikview you can only access it via Qlikview. You should not import data directly from source systems into Qlikview, but use an open storage in between if you would like to use other viewers together with Qlikview.
I have used the calls figures from the Qlikview session logs to illustrate user activities. The dotted dark yellow line represents Qlikview calls in the tens of thousands. Whether this is a good figure for showing Qlikview user activities I really do not know. If you have opinions on this, please tell me.
You can follow the Data Warehouse tweets here.