2012-07-24

My summer of code - 2

Just south of Sète, France

Each summer vacation if I do not have anything better to do, I try study a subject in some detail, something I can use in my daily  work. I wrote a post about   my Summer of Code . This year I decided to study computer language  ‘D’ , but unfortunately I found better things to do. A summer vacation which I have blogged about   here ,   here  and   here . Back at the office I had to clean up the mess I thought I had created during the vacation, fortunately this time  I was not to blame . Then I have to replace some   hardware  and create a   data warehouse for Fuji Air Tools factory .  

I also have to do some web programming, we may need an Excel sheet like  grid application with the capability to communicate bi-directionally with SAP. If you have followed my blogging you know I’m not a good  experienced web programmer. I have with the help of my friend Andreas  decided to learn the   Sencha/ExtJS  framework to jumpstart my web capabilities.

Up in the mountains, Axel and Erik. Down behind is a pit hundreds of meters deep.

So much for my Summer of Code 2012. I have to postpone ‘D’. When I have time for ‘D’ then I probably will choose   Perl6 , it looks like Perl6 begins to be ready for production in some not too distant future. At the moment I doubt I will have time for Sencha/ExtJS, but we’ll see.

I pray next year I have time for ‘D’ and Perl6, or better the year after. I like to go down to France next year again. I’m beginning to be a francophile like my son Axel. This country has it all from skiing to beach life. And so much more. I have some colleagues in Nantes, maybe I should ask if they have some work for me.

2012-07-21

Aftermath of vacation

During   my vacation , our Business intelligence system was hit by two severe crashes, the first one   currency rate load failure  I have blogged about, but we also had failed to load project costs from SAP. Two days before my vacation I enhanced the   job iterator  functionality in my job scheduling system.  While trying to debug the currency rate problem on the German Autobahn on my way to France, I got a call from the office telling me there is also a problem with project costs. From the info I got I guessed it must have something to do with job iterators, the costs where to high 17 times to be exact. And the job that failed had a job iterator with 17 rows. I could not for my life figure out what the problem was and I had done extensive testing of my changes, but  if you finds bugs look at the last changes and 17 was also pointing  at my job iterator, so still on the road I backed out my changes, but the cost problem remained.
Back at the office we found out  the problem had started before I made the changes, it was another job that was the culprit and not all projects were affected. Since all updates are done with SQL we suspected unwanted duplicate rows in some involved table. During my absence two colleagues had analyzed the problem and they had isolated the problem to SAP table EKPO (purchasing document items). So we checked our ‘copy’ of the EKPO table and found no indexes. EBELN (Purchasing Document) and EBELP (PD item) should be primary index in our EKPO table. By the query select count(*),EBELN, EBELP from EKPO group by EBELN, EBELP having count(*) > 1 , we found SEVENTEEN  invalid rows with blank EBELN, EBELP (these rows were inserted into SAP a few days before my vacation). We removed the offending lines, inserted the primary index and reloaded the costs, now right again to the last cent.
This crash could have been avoided if we have had proper input checks and correct indexes on the database table but we do make mistakes, ‘those who do not do mistakes tend not to do anything at all’. And as always, shit happens at the worst possible time. Murphy rules.
   
One interesting observation, the execution time for the failing SQL increased from 2 minutes to over 8 hours with these 17 bad EKPO rows! I end this post with the failing SQL.
 
INSERT INTO WBS_TABLEB_temp1  (
                            SELECT
                            'WBS', '1', A.PSPID, A.POST1, B.POSID, C.PERIO,
                            CASE
WHEN LEFT(CONCAT(substring(B.POSID,9),' ','/',B.POST1),2) = ' /' THEN 'Internal Order & Unassigned'
WHEN LEFT(CONCAT(substring(B.POSID,9),' ','/',B.POST1),5) IN ('03-01','03-02') THEN CONCAT(substring(B.POSID,9),' ','/',B.POST1,' ','Lab')
                                   ELSE CONCAT(substring(B.POSID,9),' ','/',B.POST1)
                            END AS SufixName,
                            CAST(I.COST_ELEMENT AS SIGNED),
                            COALESCE(I.DESCR_LONG, C.KSTAR, 'NO NUMBER EXISTS') AS AccountDescription,
                            C.WOGBTR,  C.MATNR,
                            COALESCE(E.MAKTX, C.MATNR, 'NO NUMBER EXISTS') AS MATDESCR,
                            C.EBELN,
                            COALESCE(F.TXZ01, C.EBELN, 'NO NUMBER EXISTS') AS PurchOrderText,
                            COALESCE(H.SNAME, C.PERNR, 'NO NUMBER EXISTS') AS Person,
                            C.PERNR, C.MBGBTR, C.MEINH, A.PRCTR, C.FKBER,
                            CONCAT(RIGHT(CAST(C.GJAHR AS CHAR),2),LPAD(CAST(C.Perio AS CHAR), 2, '0')) AS Period,
                            C.BUKRS, B.PSPNR, C.GJAHR, B.FKSTL, C.REFBN, C.BUDAT, A.VERNA, A.ASTNA, A.PROFL,
                            COALESCE(L.TypeDescr, L.TypeNr, 'NO NUMBER EXISTS') AS ProfileDescr,
                            B.USR00, C.PAROB, C.PAROB1, C.USPOB, I.CON_AREA,
                            I.SETNAME AS SetName, I.SUBSETNAME as SubSetName,
                            I.SUBSETDESCR as AccountGroup,I.DESCR_LONG AS AccountConcat,
                            C.PAROB, C.CPUDT, C.BLDAT, C.BLTXT, C.BELNR, C.BLART, C.SGTXT,C.GKONT, C.USNAM,
                            C.WSDAT,C.PFKBER, C.PSCOPE,C.WERKS,C.BUDAT, C.AWTYP,C.AWREF_REV,
                            C.CPUTM,M.CRI,M.DOC_ID, H.KOSTL,
                            CASE   WHEN C.REFBN IS NULL OR M.CRI IS NULL OR M.REF_DOC_NR IS NULL
                                   THEN 'No' ELSE 'Yes'
                            END AS PDF_Document
                            FROM ACTAPRJ.PROJ A
                            RIGHT JOIN ACTAPRJ.PRPS B
                            ON A.PSPNR = B.PSPHI
                            LEFT JOIN ACTAPRJ.COVP C
                            ON B.OBJNR = C.OBJNR
                            LEFT JOIN ACTAPRJ.CSKU D
                            ON C.KSTAR = D.KSTAR
                            AND D.SPRAS ='E'
                            LEFT JOIN ACTAPRJ.MAKT E
                                ON C.MATNR = E.MATNR
                            LEFT JOIN ACTAPRJ.EKPO F
                            ON C.EBELN = F.EBELN
                            AND C.EBELP = F.EBELP
                            LEFT JOIN ACTAPRJ.PA0001 H
                            ON C.PERNR = H.PERNR
                            AND H.ENDDA > NOW()
                            LEFT JOIN ACTAPRJ.DB3FTM_SET_STRUCTURE_PS I
                            ON C.KSTAR = I.COST_ELEMENT
                            LEFT JOIN ACTAPRJ.PROJECT_TYPE L
                            ON RIGHT(A.PROFL,2) = L.ID
                            LEFT JOIN ACTADW.DB3FTM_EXTERNAL M
                            ON C.REFBN = M.REF_DOC_NR
                            AND M.REF_DOC_NR <> ''
                            WHERE C.KSTAR NOT IN ('9999210000','9999210003','9999210004','0009004902')
                            )
                            UNION ALL
                            ( SELECT
                            'IO', '1', A.PSPID, A.POST1,
                            COALESCE(RIGHT(E.AUFNR,10), E.PSPEL, 'NO NUMBER EXISTS') AS OBJECT,
                            C.PERIO,
                            CASE
WHEN LEFT(CONCAT(substring(B.POSID,9),' ','/',B.POST1),2) = ' /' THEN 'Internal Order & Unassigned'
                                           ELSE CONCAT(substring(B.POSID,9),' ','/',B.POST1)
                            END AS SufixName,
                            CAST(I.COST_ELEMENT AS SIGNED),
                            COALESCE(I.DESCR_LONG, C.KSTAR, 'NO NUMBER EXISTS') AS AccountDescription,
                            C.WOGBTR, C.MATNR,
                            COALESCE(K.MAKTX, C.MATNR, 'NO NUMBER EXISTS') AS MATDESCR,
                            C.EBELN,
                            COALESCE(F.TXZ01, C.EBELN, 'NO NUMBER EXISTS') AS PurchOrderText,
                            COALESCE(H.SNAME, C.PERNR, 'NO NUMBER EXISTS') AS Person,
                            C.PERNR, C.MBGBTR,C.MEINH, A.PRCTR, C.FKBER,
                            CONCAT(RIGHT(CAST(C.GJAHR AS CHAR),2),LPAD(CAST(C.Perio AS CHAR), 2, '0')) AS Period,
                            C.BUKRS,B.PSPNR,C.GJAHR,B.FKSTL,C.REFBN,C.BUDAT, A.VERNA, A.ASTNA, A.PROFL,
                            COALESCE(L.TypeDescr, L.TypeNr, 'NO NUMBER EXISTS') AS ProfileDescr,
                            B.USR00,C.PAROB,C.PAROB1,C.USPOB, I.CON_AREA,
                            I.SETNAME AS SetName,I.SUBSETNAME as SubSetName,
                            I.SUBSETDESCR as AccountGroup,I.DESCR_LONG AS AccountConcat,
                            C.PAROB, C.CPUDT,C.BLDAT,C.BLTXT,C.BELNR, C.BLART, C.SGTXT, C.GKONT,
                            C.USNAM, C.WSDAT,C.PFKBER, C.PSCOPE,C.WERKS,C.BUDAT, C.AWTYP,
                            C.AWREF_REV, C.CPUTM, M.CRI,M.DOC_ID,H.KOSTL,
                            CASE
                                     WHEN C.REFBN IS NULL OR M.CRI IS NULL OR M.REF_DOC_NR IS NULL
                                     THEN 'No' ELSE 'Yes'
                            END AS PDF_Document
                            FROM ACTAPRJ.PROJ A
                            RIGHT JOIN ACTAPRJ.PRPS B
                            ON A.PSPNR = B.PSPHI
                            RIGHT JOIN ACTAPRJ.AUFK E
                            ON B.PSPNR = E.PSPEL
                            RIGHT JOIN ACTAPRJ.COVP C
                            ON E.OBJNR = C.OBJNR
                            LEFT JOIN ACTAPRJ.CSKU D
                            ON C.KSTAR = D.KSTAR
                            AND D.SPRAS ='E'
                            LEFT JOIN ACTAPRJ.MAKT K
                            ON C.MATNR = K.MATNR
                            LEFT JOIN ACTAPRJ.EKPO F
                            ON C.EBELN = F.EBELN
                            AND C.EBELP = F.EBELP
                            LEFT JOIN ACTAPRJ.PA0001 H
                            ON C.PERNR = H.PERNR
                            AND H.ENDDA > NOW()
                            LEFT JOIN ACTAPRJ.DB3FTM_SET_STRUCTURE_PS I
                            ON C.KSTAR = I.COST_ELEMENT
                            LEFT JOIN ACTAPRJ.PROJECT_TYPE L
                            ON RIGHT(A.PROFL,2) = L.ID
                            LEFT JOIN ACTADW.DB3FTM_EXTERNAL M
                            ON C.REFBN = M.REF_DOC_NR
                            AND M.REF_DOC_NR <> ''
                            WHERE C.KSTAR NOT IN ('9999210000','9999210003','9999210004','0009004902') AND
.PSPID != ' ‘ );

2012-07-17

Annoying statistics

I have been looking at my blog statistics and found a most annoying thing. In my series Job Scheduling with PHP  almost no one  reads  the last and most interesting post Job Scheduling with PHP - 4. The obvious interpretation is you my dear reader do not find the first three parts interesting enough to read the final post. In future I have to improve my writing or write about more interesting stuff. But that is what my blogging exercise is about learn to be a better blogger.

Family blogging


Some days ago I learned I’m not the only one in the family with a blog,  I got a blogging niece Julia. My son told me I figured in a post as ‘her senile uncle that knows it all’. I tried to resist the temptation to look but this morning I tried to find the post. I could not find that post but I found another interesting post, she mention she is hours too early at airports, as there always something going wrong when she or her mother is flying. She thinks this is in her genes. Read the last paragraph of this post. It’s in the family genome. I’m a bit surprised as her mother (my sister) is a well-structured headmaster, I always thought my airport problems  was due to I’m a bit offhand and absent-minded. It feels better knowing it’s in the genes, there is nothing I can do about it.

As Julia’s uncle I have to comment her blog. It is not what I expected. Is this really my niece who graduated from college with absolute top scores? I knew she works as a part time paparazzi and likes to party. But this looks more than part time. And when is she going to start studying at KTH? I’m sure Julia will do fine in future too, she is not only smart, she is also very tough and competitive.

 I suspect Julia is a more successful blogger than I am. I thought I had 1000 visits a week ago but that was not so, I have had 1000 page views since the start. Julia probably have more page views in a day. At next family gathering I should ask her for advice, how to make my blog more attractive.

2012-07-16

Always online

Always online is a reality, we were four people on the road with two cellphones, two smartphones, two Ipads, three PC notebooks, one 3G modem, one Apple notebook and an Amazon Kindle. We had constant access to the Internet, friends and work during the entire trip. All of us liked it, (for those who do not like this new connected lifestyle there is an off-button on all devices). Being hooked up to the web simplifies a lot of things when you are on vacation like ordering car spare parts or finding hotel for the night along the way.
In 1995 I was together with six others on a boat outside Stockholm all with cellphones at that time we realized in future everyone would have a cellphone and be accessible 24X7. But that four people on a vacation  road trip 2012 should have two cell phones, five Internet connected devices plus a tracking device with a complete European map and some extra computers, was beyond our imagination.
I’m home again after an unusual  vacation. A road trip together with my two sons and the oldest son’s girlfriend, for two weeks it’s sounds like a recipe for a   Norén  family drama.(As you may deduct from the picture of Norén he is not in for comedy, actually I do not like his public image or his plays.) But the four of us got along just fine. I only flipped out once or twice driving in towns. I hate driving in towns I do not know, my sense of directions is not bad, it’s nil . (According to my sons, my world is a Rubrik’s cube constantly twisting.) We had bought a GPS navigator, Navigon 72 Easy  for this trip, this little wonder box proved to be invaluable to find the way in Central Europe, but it had its flaws, it hung when needed the most inside towns, and it sometimes points out odd routes in towns, but on the whole it was great guide. But had I known what I know now, I had spent another 100€ on a better more online navigator.  This navigator has a peculiar habit of jumping between roads, e.g. I had been northbound on Autobahn A1 for some 50 km when the navigator  jump over to a parallel country road and suggested I should make a U turn if possible. During this trip I have been following   Carl Mäsak’s  posts on making an adventure game in Perl6 , there you are only allowed to walk between rooms if there is a crossing, this logic is missing in my navigator.
It took us four days to reach our final destination Gignac in France. On the way south we stopped in København in Denmark,  Dijon and Montpellier in France, three very nice towns you should visit if you have the chance. We hit the Hamburg evening rush hours and that is something you should avoid, apart from that the trip went smoothly. We had to change a wheel bearing in Malmö. Just outside Stockholm we noticed a malicious sound from left the rear wheel,  it had to be the wheel bearing so we called a   garage in Malmö  and asked them to be prepared to replace the bearing when we arrived there. While we had lunch they repaired the car and an hour later we drove on to France. My car a Volvo V50 performed great, in Germany we found out 170km/h was optimum cruising speed. With modern IT gear you can do a lot. With some Ipads and smartphones it was easy to find a garage and without even slowing down arrange a repair. Later during the trip we reserved hotel rooms via hotels.com using the same IT gear. I also invented   speed-debugging  with my PC and a 3G modem while my son tested optimal cruising speed. We had a very pleasant stay in Gignac . The landscape up in the mountains is breathtaking, and the Mediterranean beaches south of Sète  are fantastic. Not to mention the food and wine, we loaded the car with bottles from the wine cellar of   Dumas Gassac .
We took the return trip in two legs. First to Köln in Germany and then home. We stayed for three nights in Köln, a very nice City, for those of you who have not seen the Cathedral the   Kölner Dom  I can tell you it’s stunningly huge and beautiful. We visited some great art museums contemporary and old/ancient. We spent a fortune on shopping.  The youngsters had one great night on the town, they came back to the hotel at about 05:00. The second leg Köln-Stockholm was done in only fourteen hours. And tomorrow I be back at the office .
On next year's summer vacation I will only have one 4G smartphone and my Kindle. Electronic Ink  screens are superior to LED screens, I’m eagerly waiting for Color Electronic Ink screens in other pads, PC and phones. Today we can replace the PC with pads and smartphones, but for reading books we need better screens than LED, Electronic Inc or similar techniques are the next screen evolution. Something that emulate paper!  

2012-07-10

Business Intelligence and Hardware

Hardware is often neglected in applications design.  In best cases you divide components of an application into separate servers, move the data to a separate SAN and add some extra RAM for performance and that’s it. The server infrastructure of applications is often well suited for transactional systems. Transactional systems do lot of random read/write of small chunks of data and very little processing on those small chunks.Business Intelligence systems on the other hand does not write very much, but reads a lot. Both reads and writes are mostly done in large chunks.  

 

Since the middle of the nineties I have been interested in BI applications hardware infrastructure. My interest for hardware started with the   spintronic  revolution, I realized that hard disks (HDD) and memory RAM was going to be larger, faster and cheaper in future. HDD was the first in the spintronic wave. HDD had been too small, too slow and too expensive to allow for modern BI; with the SATA HDD we had cheap, fast and large HDD. The seek time (find the data to read or write) on these cheap SATA HDD is not impressive, so if you need to do lots of small random read/write it is not a wise choice. So the traditional servers still use expensive and small HDD with good seek time.  These HDD are not very good for BI. BI are more interested in low transfer time (from disc into memory) than low seek time, and the SATA protocol can deliver more data than the discs can spin. More important SATA disks today are large, you can get 3 TB SATA disks for about 200€ and size matters for BI operations.  SATA HDD are better suited for BI than more expensive and smaller HDD  found in servers. Of course we will use Solid State Disks in a few years’ time, but still these disks are too expensive and too small for my liking .

Lots of RAM more than compensate for the slower  SATA disks (compared with SAS and SSD),  today you get 24GB SSD3 for about 300€. It is important to have enough of RAM to keep the active data set in memory to keep physical I/O low. Our BI database server needs 16GB to perform well. First day of the month I have noticed increased response times and some ‘peculiar’ MySQL behavior. I will install more memory and see if that helps. If not I have to find the root cause for the increased response times which most likely is missing or not good enough indexes. I still see recommendation about being careful with adding indexes. For BI systems this is wrong, wrong, wrong. You should sprinkle your database with indexes. All frequent queries should have optimized indexes. For the experienced DBA I recommend   Relational Database Index Design and the Optimizers’ by Tapio Lahdenmaki and Mike Leach . This is serious, heavy and good reading about indexes. But I first try to mend performance problems with hardware, it is simpler to add RAM, than to analyze bottlenecks.

Processors today are so powerful any modern multicore processor will do just fine. I use high quality workstation motherboards. For my last database server I used an ASUS P9X79 DELUXE X79 S-2011 ATX motherboard   for about 300€. It performs beautifully for our Business Intelligence system.

With such components it is an easy task to build high performance BI servers. I build the servers as simple as possible, this means I deliberately build them as single-point-of-failure. Simple server means few things that can crash and my servers are remarkably stable. The only things that have crashed so far are HDD raids and raid controllers (two times in eight years). Today disks are so big so we do not need to raid them anymore. Hardware development goes very fast; last year’s top notch hardware is ready for the scrap heap the next year. Using inexpensive servers give me the luxury to replace them more often. The normal lifetime for a server is three years and the life time cycle is test-production-backup-scrap heap. The database server I try to replace more often.

I’m aware of most experts do not approve of my ideas. But I created a working system after these ideas and it performs beautifully. I have put lots of efforts into my system; thinking, testing, measuring. More important than hardware is the database design. I have completely removed the traditional snowflake or whatever it is called design. The traditional BI database design patterns were conceived when hardware were expensive and disks were small. With today’s cheap hardware the old design patterns are a millstone around the BI system’s neck, we will see new simpler databases with more redundancies as in-memory computing becomes a reality. I should probably post about this, since I am an humble pioneer in this field.  

This is how my ‘tin cans’ look 2012-12-23.

2012-07-08

One thousand visits

Pont de Diable, Hérault river.
When I started blogging  it was just to get firsthand experience of blogging and I decided to write about my professional life only. I have avoided meta blogging, i.e. write about the fact I’m blogging. But now I got 1000 visits on my blog and I think that deserves a post of its own. When I started I really didn’t expect anything I just wrote some posts and that were more or less it. But very early on I became obsessed with the statistics Google served me. Who reads what posts, when and from where. To me one thousand visits  from all over the globe is big. I’m not writing about popular topics and I have not done much to make my blog stand out. And I also realized how hard it is to write good coherent posts, especially in an language which I do not fully master. It is two very different things read and write. I read English more or less fluently, but I do not write nearly as good. My vocabulary is IT and a bit academic and I miss the nuances of words. But I do think my posts are written well enough to be understood. And I find it fun to write these posts. Blogging is a very good experience. I also got one comment for which  I’m very happy, proud and thankful. I also got some comments from people who want to promote commercial software. I have decided not to publish such comments, since I cannot take any responsibility for those products. Free software would be something else though.
I also realized you can use a blog as a documentation system for your software, overviews of software makes good posts. I have referred to some of my posts in my daily work. I can well imagine a good blog is a very effective way for consultants to market themselves. When I hire a consultant first thing I do is to search the internet. What do my own blog tell about me? I have not given that any thoughts, my goal was not to market myself, if that had been the case I would have structured the blog differently and be much more careful with what and how I write. This is also a learning experience, you need to practice blogging to be good at it. The topic I ‘ve chosen for my blog ‘software development’ is not the simplest thing to write about. You need to publish good examples, correct measurements etc. and great attention to details. If I wrote post to market myself I would have put much more efforts to the examples, readability and layout. And there are some posts that not should have been published.
Will I proceed to post here? I think so, but probably less frequent and I do not intend this blog to stay for eternality. I may start other  blogs writing, on other topics, I have other interests beside IT, science, history are things I can imagine myself to write about. People have told me I’m funny in a positive way, in Swedish I can write things people laugh at (in a positive way). It would be a great challenge to try to write humorous posts in English. I have an eye for petty everyday things that go wrong. And I can write about renovating kitchen when everything goes wrong.
I write this post from a small southern French town Gignac just west of Montpellier. I’m not much for traveling these days, my sons persuaded me to come along and we went by car, a bit too long it’s about 2300 km from Stockholm. It is very nice here, not so much tourists. We swim in the river Hérault where the local youngsters jump from the 18 meters high medieval bridge Pont de Diable (the devil's bridge), I saw some guys jumping off a modern 25 meter bridge.  This is dangerous some die, I been told you can get caught in a powerful  undertow, but I suspect the jump in itself can be mortal it looks scary, I would never do it. But I got a friend who would and he would add one or two somersaults just for fun. Today we will do some sightseeing, there is an authentic crusaders village up in the mountains we all like to see. We are a bit sunburnt and should stay away from the sun, so sightseeing is perfect today.
I have noticed it is popular among blogging IT guys to write about local beers  when they travel. This is something I cannot do. I do not drink beer anymore I suspect I get migraine from it. I do not drink red wine or colored liquor either.  It’s the same thing migraine. I only drink Gin, Vodka, white wine including Champagne. It is a tough life I live.

2012-07-02

PHP, CURL and Murphy

I have written a nice automatic PHP Curl procedure to scrape currency rates from an internal  web page. It works beautifully, and has done so since a year. Now I’m on a road trip to southern France , and this morning I got an emergency alert from the routine ‘URL inaccessible’. WTF has happened? It took me some minutes to realize the web server is not responding. I missed this in the design, what to do if this web server is down. It is a serious error in financial reporting not to have valid currency rates, so the entire reporting system/Business Intelligence is stopped! But I didn’t worried too much, I alerted the operations and hoped they could restart this server in minutes. I called the operations early morning from Wiesbaden in Germany they restarted the web server and I got a new error msg from my routine ‘the URL cannot be found’. I misinterpreted the error msg. I still thought there was a problem with the server. I find it hard to do debugging in 200 km/h, just being able to connect over a 3G modem on Autobahn is amazing. I think I have a record in speed-debugging. Later in an hotel room I found out the URL in the webpage was changed and the structure of the webpage was also changed. I realized I would not be able to fix this on my vacation. I discussed the situation with the only remaining colleague at the office (this is vacation period in Sweden). We decided the best thing to do was copy the previous month currency rates, and correct the figure after my vacation.
If I had done a proper risk analysis up front this situation would probably not have happened. But if you always do everything according to the book, not much is done. It is a fine balance how much effort should be spent on design up front, if the resources are scarce you have to compromise. And to be honest I seldom overdo the design work, I often start the build phase a bit premature, most of the time it works well but this time it came back and bit me, at the most inconvenient time. Murphy’s law applies as usual.  I will redesign the system so it always works with the most appropriate currency rates entered (the latest), and some self-correcting mechanism when the correct rates are entered. It shouldn’t be that hard to do.
I end this with the Curl PHP code. This is my first and only piece of Curl code, so I have used examples found on the web, some of the code I do not understand well myself,
but it works beautifully when the web server is up, and when it’s down! I got an alert telling me the servers down, what more can you expect.
#!/home/tooljn/PHP5.3/usr/local/bin/php
<?php
/**
* @package dummy
* @ignore
*/
/**
* This code scrapes
a webpage and downloads an excel file linked from the webpage
* //$link = $webhost.'/'.'Rates)';
* @return bool TRUE
if all is hunky dory
*/
$scriptName = $job['pgm'];
$scriptParm = $job['parm'];
$curlAgent = $context['_xlate']['C_application'];
$eop =$job['eop'][0]['value'];
if (!$eop) $eop=15;
//default
$period =date('Y-m',mktime(0, 0, 0, date("m"), date("d")-$eop,
date("Y")));
$webhost =$job['webhost'][0]['value'];
$weburl =$job['weburl'][0]['value'];
$log->enter('Info',"Enter
script=$scriptName, job=$jobName, Curl agent=$curlAgent, period=$period");
$link =$webhost.'/'.$weburl;
$result =get_web_page($link,$curlAgent);
file_put_contents('WEBPAGE.txt',$result);
$excelname ='EXC'.$period.' Published';
$link =get_excel_link($excelname,$result,$period);
if ($link == NULL) {
   $log->enter('Error',"No
currency rate Excel link for period=$period, end of period=$eop");
                $_RESULT = FALSE;
                return FALSE;
}
$link =$webhost.'/'.$link;
$sufix =substr($link,strrpos($link,'.'));
$excelfilename ='excel'.$sufix;
get_web_file($link,$excelfilename,$curlAgent);
file_put_contents('EXCELFILE',serialize(array(
                array('EXCEL' => $excelname, 'WORKBOOK' => $excelfilename,'PERIOD'=> $period, 'LINK'=>$link ))));
$_RESULT = TRUE;
// indicate successful execution
return(TRUE);
/**
* This function
extracts the latest link to monthly Exchange rates Excel workbook
*/
function
get_excel_link($excelname,$htmltxt,$period){
                $linktab = get_link_tab($htmltxt);
                return $linktab["$excelname"];
}
/**
* This function
extracts html href and creates an associative array Excelname => Url
*/
function
get_link_tab($htmltxt){
                $exceltab = array();
                $regexp2 = "alt=\"[^\"]*";
                $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
   if(preg_match_all("/$regexp/siU",
$htmltxt, $matches, PREG_SET_ORDER)) {
            foreach($matches
as $match) {
                            preg_match("/$regexp2/",
$match[3], $matches2);
                            //
$match[2] = link address
                            //
$match[3] = link text
                            $excelna
= explode('.',(substr($matches2[0],5)));
                            $exceltab[$excelna[0]]
= $match[2];
            }
                }
//
var_dump($exceltab);
                return $exceltab;
}
/**
* Get a web file
(HTML, XHTML, XML, image, etc.) from a URL.  Return an
* array containing
the HTTP server response header fields and content.
*/
function
get_web_file($url,$fileName,$curlAgent){
//print "get_web_file $url,$fileName,$curlAgent \n";
                $fp = fopen ("$fileName", 'w+');
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_TIMEOUT, 50);
                curl_setopt($ch, CURLOPT_FILE, $fp);
                curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// curl_setopt($ch,CURLOPT_USERAGENT, "$curlAgent");
                curl_exec($ch);
                curl_close($ch);
                fclose($fp);
}
/**
* Get a web page from
a URL.  Return an array with the contents of the array
*/
function
get_web_page($url,$curlAgent){
                $options = array(
                CURLOPT_RETURNTRANSFER
=> true,     // return web page
                CURLOPT_HEADER
                   => false,              // don't return headers
                CURLOPT_FOLLOWLOCATION
=> true,                //
follow redirects
                CURLOPT_ENCODING
               =>
"",                   // handle all encodings
                CURLOPT_USERAGENT
            =>
"$curlAgent", // who am i
                CURLOPT_AUTOREFERER          =>
true,                //
set referer on redirect
                CURLOPT_CONNECTTIMEOUT
=> 120,                //
timeout on connect
                CURLOPT_TIMEOUT
   =>
120,                //
timeout on response
                CURLOPT_MAXREDIRS
             =>
10,                  //
stop after 10 redirects
            CURLOPT_POSTFIELDS
            =>
"",
            CURLOPT_VERBOSE
                        => false,
            );
                $ch           = curl_init($url);
                curl_setopt_array($ch, $options);
                $content = curl_exec($ch);
                $err          = curl_errno($ch);
                $errmsg  = curl_error($ch);
                $header  = curl_getinfo($ch);
                curl_close( $ch );
                $header['errno']   = $err;
                $header['errmsg']  = $errmsg;
                $header['content'] = $content;
//var_dump($header);
                return $content;
}