SharePoint lists & complex views

How to make SharePoint bugs work to your advantage.

I do not like to trick software into doing what I want, or to create complex solutions, but sometimes there is no other way. This time I had to both trick and be somewhat complex. We hit one SharePoint pain point, the dreaded 5000 limit: no matter what the SP sales guys say, you have a problem when your list goes above 5000 items. I have even come across people arguing that the 5000 limit is a good thing!

We have a list with historical, current and future items. The list has a ValidFrom and a ValidTo date column. Each change to an item closes the current item and creates a new one with a blank ValidTo. Not only does this in itself create many new items, it makes it very complex to create SP “time slot views” on the list, as you must add an OR condition: ValidTo equals blank ([ValidTo]=""). We now have 5000+ current items, and we need to divide them into smaller views so we can access them. In a programming language you would express this with a boolean condition like:
FieldA='X' and ValidFrom <= [Today] and (ValidTo >= [Today] or ValidTo="")

Note the parentheses: they determine the order of evaluation, so you can tell with great precision what the result of the expression will be.
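To make the grouping concrete, here is a minimal sketch of the predicate in Python. The field names and the 'X' value are just the placeholders from the expression above, and I assume "current" means ValidFrom on or before today and ValidTo blank or in the future:

```python
from datetime import date

def is_current(field_a, valid_from, valid_to, today):
    """The view filter as a boolean expression.

    valid_to is None for the open items (blank ValidTo).
    The parentheses around the OR are essential: without them the AND
    would bind the first two terms and the blank-ValidTo branch would
    match items regardless of FieldA and ValidFrom.
    """
    return (field_a == "X"
            and valid_from <= today
            and (valid_to is None or valid_to >= today))

# A closed historical item is not current, an open item is:
print(is_current("X", date(2015, 1, 1), date(2015, 6, 30), date(2016, 8, 1)))  # False
print(is_current("X", date(2016, 1, 1), None, date(2016, 8, 1)))               # True
```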
In user-friendly SharePoint you point and click your view filter like this:

After many choice words, trials and errors I came to the conclusion that the SP view filter evaluates from the top, combining one expression at a time with the next one below. The filter above, which was used in production, is almost correct. This is how it should look:
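The difference between top-down evaluation and the intended grouping is easy to demonstrate with three hypothetical conditions A, B, C; a small Python sketch:

```python
def top_down(a, b, c):
    # SP-style: each filter row combines with the result so far: ((A and B) or C)
    return (a and b) or c

def intended(a, b, c):
    # what the view actually needs: A and (B or C)
    return a and (b or c)

# The two disagree as soon as A is false but C is true,
# e.g. wrong FieldA but blank ValidTo:
print(top_down(False, True, True))   # True  (item wrongly included)
print(intended(False, True, True))   # False
```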

The first thing we realised when we passed 5000 items: you cannot index a column once the list is too big, and filter fields must be indexed if you have more than 5000 items in a list. I bet your SP sales representative didn’t tell you that. We deleted items, indexed the fields and then started to apply filters, but however much we tried we could not make any filter work 100% correctly. I suspect there is a bug in SP that shows when you have very many items in a list (5000+), but I’m not sure; I may have missed something when I tested the filters. Anyway, the needed filters are beyond the obvious. During my failed attempts to create a valid filter I stumbled upon a post describing a clever trick. I had already tried to create a computed boolean field with this formula:

But SP told me I could not use [Today] in a computed field! Why, why in God’s name does anyone implement such a stupid limitation? At this point I started to feel it was impossible to create a working filter on the list. But then I saw this trick: first create a field named Today, then create the computed field with the formula, now referencing the field named Today, and, the stunning finale, remove the field Today. I said to myself this can’t possibly work: first, Today should be a reserved keyword; second, SP should say ‘Today is ambiguous’ when I try to reference it; and finally, SP should refuse to remove the field Today as long as it is used in computed formulas.
But lo and behold, it was possible to create the field Today. However, SP refused to create a computed field in a list with more than 5000 items. #¤%&@ SharePoint! We had to remove the excess items again and then define the computed field:

Here you see the calculated ValidToOK field, based on other columns. Finally I removed the Today field. This worked, and SharePoint still says the field is based on other columns (i.e. SP field, column and field are used interchangeably in SP literature; I suppose nobody cares). SP still thinks ValidToOK is based on other columns, but Today now evaluates to the current date! This view filter not only works correctly, it is clear and simple to understand:
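For reference, a calculated-column formula of roughly this shape is what the trick enables. This is my reconstruction, not the exact formula from the screenshots; the field names follow the post:

```
=IF(OR(ValidTo="",ValidTo>=Today),"OK","NO")
```

Once the Today column is deleted, Today in the formula resolves as the current date (as described above), and the view filter only needs to test ValidToOK.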

It works on super large lists (5000+ items) too. Big Data, here we come :)
I have lost the link to the post showing the 'Today trick' so I cannot credit the inventor.

The bad thing: I do not know how many SP bugs are exposed and used in this post, but there are quite a few. The really bad thing: the trick can cease to work with the next SP update. I will cross that bridge when I come to it.


The dreaded 5000 limit

Today I hit the dreaded 5000 limit in SharePoint.
This view cannot be displayed because it exceeds the list view threshold (5000 items) enforced by the administrator.

To view items, try selecting another view or creating a new view. If you do not have sufficient permissions to create views for this list, ask your administrator to modify the view so that it conforms to the list view threshold.

Learn about creating views for large lists.

This means you cannot access the list. FULL STOP.

Do not believe the experts when they tell you 5000 items is not a problem, that you can have 32 million items in a list, that you just have to add a view filtering out less than 5000 items. What they do not tell you upfront is that you need an index supporting the filter in place before you hit the 5000 limit. That they tell you once you have hit the ceiling. Filters can only be latched onto indexed fields when the list is larger than 5000 items, and you cannot create an index on a list with more than 5000 items. Catch-22. A black hole of storage. If I can't access the items, I do not care if I can store 32 zillion items in a list.
How can anyone, in the year 2016, have a product on the market that cannot handle more than 5000 items? Calling SharePoint a work in progress is being very diplomatic.
I will certainly publish any comment from a (proud) SharePoint developer or designer defending the 5000 limit. Come on, tell us why 5000 items is a good limit for SharePoint in this day and age of IoT. Right now I could use some soothing words to bring my blood pressure down.

I found the answer myself, this is the reason for the 5000 limit

If you do not care about or understand what you are writing, this is probably a good answer.


Simple as possible but no simpler

The design goal for SharePoint is “make it as simple as possible to create a web site”, a web site for a project or a team, that is. And SharePoint works well for that purpose: most project leaders can, with very little training, create a project site where all documents, conversations, plans, schedules etc. can be shared within the project team. The same goes for a team site; a team site just has more longevity. A simple, isolated web site well integrated with MS Office and Outlook is what you can easily create with SharePoint. And that is great; it is what most teams and projects need: common documents, a calendar etc. Using defaults and out-of-the-box solutions, you can create a simple web site for collaboration in no time. The problem with SharePoint starts when the team/project leader begins to enhance the site with more advanced functionality SharePoint was not designed for. E.g. SharePoint stores data in lists; it is easy, and works well, to extract values from other lists via lookup fields, but any other interaction between lists works less well. If you use values from another list for a pull-down menu, you may see timing problems, since a value in a list is not visible before some cyclic background process has executed. In a real database the values are visible to all the split second you commit your update transaction; in SharePoint you have to wait. When I talk to SharePoint specialists they tell me this can be fixed. When I ask why it is not fixed from the beginning I get the answer “it’s more complicated”.
- “Well can you fix it for me?”
- “No I can’t I’m not a developer.”
This is something I have heard more than once; it seems SharePoint specialists are specialists in tweaking SharePoint, but not programmers. Nothing wrong with that; the problem is you need programming skills for anything more than SharePoint is designed for. And then you need to consult a consultant with SharePoint programming skills. And now SharePoint is not so simple anymore. E.g. when you access a list’s fields in programs you use the internal name, the hidden name a field gets when it is created. When “the SharePoint developer” renames a field, he/she is just changing the “display alias”, not the field name, but you do not see this; you think you renamed the field. And there is no simple way to retrieve a list of a list’s internal names, which makes it unnecessarily awkward to develop SP programs. When SP was designed the internal name must have been regarded as ‘too complex, hide it’. However, in the CSOM SharePoint framework it is not hidden; there you have to use the internal name. The internal name is not the only design flaw. The list is long, and I’m just too inexperienced to name them all, but ISO is another. The US is not only the United States of America, it is America and the world; who has ever heard of ISO? If you want default compliance with e.g. the ISO 8601 date format, you have to change the US default locale to some obscure nationality like ‘Swedish’, not really what you want. It can surely be fixed, but probably in a complicated way (see above). US standards, from a nation that still uses Henry VIII’s thumb to measure distance. Don’t get me wrong, I like the US, especially New England; if I could I would settle down in Massachusetts or New Hampshire. But the US and international standards do not go well together, so the creator of a SharePoint site I know chose the locale ‘Swedish’ and built an ‘advanced’ app, i.e. one using sophisticated features like indexes, unique constraints and custom input validations.
All that worked until an unfortunate script upgraded the template and, by accident, also reverted the locale back to US. Not much worked afterwards. Take field formulas as an example: they depend on the locale, so changing the locale is a fast way to remove all input validations. Basically, you kiss your ‘advanced’ application goodbye by changing the locale.
Changing a SharePoint site’s locale should not be allowed, and it should be simple to print a list of field internal names. Yes, I know you can do the latter: display the ‘all items’ view, make the browser display the page source and look for every ‘realFieldName’. But why would you want to see the real field name? It is just a distraction, only for the techie who likes to complicate simple tasks like creating a web site.
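If you really do want that list of internal names, the page-source trick can be scripted; a small sketch (the attribute name realFieldName is taken from the description above, and the sample page source here is invented for illustration):

```python
import re

# A tiny invented sample of a saved 'all items' page source
page_source = '''
<th class="ms-vh2" realFieldName="Title" displayName="Title"></th>
<th class="ms-vh2" realFieldName="Valid_x0020_From" displayName="Valid From"></th>
'''

# Grab every realFieldName="..." attribute value
internal_names = re.findall(r'realFieldName="([^"]+)"', page_source, re.IGNORECASE)
print(internal_names)  # ['Title', 'Valid_x0020_From']
```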



Recently I got a comment on the post PHP 5.6 SAPRFC & SAPNWRFC asking about PHP 7 and SAPRFC.dll. I had in mind to have a go at PHP 7 and SAPRFC on Linux this summer, but my job got in the way. All of a sudden I became a SharePoint C# backend developer and the unhappy owner of a SharePoint app; this plus my ordinary job as an IT architect, plus giving a BI team a hand now and then, has been too much for me. You need focused time when learning something as complicated as a software platform, a computer language and frameworks. Being interrupted many times every day with other matters, like ‘It seems I am not authorized to update this product classification code, can you fix it now? We need this code for the book closing tomorrow’, made it impossible for me to have a look at SAPRFC and PHP 7 (and UTF-8). My plan B, converting my code to SAPNWRFC by Piers Harding, looks like a better solution. Today I use SAPNWRFC when I need UTF-8 capabilities; using just one SAP connector would be better, but the conversion will also take time I do not have at the moment. I have not had a summer vacation yet; it looks like I can have a short one in October. For those of you who don’t know, Sweden in October is not a good place for outdoor summer activities. October here is nice, but you need more clothes than swimming trunks and you just don’t go swimming, not even in September. It looks like my PHP code will stay on PHP 5.6 until next year. And it is more likely I abandon SAPRFC in favor of SAPNWRFC than that I try to upgrade SAPRFC to support PHP 7.

It would be good if someone at SAP could give advice on how we should connect from PHP. I hope SAP supports those heroes who create and maintain connectors to popular languages like PHP.


The Data Warehouse, the first 15 years

It was 15 years ago today I started up the Data Warehouse, and it has been in style ever since. Whether it was exactly today I do not know, but this year is the 15th anniversary. In its first years it was not much more than some basic PHP scripts and a MySQL database. But the success of the Data Warehouse and the steep growth of data imports made it necessary to restructure the data flow in the Data Warehouse. In 2004 I decided a more formal approach was needed: I wanted to build a workflow engine (a job scheduler) and a supporting language. The new version of the Data Warehouse was ready for production in 2006. In the following years some features were added; since 2009 very little, if any, development has been done. Still the Data Warehouse is alive and kicking remarkably well. There have been attempts to replace it with mainstream products on the market. These attempts have failed, mainly due to the excessive complexity and slowness of the standard products.
Somewhat surprisingly, costs have not been an issue. When I started in 2000 I could not get funds for Business Intelligence activities in the company at all; I had to use scrapped PCs for servers, do the coding myself and use free software. Much has happened since then. Today BI activities are allowed to cost, management has realised the value of good BI, and the big guys are prepared to spend millions on financial analytics. Unfortunately you cannot say the same for what I call operational analytics, i.e. what goes on in the factories: purchasing, logistics and production. But the workshop guys do not care much; they already have my low-cost Data Warehouse. I was told some months ago that in one of the major factories the ERP system is used for registering only; all planning etc. is done in the Data Warehouse plus some surrounding apps. This is also something I have heard lately: use the ERP for registering facts and use the BI system for all other activities such as analysing, planning, forecasting etc. This is not something I predicted; I had always believed the ERP data store and the Data Warehouse would merge together. The reason for the increased use of the Data Warehouse is speed and cost! I’m told development of extra features in standard ERP systems takes forever and the cost is ridiculously high.
Fifteen years is not much for an IT system, and I expect the Data Warehouse will celebrate 20 years five years from now. What’s more remarkable about the Data Warehouse is that it has barely been maintained at all for the last five to seven years, yet it cracks on, now with a new BI crew. Take any of the standard BI systems on the market five to seven years ago: would you like to run any of those with a new, almost untrained crew? I wouldn’t. Of course I’m not happy about everything in the Data Warehouse, but it is a well designed, stable and balanced Data Warehouse. ‘World class’, a senior consultant Business Analyst once told me, ‘it beats everything I have worked with’. All in all I’m very happy with the Data Warehouse.


SharePoint, the OData interface

In my last post I wrote about my agonies when I tried to extract data from a SharePoint list. Finally I put together an OData request that worked, but there was still one thing nagging me: in the request there was a ‘dynamic’ filter:
$filter=Year eq '2016' and Month eq 7

Since I want the last month’s currency rate figures I tried to create a view in SharePoint showing only the last month’s currency rates something similar to an SQL request like:
Create view last_period as
 select R1.* from MY_EXCHANGE_RATE as R1
 join (select year, month from MY_EXCHANGE_RATE
       order by year desc, month desc
       limit 1) as R2
 on R1.year = R2.year and R1.month = R2.month

I knew this was not going to work with SharePoint lists, since it is pretty advanced and SharePoint lists are just lists, plain and simple; these lists do not even qualify as a basic database manager. Of course I was right; for now I’m stuck with my dynamic filter.
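Until something better turns up, the ‘dynamic’ part can at least be computed instead of typed in. A sketch of building the previous-month $filter, assuming (as in the request above) that the Year column is text and the Month column is numeric:

```python
from datetime import date

def last_month_filter(today=None):
    """Build the OData $filter for the previous calendar month."""
    today = today or date.today()
    if today.month > 1:
        year, month = today.year, today.month - 1
    else:
        year, month = today.year - 1, 12  # January rolls back to last December
    return "$filter=Year eq '%d' and Month eq %d" % (year, month)

print(last_month_filter(date(2016, 8, 15)))  # $filter=Year eq '2016' and Month eq 7
```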
I struggled on with my OData interface. In the last post I wrote that the response from the OData call was a shitload of complex XML:

This is just the very first part of the output; I knew even before I tried to parse it that it was going to be a rough ride. This is a good example of why XML has got such a bad reputation: just because there is an option you do not necessarily have to use it, keep it simple, stupid. I decided to go with PowerShell scripting; I have played around with it before and I liked it. My intention with this script was to call SharePoint to extract the monthly currency rates, then parse the response into a text file, and finally deliver the text file via FTP to a target application, in this case my Data Warehouse.
After the struggle of figuring out how to set up the OData request, the first part of my script, calling the SharePoint OData service and extracting the XML response, is straightforward:
#Before we start, where are we?
$cwd = Get-Location
#write-host "$MyInvocation.ScriptName runs in $cwd"

write-host "Starting"
#Set up constants
$currsel     = 'Currency_x0020_Code/currencyName,Currency_x0020_Code/Title,Currency_x0020_Code/currencyNumCode'

$rtesel      = 'Currency_x0020_CodeId,Year,Month,Value%5Fx0020%5FEnd%5Fx0020%5Fof%5Fx0020%5FP'

$spsel       = '$select='+"$currsel,$rtesel"
$spexp       = '$expand=Currency_x0020_Code';
$splist      = 'MonthlyCurrencyRates'
$sproot      = 'https://thecompany/codes/_api/Web/Lists'
$tempfile    = 'ftp.txt';

$FTPserver   = "ftp://thedatawarehouse/"
$FTPmap      = "ftpload/indw/"
$FTPfileName = "monthlyCurrencyRates.txt"

$Username    = 'xxx'
$Password    = 'xxx'
$FTPuser     = 'xxx'
$FTPpw       = 'xxx'  

#Load the webinterface and fix credentials
#The credentials are supposed to be a certificate but it is unclear to me how to use certificates
# in PowerShell scripting at this time. For now I use userid & password
$wc = New-Object system.Net.WebClient
$credentials = New-Object System.Net.NetworkCredential($Username, $Password)
$wc.Credentials = $credentials

#Get Year and month for the web service
Get-Date -format "yyyy-MM-dd HH:mm"; # MM = month; lowercase mm would be minutes
$defaultValue = (get-date).year;
($defaultValue,(Read-Host "Press enter to accept [$($defaultValue)]")) -match '\S' |% {$yr = $_}
$defaultValue = (get-date).month;
($defaultValue,(Read-Host "Press enter to accept [$($defaultValue)]")) -match '\S' |% {$mo = $_}
write-host "Fetching currency rates for $yr - $mo"

#Set up the parameters for the web service
$uri = "$sproot/GetByTitle('$splist')/Items?" + "$spsel&$spexp" + '&$filter=' + "Year eq '" + $yr + "' and Month eq " + $mo;

write-host "Call SharePoint via the websevice"
$xml = [xml]$wc.downloadString("$uri");

I got substantial help setting up the authentication to SharePoint from a colleague; apart from that, the first part was a walk in the park.
The next thing, parsing the horrendous XML output, proved to be trickier. First I tried a number of googled solutions that promised to deal with namespaces and other ‘fancy stuff’ in the OData XML; nothing worked. After many failed attempts I decided to try the built-in XPath capabilities of PowerShell. It turned out to be a very nice experience: simple, straight to the point and pragmatic, except for one little detail that took me more than a day to figure out. In the overly verbose response some of the columns were declared with both attributes and text/data:
 <d:Month m:type="Edm.Double">2</d:Month>

Whatever I tried I could not grab the text part, I only got the attribute. After many hours and many, many foul words, I decided to step back and think. Over the years I have learned not to mix attributes and data in an XML tag; it always comes back to bite you, sooner rather than later. All of a sudden I remembered I had once seen ‘#text’ in an article on XML parsing. Maybe… And yes, it worked: adding an extra ‘#text’ node gave me the Month node text. All of a sudden it was a breeze to skip all the bogus, just point to the data and get it:
#For every currency extract the monthly figures
$op = "Code NumCode ACrpt Year Month val avg_val orig_val orig_avg_value currency name `n" #Header line
foreach($entry in $xml.feed.entry){
   $op += '"'   + $entry.link.inline.entry.content.properties.Title+ '"'
   $op += ',"' + $entry.link.inline.entry.content.properties.currencyNumCode.'#text' + '"'
   $op += ',"' + $entry.link.inline.entry.content.properties.currencyAC_Reporting + '"'
   $op += ',"' + $entry.content.properties.Year + '"'
   $op += ',"' + $entry.content.properties.Month.'#text' + '"'
   $op += ","  + $entry.content.properties.Value_x0020_End_x0020_of_x0020_P.'#text'
   $op += ","  + $entry.content.properties.Average_x0020_Value_x0020_End_x0.'#text'
   $op += ","  + $entry.content.properties.Value
   $op += ","  + $entry.content.properties.Average_x0020_Value
   $op += ',"' + $entry.link.inline.entry.content.properties.currencyName + '"'
   $op += "`n"
}
$op = $op.TrimEnd("`n") #Remove the last new line
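The same attribute-versus-text distinction shows up in any XML library, not just in PowerShell. A Python sketch with a fragment in the shape of the OData response (the namespace URIs are the standard OData v2 ones, to the best of my knowledge):

```python
import xml.etree.ElementTree as ET

fragment = (
    '<properties'
    ' xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"'
    ' xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">'
    '<d:Month m:type="Edm.Double">2</d:Month>'
    '</properties>'
)

ns = {"d": "http://schemas.microsoft.com/ado/2007/08/dataservices"}
month = ET.fromstring(fragment).find("d:Month", ns)
print(month.attrib)  # the m:type attribute, with its namespace URI expanded
print(month.text)    # '2': the part PowerShell exposes as .'#text'
```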

I know the verbose XML produced is a standard XML format, normally Atom feeds, and that with the help of support libraries you should not need to bother about the details, but I could not make any of those libraries work. And for me, parsing the XML would have been easier with something less verbose; the fancy Atom feed I had to deal with was just crappy noise surrounding the data.

Now I wanted to save the data in a file before I FTP it away; of course I used the Out-File command to do that. It worked nicely, except for one little thing: when I FTPed the file it was corrupted when it reached the target FTP server. After many futile attempts I recalled a problem I had a year ago. For some reason Microsoft software insists on writing BOM markers in UTF-8 files, but Microsoft’s software seldom handles these BOM markers well. To get rid of the unwanted BOM marker I replaced the Out-File command with [IO.File]::WriteAllLines:
write-host "Write the result to file ($tempfile)"
$Localfile = "$cwd\$tempfile"
If (Test-Path $Localfile){
  Remove-Item $Localfile
}
[IO.File]::WriteAllLines($Localfile, $op) # Note! No utf-8 BOM marker.
  # Out-File command plus System.Net.FtpWebRequest sent a corrupt file!
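The BOM behaviour is easy to reproduce outside PowerShell too. In this Python sketch the 'utf-8-sig' codec plays the role of Out-File (BOM prepended) and plain 'utf-8' the role of [IO.File]::WriteAllLines with its default encoding (no BOM):

```python
import os
import tempfile

text = "Code NumCode Year Month\nSEK 752 2016 7\n"
path = os.path.join(tempfile.mkdtemp(), "ftp.txt")

# utf-8-sig prepends the three BOM bytes EF BB BF, like Out-File does
with open(path, "w", encoding="utf-8-sig") as f:
    f.write(text)
with open(path, "rb") as f:
    print(f.read(3) == b"\xef\xbb\xbf")  # True: BOM present

# plain utf-8 writes no BOM, like WriteAllLines with its default encoding
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "rb") as f:
    print(f.read(3) == b"\xef\xbb\xbf")  # False: no BOM
```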

The rest, sending the data over the FTP server was easy-peasy:
$FileContent = gc -en byte $Localfile
$FileLength  = $FileContent.Length

#Send the result to the recipient via FTP
$FTPfile = $FTPserver + $FTPmap + $FTPfileName

#Create Request Object
$FTPRequest = [System.Net.FtpWebRequest]::Create("$FTPfile")
$FTPRequest.Method = [System.Net.WebRequestMethods+Ftp]::UploadFile
$FTPRequest.Credentials = new-object System.Net.NetworkCredential($FTPuser, $FTPpw)
$FTPRequest.UseBinary = $FALSE
$FTPRequest.ContentLength = $FileContent.Length
$Run = $FTPRequest.GetRequestStream()

write-host "Send $FileLength bytes from $Localfile to $FTPfile"
$Run.Write($FileContent, 0, $FileContent.Length)

#FTP Cleanup
$Run.Close()

write-host "Done"

Now all I had to do was run the script:

And write a Data Warehouse import job:
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<schedule logmsg='Import currency rates into the Data Warehouse'>

   <job name='getCurrencyRates' input='monthlyCurrencyRates.txt' minrows='2' >
<!-- Bombs out if there is less than two rows in the input file, which means no data (1 header row) -->

 <tag name='RTETABLE' value='MY_EXCHANGE_RATE'/>
   USE test;
   drop table if exists `@RTETABLE`;
   CREATE TABLE if not exists `@RTETABLE` (
   `currcd` char(3) NOT NULL DEFAULT '',
   `currcdn` char(3) NOT NULL DEFAULT '',
   `rpt` char(1) NOT NULL DEFAULT '',
   `year` char(4) NOT NULL DEFAULT '',
   `month` char(2) NOT NULL DEFAULT '',
   `rpt_value` decimal(11,8) DEFAULT NULL,
   `rpt_value_avg` decimal(11,8) DEFAULT NULL,
   `rate` decimal(11,8) DEFAULT NULL,
   `rate_avg` decimal(11,8) DEFAULT NULL,
   `name` char(30) NOT NULL DEFAULT '',
   `periodstart` date NOT NULL,
   PRIMARY KEY (`currcd`,`year`,`month`)
   )  COMMENT='Monthly Exchange Rates';
   set periodstart = DATE(concat(year,'-',month,'-01'))
   </job>
</schedule>

And run it:

Finally look at the glorious result:

The experience of writing a SharePoint OData interface was not entirely unpleasant; PowerShell scripting is fun. Even though I still have not got the hang of it (and .Net), I can produce some pretty advanced stuff, at least compared with what I can do with BAT scripts, and with BASH scripts too, though it is not a fair comparison: PowerShell with .Net is so much more. I cannot say the same about SharePoint; e.g. the lists suck when you try to do anything more than the simplest of tasks with them. SharePoint’s design goal is to do simple tasks in a simple way, but it is marketed as a platform that can do more.
Unfortunately you have to make design decisions and compromises between functionality and user friendliness. The internal field names of SharePoint lists are a good example: the internal names are very important for a developer trying to do something beyond the simplest of tasks, yet as it is now you are encouraged to neglect internal names when you define your data.
“SharePoint is so simple and intuitive anyone can create advanced applications”. This is just a marketing hoax, but more about that some other time; for now I am happy with the working integration.

This is a 'no frills' integration. Next is to make a proper OData client work; I mentioned in this post that I failed to make such clients work, and now is the time to make a serious attempt.