Tuesday, April 15, 2014

Large Lists, BCS, Excel REST, JSOM, CSVs and Office 365 OH MY!

An interesting challenge came across my desk last week:
"How can we import a CSV into Office 365 and use that data to tag other items within the site.  The CSV currently has ~7600 rows and is expected to grow.  We'll also need to re-import the CSV on an ad-hoc basis when the data changes."
The last part of that was the real issue. There was simply no easy way of doing that. After trying a few things and falling flat on my face, I'm prepared to share my exploration of the different options.

Each option has its own merits and pitfalls. This post examines them in turn and tries to shed some light on the problems I ran into with each.


BCS

Wiring up an ECT on Office 365 can be a little finicky. I initially had some issues because two BDC models had been created for the same ECT. After calling in the eagle-eyed BCS guru, +Fabian Williams, I got squared away.

Immediately after that, I could tell that I was not going to be able to use BCS for what I needed. In Office 365, there is a hard limit of 2000 items that can be retrieved. Ironically, the error message that is displayed points to a cmdlet that is not supported in Office 365.

Office 365 BCS Throttle Error
To add insult to injury, I ran a simple test to find out whether JSOM would give me PagingInfo for a BCS list. Using the code below, you'd expect line 68 to produce a value; instead, it produces nothing at all.

Between the 2000-item limit and the missing PagingInfo, BCS was not going to work, so I had to look for another solution.

Custom List

Using a custom list initially worked great.  I'm able to use JSOM, query the list for 5000 items per trip to the server, AND get PagingInfo.  Using the code below works great for this scenario.
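The paging pattern described here can be sketched roughly as follows. This is a minimal illustration rather than the original snippet: `loadAllItems`, `fetchPage` and `makeFakeList` are hypothetical names, and the fake list stands in for the server so the loop can run outside SharePoint. In real JSOM code, `fetchPage` would presumably wrap an `SP.CamlQuery` whose position is set with `set_listItemCollectionPosition` before calling `executeQueryAsync`, and the next position would come from the returned collection's `get_listItemCollectionPosition`.

```javascript
// Keep requesting pages until the server stops handing back a paging position.
// fetchPage(position, callback) is a hypothetical wrapper around one trip to
// the server; it calls back with (items, nextPosition).
function loadAllItems(fetchPage, onDone) {
  var all = [];
  function next(position) {
    fetchPage(position, function (items, nextPosition) {
      for (var i = 0; i < items.length; i++) {
        all.push(items[i]);
      }
      if (nextPosition) {
        next(nextPosition); // more rows remain, make another trip
      } else {
        onDone(all); // no position returned, so this was the last page
      }
    });
  }
  next(null); // the first request carries no paging position
}

// Stand-in for a SharePoint list: serves `total` fake rows, `pageSize` per trip.
function makeFakeList(total, pageSize) {
  var trips = 0;
  function fetchPage(position, callback) {
    trips += 1;
    var start = position ? position.offset : 0;
    var end = Math.min(start + pageSize, total);
    var items = [];
    for (var i = start; i < end; i++) {
      items.push({ id: i + 1 });
    }
    callback(items, end < total ? { offset: end } : null);
  }
  return { fetchPage: fetchPage, getTrips: function () { return trips; } };
}
```

With 7600 rows and 5000 rows per trip, the loop above needs two trips to the server.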

Importing the Excel file into Office 365 is relatively straightforward and will work for most needs. The file I used had about 7600 rows of data. After importing the file, I noticed the Server Resource Quota was tapped out.

Office 365 Server Utilization
So this approach has two problems: I will not be able to do a mass import of my data again (the list already exists), and the import exhausted the Server Resource Quota points.

Excel REST

This seemed like a cool way of getting around the limitations above, so I dove in to find out whether it would work for my needs. After all, I'm allowed to have a *lot* of rows in Excel and I'll be able to easily update the file, since it's in a document library. Using the code below, I ran into a showstopper though.

There is a hard limit in the API set at 500 rows. Fetching all of the items 500 rows at a time would be painfully slow or, even worse, a user might try to use the form control while the data is still being queried.
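To put a number on "painfully slow", here is a rough sketch of how the workbook would have to be split into requests of 500 rows each. The function names, site URL, library path, sheet name and column letters are all hypothetical, and the URL shape (`_vti_bin/ExcelRest.aspx/.../Model/Ranges('...')` with `|` separating the range corners and `$format=json`) reflects my understanding of the Excel Services REST API, so treat it as an assumption.

```javascript
// Split `totalRows` rows into ranges of at most `chunkSize` rows each.
// The range corners are joined with "|" instead of ":".
function buildRanges(sheet, firstCol, lastCol, totalRows, chunkSize) {
  var ranges = [];
  for (var start = 1; start <= totalRows; start += chunkSize) {
    var end = Math.min(start + chunkSize - 1, totalRows);
    ranges.push(sheet + "!" + firstCol + start + "|" + lastCol + end);
  }
  return ranges;
}

// Build the request URL for one range (assumed URL shape, hypothetical names).
function buildRangeUrl(siteUrl, workbookPath, range) {
  return siteUrl + "/_vti_bin/ExcelRest.aspx/" + workbookPath +
    "/Model/Ranges('" + encodeURIComponent(range) + "')?$format=json";
}

// Example: ~7600 rows in columns A..C of a hypothetical workbook.
var ranges = buildRanges("Sheet1", "A", "C", 7600, 500);
var urls = ranges.map(function (range) {
  return buildRangeUrl("https://contoso.sharepoint.com/sites/demo",
    "Documents/Data.xlsx", range);
});
console.log(ranges.length + " requests needed");
```

For roughly 7600 rows that works out to 16 separate requests before the form control has its full data set.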

Excel REST API - 500 row limit
So that leaves us with our raw data that was exported from SQL and given to us to use.


CSV

Updating the CSV will be easy, since it will be stored in a document library. Now all we need to be able to do is make sense of it. Using the code below, I'm able to parse the CSV and create an array of objects that I need to pass off to another library. Also note the use of localStorage. This is a nice way to cache the data and avoid retrieving and processing it client-side on every page load. If the CSV is updated, simply clear the browser's stored data for the site and you'll get the latest and greatest.
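A minimal sketch of the parse-and-cache idea, not the original snippet. `parseCsv` and `getRows` are hypothetical names, `storage` is anything with `getItem`/`setItem` (in the browser you would pass `window.localStorage`), and `loadText` is a hypothetical function that returns the raw CSV text. It is kept synchronous here for brevity; in a real page the file would be fetched asynchronously.

```javascript
// Parse CSV text into an array of objects keyed by the header row.
// Handles quoted fields containing commas, doubled quotes and line breaks.
function parseCsv(text) {
  var rows = [], row = [], field = "", inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { field += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field); field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") { i++; } // treat CRLF as one break
      row.push(field); field = "";
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  if (field !== "" || row.length > 0) { row.push(field); rows.push(row); }
  var header = rows.shift() || [];
  return rows
    .filter(function (r) { return r.length > 1 || r[0] !== ""; }) // skip blank lines
    .map(function (r) {
      var obj = {};
      header.forEach(function (name, idx) {
        obj[name] = r[idx] !== undefined ? r[idx] : "";
      });
      return obj;
    });
}

// Return the cached rows if present; otherwise load, parse and cache them.
function getRows(storage, key, loadText) {
  var cached = storage.getItem(key);
  if (cached !== null && cached !== undefined) {
    return JSON.parse(cached); // cache hit: no download, no parsing
  }
  var rows = parseCsv(loadText());
  storage.setItem(key, JSON.stringify(rows)); // storage only holds strings
  return rows;
}
```

Removing the cached key (or clearing the site's stored data) forces the next page load to download and parse the CSV again.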


All approaches have their merits and pitfalls... BCS and PagingInfo, I'm looking at you! If the ad-hoc mass import wasn't needed, then list-driven data would have been my choice. Even with that approach, I would still have used localStorage. It makes sense to cache the processed data since it won't change very often. Since my solution works client-side, I have to take into consideration the amount of time it takes to render. I'm getting good performance out of the CSV approach, so I'm going to stick with it for the time being.