Tuesday, April 15, 2014

Large Lists, BCS, Excel REST, JSOM, CSVs and Office 365 OH MY!

An interesting challenge came across my desk last week:
"How can we import a CSV into Office 365 and use that data to tag other items within the site?  The CSV currently has ~7600 rows and is expected to grow.  We'll also need to re-import the CSV on an ad-hoc basis when the data changes."
The last part of that was the real issue.  There was simply no easy way of doing that.  After trying a few things and falling flat on my face, I'm ready to share what I found while exploring the different options.

Each option has its own merits and pitfalls.  This post examines each one and tries to shed some light on the pitfalls I found while using it.

BCS

Wiring up an ECT on Office 365 can be a little finicky. I initially had some issues due to 2 BDC Models that were created for the same ECT.  After calling in the eagle-eyed BCS guru, +Fabian Williams, I got squared away.

Immediately after that, I could tell that I was not going to be able to use BCS for what I needed.  In Office 365, there is a hard limit of 2000 items that can be retrieved.  Ironically, the error message that is displayed points to a cmdlet that is not supported in Office 365.

[Image: Office 365 BCS Throttle Error]

Adding insult to injury, I decided to run a simple test using JSOM.  I wanted to find out whether JSOM would give me PagingInfo with a BCS list.  Using the code below, you'd expect line 68 to produce a value instead of nothing at all.


Since BCS will not work, because of both the 2000-item limit and the missing PagingInfo in the API, I had to look for another solution.

Custom List

Using a custom list initially worked great.  I'm able to use JSOM, query the list for 5000 items per trip to the server, AND get PagingInfo.  Using the code below works great for this scenario.
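
What follows is a minimal sketch of that paging pattern, not the exact code from my project. It assumes sp.js is already loaded on a SharePoint page; the function names, the callback shapes, and the 'Tags' list title in the usage comment are hypothetical. The idea is to cap each round trip with a RowLimit and keep feeding the returned ListItemCollectionPosition back into the next query until it comes back null.

```javascript
// Build the CAML view that caps each round trip at pageSize rows.
function buildViewXml(pageSize) {
  return '<View><RowLimit>' + pageSize + '</RowLimit></View>';
}

// Fetch one page with JSOM. position is null for the first page.
// Only works in a browser where the SP namespace (sp.js) is available.
function fetchListPage(listTitle, pageSize, position, onPage, onError) {
  var ctx = SP.ClientContext.get_current();
  var list = ctx.get_web().get_lists().getByTitle(listTitle);
  var query = new SP.CamlQuery();
  query.set_viewXml(buildViewXml(pageSize));
  query.set_listItemCollectionPosition(position);
  var items = list.getItems(query);
  ctx.load(items);
  ctx.executeQueryAsync(function () {
    var rows = [];
    var e = items.getEnumerator();
    while (e.moveNext()) {
      rows.push(e.get_current().get_fieldValues());
    }
    // The position is null once the last page has been returned.
    onPage(rows, items.get_listItemCollectionPosition());
  }, function (sender, args) {
    onError(args.get_message());
  });
}

// Keep requesting pages until no further position is returned.
// fetchPage(position, onPage, onError) is injected so the loop itself
// does not depend on SharePoint.
function loadAllItems(fetchPage, onDone, onError) {
  var all = [];
  function next(position) {
    fetchPage(position, function (rows, nextPosition) {
      all = all.concat(rows);
      if (nextPosition) {
        next(nextPosition);
      } else {
        onDone(all);
      }
    }, onError);
  }
  next(null);
}

// Usage on a SharePoint page (hypothetical list title 'Tags'):
// loadAllItems(function (pos, onPage, onError) {
//   fetchListPage('Tags', 5000, pos, onPage, onError);
// }, function (rows) { console.log(rows.length); },
//    function (msg) { console.error(msg); });
```

Calling get_pagingInfo() on the returned position gives the PagingInfo string if you want to log or inspect it between trips.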


Importing the Excel file into Office 365 is relatively straightforward and will work for most needs.  The file I used had about 7600 rows of data.  After importing the file, I noticed the Server Resource Quota was tapped out.

[Image: Office 365 Server Utilization]

So this approach has 2 problems: I will not be able to do another mass import of my data (the list already exists), and the Server Resource Quota points are exhausted.

Excel REST

This seemed like a cool way of getting around the limitations above, so I dove in to find out if it would work for my needs.  After all, I'm allowed to have a *lot* of rows in Excel and I'll be able to easily update the file, since it's in a document library.  Using the code below, though, I ran into a showstopper.


There is a hard limit in the API set at 500 rows.  Getting all of the items would be painfully slow or, even worse, a user may try to use the form control while it is still querying for data.
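
To put that limit in perspective, here is a rough back-of-the-envelope sketch. It assumes each request can return at most 500 rows and that the workbook holds the roughly 7600 rows mentioned earlier; rowWindows is a hypothetical helper and does not call the Excel REST API itself.

```javascript
// Split totalRows into 1-based, inclusive row windows of at most maxRows each.
function rowWindows(totalRows, maxRows) {
  var windows = [];
  for (var first = 1; first <= totalRows; first += maxRows) {
    windows.push({
      first: first,
      last: Math.min(first + maxRows - 1, totalRows)
    });
  }
  return windows;
}

var windows = rowWindows(7600, 500);
console.log(windows.length);              // 16
console.log(windows[windows.length - 1]); // { first: 7501, last: 7600 }
```

That is 16 round trips before the form control has its full data set, and the count only grows as the file does.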

[Image: Excel REST API - 500 row limit]

That leaves us with the raw data: the CSV that was exported from SQL and given to us to use.

CSV

Updating the CSV will be easy, since it will be stored in a document library.  Now all we need to do is make sense of it.  Using the code below, I'm able to parse the CSV and create an array of objects that I need to pass off to another library.  Also note the use of localStorage.  This is a nice way to cache the data and avoid retrieving and processing it client-side on every page load.  If the CSV is updated, simply clear the browser's stored site data (which is where localStorage lives) and you'll get the latest and greatest.
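
Here is a minimal sketch of the parse-and-cache idea, not the exact code from my project. It assumes a simple CSV with a header row and no quoted fields or embedded commas; parseCsv, getRows, the cache key, and the loadCsvText callback are all hypothetical names. The storage argument is anything with getItem and setItem, such as window.localStorage.

```javascript
// Parse simple CSV text (header row first) into an array of objects.
// Assumption: no quoted fields, embedded commas, or embedded newlines.
function parseCsv(text) {
  var lines = text.split(/\r?\n/).filter(function (line) {
    return line.trim() !== '';
  });
  if (lines.length === 0) {
    return [];
  }
  var headers = lines[0].split(',').map(function (h) {
    return h.trim();
  });
  return lines.slice(1).map(function (line) {
    var cells = line.split(',');
    var row = {};
    headers.forEach(function (h, i) {
      row[h] = (cells[i] || '').trim();
    });
    return row;
  });
}

// Return cached rows if present; otherwise load, parse, and cache them.
// loadCsvText(callback) is expected to call back with the raw CSV text.
function getRows(storage, cacheKey, loadCsvText, onReady) {
  var cached = storage.getItem(cacheKey);
  if (cached) {
    onReady(JSON.parse(cached));
    return;
  }
  loadCsvText(function (text) {
    var rows = parseCsv(text);
    try {
      storage.setItem(cacheKey, JSON.stringify(rows));
    } catch (e) {
      // Caching is best effort; setItem can throw when the quota is exceeded.
    }
    onReady(rows);
  });
}
```

In the browser, loadCsvText would typically wrap an XMLHttpRequest GET against the CSV's URL in the document library, and storage would be window.localStorage. Removing the cache key forces a fresh download and parse on the next page load.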

Conclusion

All approaches have their merits and pitfalls... BCS and PagingInfo, I'm looking at you!  If the ad-hoc mass import wasn't needed, then using list-driven data would have been my choice.  Even with that approach, I would still have used localStorage.  It makes sense to cache the processed data since it won't change very often.  Since my solution works client-side, I have to take into consideration the amount of time it takes to render.  I'm getting good performance out of the CSV approach, so I'm going to stick with it for the time being.

7 comments:

Unknown said...

That's quite a write up. Excellent blog post, this is something I know I will keep in my tool bag. Thanks.

Unknown said...

Thanks again for your help Fabian. I'm throwing the gauntlet down: Let's get BCS paging to work using SPD only and return more than 2000 items.

Patrick Curran said...

This is an excellent article on lessons learned and the best solution found. This is a great solution and explanation. Nicely done Matthew and thank you for sharing.

Unknown said...

Wow, that is pretty amazing stuff. Thanks for taking the time to write it up and share it with everyone.

I just realized you are the guy I used to see at the .NET user groups in DC too, lol

Unknown said...

@Patrick
If we can get paging to work with BCS in SPD, I'm sure the API will catch up. Once both of those pieces come together, I feel like we'll have a great way to provide solutions moving forward.

@Matthew
Great name! ;) I'm not sure if I remember you, but I'll be on the lookout for you next time I attend. Be sure to say "Hi".

Christophe said...

Great post Matt! Too often we see sample code that works great for 10 items, and breaks as soon as you tackle a real life scenario.

I don't see the end of the story, out of curiosity how are you binding these tags to the target list(s)? Via a simple text column?

Unknown said...

You're exactly right Christophe. I'm using simple text columns for the values. I had to add 4 or 5 of them to my list.

Since BCS and the Custom List option offer the ability to use Projected Fields, I had to imitate that with the CSV approach. So, I added typeahead.js to the form and made the fields that were going to be updated programmatically read-only.

Thanks for taking the time out and reading this over. I'm always impressed by your work. :P