Friday 23 October 2015

Best Practices in SharePoint Client Side Object Model

The following best practices should be followed when coding against the SharePoint Client-Side Object Model (CSOM):

1.) Only Request What You Want (But request everything you want in one go!)

You must explicitly request every property of every object you want. This is fundamental to CSOM programming - after all, it's designed to be used across a network. If you only need a user's Title and LoginName, then only request those:
var spUser = clientContext.Web.CurrentUser;
clientContext.Load(spUser, user => user.Title, user => user.LoginName);
However, if later in your code you sometimes need to send the user an email, then add their email address to the earlier request, as shown below. Don't go back to the server twice! The cost of always requesting an additional property is minuscule compared with going all the way to the server and back an extra time later on.
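For example, a minimal sketch (reusing the clientContext from above) that folds the Email property into the same request:
var spUser = clientContext.Web.CurrentUser;
// Three properties, but still only one round trip
clientContext.Load(spUser, user => user.Title, user => user.LoginName, user => user.Email);
clientContext.ExecuteQuery();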

We should minimize the number of round trips we make to the SharePoint server.

2.) Call ExecuteQuery Sparingly

Again, an obvious one, but there are scenarios where you'll call ExecuteQuery when you don't need to. In case you didn't know, ExecuteQuery is the method that sends all of your pending requests to the server in a single batch - it's the network round trip, so it's slow!
Take a look at this scenario. If you're creating a list, you might think you need to write code as follows:
List list = web.Lists.Add(...);
ctx.ExecuteQuery(); //Create the list
ctx.Load(list, l => l.DefaultViewUrl); // Request the new list's URL
ctx.ExecuteQuery(); // Get the new list's DefaultViewUrl
In fact, you don't need that first ExecuteQuery. It's not intuitive, but you can create the list, get its URL, and submit both requests in one go:
List list = web.Lists.Add(...);
ctx.Load(list, l => l.DefaultViewUrl);
ctx.ExecuteQuery(); // Get the new list's DefaultViewUrl
A slightly more convoluted example involves a scenario where you are indirectly calling some CSOM code, but it's behind an interface and you may be calling it multiple times. How can you protect against an ExecuteQuery call every time this interface method is called? For example:
public interface IData { }
public class MyDataClass : IData
{
   public MyDataClass(ListItem item) { /* wrap the retrieved ListItem */ }
}

public interface IDataRetriever
{
   IData GetData(string id);
}

public class SPDataRetriever : IDataRetriever
{
   private readonly ClientContext _ctx;  // supplied when the class is constructed
   private readonly List _list;          // the list being queried, also set at construction

   public IData GetData(string id)
   {
       //Make whatever CSOM requests you need
       ListItem li = _list.GetItemById(int.Parse(id));
       _ctx.Load(li);       // without Load, ExecuteQuery won't bring back the item's data
       _ctx.ExecuteQuery();
       return new MyDataClass(li);
   }
}
In our scenario, it's being consumed as follows:
var data = ids.Select(id => dataRetriever.GetData(id)).ToList();
Clearly, this is very inefficient since ExecuteQuery is called for every item in the enumerable. Let's refactor the code to remove the ExecuteQuery call:
public interface IDataRetriever
{
   void RequestData(string id);
   IEnumerable<IData> GetAvailableData();
}

public class SPDataRetriever : IDataRetriever
{
   private readonly ClientContext _ctx;
   private readonly List _list;
   private Queue<ListItem> _queue = new Queue<ListItem>();

   public SPDataRetriever(ClientContext ctx)
   {
       _ctx = ctx;
       _list = ctx.Web.Lists.GetByTitle("MyList"); // list title assumed for illustration
   }

   public void RequestData(string id)
   {
       //Make whatever CSOM requests you need
       ListItem li = _list.GetItemById(int.Parse(id));
       _ctx.Load(li); // queued - nothing is sent until ExecuteQuery is called
       _queue.Enqueue(li);
   }

   public IEnumerable<IData> GetAvailableData()
   {
       var result = _queue.Select(li => new MyDataClass(li)).ToArray();

       _queue.Clear();
       return result;
   }
}
You can see it's now split into two methods: RequestData, which "enqueues" requests ready to be sent to the server - and GetAvailableData, which returns data assuming that ExecuteQuery has now been called.
We consume this code as follows:
IDataRetriever dataRetriever1 = new SPDataRetriever(ctx);
IDataRetriever dataRetriever2 = new SPDataRetriever(ctx);

foreach (string id in new[] { "1", "2" })
   dataRetriever1.RequestData(id);

foreach (string id in new[] { "1", "2" })
   dataRetriever2.RequestData(id);

ctx.ExecuteQuery(); //Single call to execute query

IEnumerable<IData> allData = dataRetriever1.GetAvailableData().Concat(dataRetriever2.GetAvailableData());
This is an example of the kind of creative approach you can take to prevent performance issues by minimising the number of times you call ExecuteQuery.

3.) Caching Data in the Session

If every page in your app is requesting the same data from SharePoint, then you can store it temporarily in the user session cache. This will prevent you having to make a round trip to the SharePoint server on every page request. Additionally, since it's in the user session cache, it is scoped to each user individually. If you want to cache application-wide data, you can store it in the Application cache. See this MSDN article for more information.
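By way of illustration, here's a rough sketch of application-wide caching - GetSiteTitle is a hypothetical helper that performs the actual CSOM call:
public string GetSiteTitleWithApplicationCaching(HttpContextBase httpContext)
{
    string key = "SiteTitle";
    var cached = httpContext.Application[key];
    if (cached != null)
    {
        return (string)cached;
    }

    string result = GetSiteTitle(httpContext); // hypothetical method that goes to SharePoint
    httpContext.Application[key] = result;     // now shared across all users of the app
    return result;
}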
So, let's assume a scenario where each page checks whether or not the user is allowed access to it based on whether they are a site administrator:
public bool CheckPrivileges(HttpContextBase httpContext)
{
    var spContext = SharePointContextProvider.Current.GetSharePointContext(httpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        var currentUser = clientContext.Web.CurrentUser;
        clientContext.Load(currentUser, u => u.IsSiteAdmin);
        clientContext.ExecuteQuery();

        return currentUser.IsSiteAdmin;
    }
}
We can simply wrap that CheckPrivileges method in another that performs session caching:
public bool CheckPrivilegesWithSessionCaching(HttpContextBase httpContext)
{
    string key = "IsSiteAdmin";
    if (httpContext.Session[key] != null)
    {
        return (bool)httpContext.Session[key];
    }

    bool result = CheckPrivileges(httpContext);
    httpContext.Session[key] = result;
    return result;
}
Note that if you're storing large amounts of data, this solution won't scale well (since it is in-memory caching) - you could store cached data in your database instead.
Additionally, you cannot assume that data will still be available in the session cache - it could be cleared by ASP.NET at any time, or, due to load balancing, it could be cached differently on separate servers. As long as you go back to SharePoint to retrieve the data when necessary, this shouldn't be a problem.

4.) Use CAML Query to Filter Retrieval of List Items

If you're retrieving items from a list, it is tempting to retrieve all of them and then filter them in your code. However, you can use CAML queries to perform the filtering server side. It can be a little awkward (you're writing XML in strings), but it's worth getting right for the potential speed increase you'll see, especially on large lists.
For example, this is the lazy way to get list items:
CamlQuery query = CamlQuery.CreateAllItemsQuery();
var items = list.GetItems(query);
And here's a formatted CAML query with a where clause:
CamlQuery query = new CamlQuery() 
{ 
    ViewXml = string.Format("<View><Query><Where><Eq><FieldRef Name='{0}' /><Value Type='String'>{1}</Value></Eq></Where></Query></View>", 
                        "FirstName", "Eric") 
};
 
var items = list.GetItems(query);
Note the outer 'View' tag, which is required when querying with CSOM, unlike in the server object model.
Here's another trick – you can actually get all folders, subfolders and/or files from a document library, in a single query, by specifying RecursiveAll:
CamlQuery allFoldersCamlQuery = new CamlQuery()
{
    ViewXml = "<View Scope='RecursiveAll'>"
                        + "<Query>"
                        + "<Where>"
                        + "<Eq><FieldRef Name='FSObjType' /><Value Type='Integer'>1</Value></Eq>"
                        + "</Where>"
                        + "</Query>"
                    + "</View>"
};
In the above query, Scope is set to RecursiveAll. Also, I'm setting the field FSObjType=1 - this means that only folders are returned. If you want only items, set FSObjType=0. If you want both files and folders, omit it entirely.
You can actually go even further - retrieving items from multiple lists by enumerating through the lists and running a CAML query on each. The important thing is that you only call ExecuteQuery once, at the end.
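Here's a minimal sketch of that pattern (the list titles are made up, and the filtered CAML query from above is reused):
var listTitles = new[] { "List A", "List B" }; // example titles only
var allItems = new List<ListItemCollection>();

foreach (string title in listTitles)
{
    var currentList = clientContext.Web.Lists.GetByTitle(title);
    var items = currentList.GetItems(query); // the filtered CAML query from earlier
    clientContext.Load(items);
    allItems.Add(items);
}

clientContext.ExecuteQuery(); // one round trip retrieves the items from every list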

5.) Advanced: Parallel and Async Code

If you're using JavaScript or the CSOM library in Silverlight or Windows Phone, then you'll see that you have access to ExecuteQueryAsync. Unfortunately for the rest of us in .NET, there's only an ExecuteQuery method – synchronous. I don't know why.
What if you want to do a database request, or a different web request, or get user input, or something else at the same time as your CSOM request? Wouldn't it be handy to have that ExecuteQueryAsync? Let's see if we can create one:
public static class CSOMExtensions
{
    public static Task ExecuteQueryAsync(this ClientContext clientContext)
    {
        return Task.Factory.StartNew(() =>
        {
            clientContext.ExecuteQuery();
        });
    }
}
Well, that was a little easier than expected!
Note that this code might need a little optimisation for your particular circumstances. For example, using Task.Factory.StartNew doesn't always create a new thread. If you're firing off a lot of these concurrently, you might want to avoid tying up the thread pool. You can read more about Task on MSDN.
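For example, one possible tweak (a sketch only - profile before adopting it) is to hint that the work should run on a dedicated thread rather than a thread-pool thread, by adding a variant to the CSOMExtensions class above:
public static Task ExecuteQueryLongRunningAsync(this ClientContext clientContext)
{
    // TaskCreationOptions.LongRunning suggests a dedicated thread to the scheduler
    return Task.Factory.StartNew(
        () => clientContext.ExecuteQuery(),
        TaskCreationOptions.LongRunning);
}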
Now we can do super cool parallel and async stuff! Check this out:
public async Task<ActionResult> Index()
{
    var spContext = SharePointContextProvider.Current.GetSharePointContext(HttpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        if (clientContext != null)
        {
            var currentUser = clientContext.Web.CurrentUser;
 
            clientContext.Load(currentUser, u => u.Title);
            Task t1 = clientContext.ExecuteQueryAsync();
 
            clientContext.Load(currentUser, u => u.Email);
            Task t2 = clientContext.ExecuteQueryAsync();
 
            await t1.ContinueWith((t) =>
            {
                ViewBag.UserName = currentUser.Title;
            });
 
            await t2.ContinueWith((t) =>
            {
                ViewBag.Email = currentUser.Email;
            });
        }
    }
    return View();
}
OK, so this is a contrived and pointless example because if you'd been paying attention, you'd know we should only call ExecuteQuery once! But let's walk through it:
  1. Firstly, the method signature has changed. This controller method is now async, and it returns a Task. This means we can now use the await keyword within the method.
  2. Within the body, we're loading Title and Email. We're calling ExecuteQueryAsync twice, which starts (and returns) a new Task object each time.
  3. We call ContinueWith on the Task object to run code when it completes – namely, using the property that the CSOM code requested.
  4. We use the await keyword to signify that the code is asynchronous and the Index controller method should depend on the result of this code.
Consider this next example, a controller method called List. It's a bit more sensible: we're retrieving both database data and SharePoint list items concurrently, and combining them into ViewModel objects to return to the client.
public async Task<ActionResult> List()
{
    List<ViewModel> result = new List<ViewModel>();
 
    var spContext = SharePointContextProvider.Current.GetSharePointContext(HttpContext);
    using (var clientContext = spContext.CreateUserClientContextForSPHost())
    {
        //Form the query for ListItems (listTitle and camlQuery are assumed to be defined elsewhere)
        var listItems = clientContext.Web.Lists.GetByTitle(listTitle).GetItems(camlQuery);
        clientContext.Load(listItems); // without this Load, ExecuteQuery won't fetch the items
 
        //Send the queries
        var clientTask = DB.Clients.ToListAsync();
        var spTask = clientContext.ExecuteQueryAsync();
 
        //Wait for both to complete
        await Task.WhenAll(clientTask, spTask);
 
        result = clientTask.Result.Select(c => new ViewModel(listItems)).ToList();
    }
 
    return View(result);
}
In the code above, we've only got a single blocking call instead of two, and we could potentially be doubling the speed of the method.
Well, that's all I've got for now. Clearly the client-side object model is easy to abuse in terms of performance, and it's very easy to suddenly realise that performance has become a problem.

