Creating multiple db connections to get data in Parallel in C# asp.net entity framework

55 views Asked by At

I have a scenario, where in I have to fetch 18000 records from db, and convert it into a DTO. The total number of refIds = 18000 unique ids of the reference table in database.

int batchSize = 1000;
var refBatchList = new List<List<int>>();

for (var i = 0; i < refIds.Count; i += batchSize)
{
    refBatchList.Add(refIds.Skip(i).Take(batchSize).ToList());
}


var dbDataRetrievalTasks =  refBatchList.Select(async refIdsBatch => await GetReferencesAsync(refIdsBatch, new data.Entities()));
var results = await Task.WhenAll(dbDataRetrievalTasks);

The above section of code, creates multiple tasks which gets executed in parallel and calls the below method that retrieves data from db asynchronously.

private async Task<List<reference>> GetReferencesAsync(List<int> ids, JIVE.data.Entities db)
{
    var result = await db.Set<reference>()
        .Where(x => ids.Contains(x.reference_id))
        .AsNoTracking()
        .ToListAsync();
    return result;
}

My question is, Since I am planning to make db calls in parallel, I am using multiple db connections. In these scenario, there would be 18 different db connections as there are 18 batches.

So I am wondering is it okay to retrieve data like this from database. what are the cons that are associated with this method? Is there any better way to achieve this.

I know if there more batches, tasks.whenAll() would create multiple tasks which could throttle the thread pool. But in my scenario the maximum batch that I could get is 20.

1

There are 1 answers

3
Steve Py On

For larger sets of IDs, have a look at AsQueryableValues(set) (https://github.com/yv989c/BlazarTech.QueryableValues) which utilizes OPENJSON behind the scenes. This would allow you to reasonably retrieve the desired 18k rows in one call.

var references = db.Set<reference>()
    .Where(x => db.AsQueryableValues(refIds).Contains(x.reference_id))
    .AsNoTracking()
    .ToListAsync();

This could be improved further in a Read scenario by projecting to to a view model / DTO with just the columns you need from the reference. (For instance if References are a significant # of columns of various sizes but you only need a handful of columns)