Today we discuss some more of the .NET APIs from the Framework Class Library, specifically the Task Parallel Library (TPL). We first look at data and task parallelism, and then we discuss some of the potential pitfalls we need to watch out for.
Data parallelism refers to scenarios in which the same operation is performed concurrently on the elements of a source collection. The collection is partitioned so that multiple threads can work on different segments simultaneously, and the loop body is expressed as a lambda:
Parallel.ForEach(sourceCollection, item => Process(item));
There are options available to stop or break loop execution, monitor the state of the loop, maintain and finalize thread-local state, control the degree of concurrency, and so on.
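A minimal sketch of some of these options, using the thread-local overload of Parallel.ForEach together with ParallelOptions (the collection contents and the cap of 4 are arbitrary illustrative choices):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ForEachDemo
{
    static void Main()
    {
        var source = Enumerable.Range(1, 1000).ToArray();

        // Cap the degree of concurrency; 4 is an arbitrary illustrative value.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        long total = 0;
        // Thread-local overload: each worker accumulates a private subtotal,
        // and the localFinally delegate merges it into the shared total once
        // per worker, minimizing synchronization.
        Parallel.ForEach(
            source,
            options,
            () => 0L,                                       // initialize thread-local state
            (item, loopState, subtotal) => subtotal + item, // loop body
            subtotal => Interlocked.Add(ref total, subtotal)); // finalize thread-local state

        Console.WriteLine(total); // prints 500500 (the sum 1..1000)
    }
}
```

The ParallelLoopState parameter in the body delegate is what exposes Stop() and Break() for early termination.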
Task parallelism is based on the concept of a task, which represents an asynchronous operation. A task resembles a thread or a thread-pool work item, but as a higher-level abstraction that does not expose the corresponding OS primitives.
This enables more efficient and more scalable use of system resources.
It also enables more programmatic control than is possible in a thread or work item.
The Parallel.Invoke method enables creating and running tasks implicitly. For example:
Parallel.Invoke(() => DoSomeWork(), () => DoSomeOtherWork());
The number of Task instances created behind the scenes does not necessarily match the number of delegates passed to Invoke.
For greater control over the tasks, we can use methods such as Task.Run to start them explicitly and Task.Wait to wait for their completion.
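A short sketch of this explicit style (the computations inside the tasks are placeholders):

```csharp
using System;
using System.Threading.Tasks;

class TaskDemo
{
    static void Main()
    {
        // Task.Run queues work to the thread pool and returns a Task
        // handle we can observe, wait on, or compose with other tasks.
        Task<int> first = Task.Run(() => 21 + 21);
        Task second = Task.Run(() => Console.WriteLine("doing other work"));

        // Block until both complete; any exceptions are rethrown
        // wrapped in an AggregateException.
        Task.WaitAll(first, second);

        Console.WriteLine(first.Result); // prints 42
    }
}
```

Unlike Parallel.Invoke, this gives us the Task objects themselves, so we can check their status, attach continuations, or cancel them via a CancellationToken.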
The potential pitfalls in data and task parallelism are:
We cannot assume that parallel is always faster. If the parallel loops have few iterations and fast user delegates, they are unlikely to speed up much, and the partitioning overhead may even make them slower.
We should not write to shared memory locations, because doing so introduces race conditions and contention.
We should avoid over-parallelization. Each partition incurs a cost, and there is an additional cost to synchronize the worker threads.
We should avoid calls to non-thread-safe methods, because they can lead to data corruption that may go undetected.
We should also limit calls to thread-safe methods, because even they can involve significant synchronization.
Delegates passed to Parallel.Invoke should not wait on one another, because that stalls the execution of the other delegates and can result in deadlocks when the tasks are inlined onto the currently executing thread.
We should also not assume that the iterations of a parallel loop always execute in parallel.
We should avoid executing parallel tasks on UI threads.
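The shared-memory pitfall above can be made concrete with a small sketch contrasting an unsynchronized counter with an atomic one (the iteration count is an arbitrary illustrative value):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SharedStateDemo
{
    static void Main()
    {
        const int n = 100_000;

        // Unsafe: count++ is a read-modify-write on shared memory, so
        // parallel iterations can interleave and lose updates. The final
        // value may be anything up to n.
        int racyCount = 0;
        Parallel.For(0, n, i => racyCount++);

        // Safe: Interlocked.Increment makes each update atomic, at the
        // cost of synchronization on every iteration -- which is exactly
        // the contention the pitfall warns about.
        int safeCount = 0;
        Parallel.For(0, n, i => Interlocked.Increment(ref safeCount));

        Console.WriteLine($"racy: {racyCount}, safe: {safeCount}");
    }
}
```

Where possible, the thread-local overloads of Parallel.For and Parallel.ForEach are a better fix than per-iteration locking, since they synchronize only once per worker thread.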