Cluster computing

Saturday, February 15, 2014

What's a good way to scale the concurrent processing of a queue both on the producer and consumer side ? This seems a text book question but think about support on cross platform and high performance. Maybe if we narrow down our question to windows platform, that would help. Anyways, its the number of concurrent workers we can have for a queue. The general rule of thumb was that you could have as many threads as one more than the number of processors to keep everyone busy. And if its a light weight worker without any overhead of TLS storage, we could scale to as many as we want. The virtual workers can use the same pool of physical threads. I'm not talking fibers which don't have the TLS storage as well. Fibers are definitely welcome over OS threads. But I'm looking at a way to parallelize as much as we can in terms of number of concurrent calls on the same system.
In addition we consider the inter worker communication both in a failsafe, reliable manner. OS provides mechanisms for thread completion based on handles returned from the CreateThread and then there's a completion port on windows that could be used with multiple threads. The threads can then close when the port is closed.
Maybe the best way to do this would be to keep a pool of threads and partitioned tasks instead of timeslicing.
Time-slicing or sharing does not really improve concurrent progress if the tasks are partitioned.
Again, it helps if the tasks are partitioned. The .Net task parallel library enables both declarative parallel queries and imperative parallel algorithms. By declarative we mean we can use notations such as 'AsParallel' to indicate we want routines to be executed in parallel. By Imperative we mean we can use partitions, permutations and combinations with linear data.
In general a worker helps with parallelization when it has little or no communication and works on isolated data.
I want to mention diagnostics and boost. Diagnostics on a workers activity to detect hangs or for identifying a worker among a set of workers are enabled with such things as logging or tracing and
identifiers for workers. Call level logging and tracing enable detection of activity by a worker. Between a set of workers, IDs can tell apart a worker from the test. Logging can include this ID to detect the worker with a problem activity.
There can also be a dedicated activity or worker to monitor others.
Boosting a workers performance is in terms of cpu speed and memory. Both are variables that depend on hardware. Memory and caches come very helpful in improving the performance of the same activity by a worker.

Cluster computing

Saturday, February 15, 2014

No comments:

Post a Comment