Monday, 29 May 2017

Sequential, Multithreaded, Parallel and Asynchronous Programming


In this article, I am going to explain four ways of structuring program execution: sequential, multithreaded, parallel and asynchronous programming.

What is a process? A process is what the operating system uses to execute a program, providing the resources the program requires. Each process has a unique process ID associated with it. You can see the process in which a program is running using Windows Task Manager.

Sequential programming
Sequential programming executes operations consecutively and in order, one after the other. In the real world, any new application you create works sequentially by default.

Example 1: There are 3 tasks, and each task takes 10 seconds to complete. The main thread executes them one after the other, so the application takes 30 seconds to finish.
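
Example 1 can be sketched in code as follows. This is only a minimal illustration: DoTask and RunAll are names I made up, and each task sleeps for 100 ms instead of 10 seconds so that the sketch finishes quickly.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class SequentialDemo
{
    // Hypothetical unit of work; it sleeps to simulate the time the task takes.
    public static void DoTask(int id, int milliseconds)
    {
        Thread.Sleep(milliseconds);
        Console.WriteLine("Task " + id + " finished");
    }

    // Runs three tasks one after the other and returns the elapsed milliseconds.
    public static long RunAll(int millisecondsPerTask)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int id = 1; id <= 3; id++)
        {
            DoTask(id, millisecondsPerTask); // the next task starts only when this one returns
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        // 100 ms per task stands in for the 10 seconds of the example.
        Console.WriteLine("Elapsed: " + RunAll(100) + " ms");
    }
}
```

The elapsed time is roughly the sum of the three task times, because nothing overlaps.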

Example 2: Hamburger restaurant.
The following simple (non-computer-oriented) example explains the difference between the programming models.
Suppose you are starting a hamburger fast food restaurant. In your restaurant, you have only one dish: A hamburger. It is made by strictly following these steps:
  1. When a customer comes in the clerk takes his order (10 sec).
  2. The clerk puts a meatball on the grill and roasts it (30 sec).
  3. The clerk puts the buns in the oven and warms them (30 sec).
  4. The clerk assembles the hamburger and serves it to the customer (10 sec).
In our example we have two critical resources: the grill and the oven. When one of these devices is in use, no one else can use it. The clerk represents our main process (or main thread) and the recipe represents our code. The clerk follows the recipe to the letter, in the same way our code executes on a CPU.

What happens in a single-threaded environment (sequential programming)?
We have one clerk. When he puts the meatball on the grill he has to wait near the grill (doing nothing) until the meatball is fully roasted; then he takes the meatball out and can proceed to the oven. He puts the buns in the oven and again waits until the buns are warm. The entire hamburger-making process takes 80 seconds (10 + 30 + 30 + 10).
This is not bad for a single customer. In fact, the fastest any customer can be served a hamburger is 80 seconds, so when only one customer comes to the restaurant we are doing a perfect job of serving him fast. The problem starts when several customers come in. Consider the case when two customers come in together. The clerk has to follow the recipe for both customers: the first customer gets a hamburger after 80 seconds, while the second customer gets his after 160 seconds (since the clerk can start making the second hamburger only after he is done with the first). Each customer has to wait an additional 80 seconds for every customer standing before him in the line. The next diagram depicts serving one customer by a single clerk.
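
The waiting times above are simple arithmetic, and a small sketch makes the pattern explicit. SingleClerk and FinishTime are hypothetical names used only for this illustration.

```csharp
using System;

static class SingleClerk
{
    // Order (10) + grill (30) + oven (30) + assemble (10), in seconds.
    const int SecondsPerBurger = 10 + 30 + 30 + 10;

    // Returns the second at which the customer at the given position
    // (1 = first in line) receives a hamburger from a single clerk.
    public static int FinishTime(int positionInLine)
    {
        return positionInLine * SecondsPerBurger;
    }

    static void Main()
    {
        for (int position = 1; position <= 3; position++)
        {
            Console.WriteLine("Customer " + position + ": " + FinishTime(position) + " s");
        }
    }
}
```

The waiting time grows linearly with the position in the line, which is exactly the problem the following sections try to solve.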


To solve the above-mentioned problem we usually use a multithreaded model.


Multi-threaded programming
Thread: a thread is an execution path of a program. Each thread defines a unique flow of control.
A thread is often called a lightweight process. A process has at least one thread, commonly called the main thread, which executes the application code. A single process can have multiple threads.

Example:
using System;
using System.Threading;

class ThreadExample
{
    // Entry point of the child thread.
    public static void CallToChildThread()
    {
        Console.WriteLine("Child thread starts");
    }

    static void Main(string[] args)
    {
        ThreadStart childref = new ThreadStart(CallToChildThread);
        Console.WriteLine("In Main: Creating the Child thread");
        Thread childThread = new Thread(childref);
        childThread.Start(); // CallToChildThread now runs on the new thread
        Console.ReadKey();
    }
}

The real concern surrounding threads is not a single sequential thread, but the use of multiple threads in a single program, all running at the same time and performing different tasks. This mechanism is referred to as multithreading. A thread is considered a lightweight process because it runs within the context of a program and takes advantage of the resources allocated to that program.

Foreground Thread:
A managed thread is either a background thread or a foreground thread. A foreground thread keeps executing even if the main thread exits.
Example: a child thread is assigned a function that takes 30 seconds to execute. If the main thread terminates or exits before the 30 seconds are up, the foreground thread still runs to completion.

Background Thread:
A background thread does not keep the process alive: it is stopped once the main thread and all other foreground threads have exited.
The process of the application keeps running as long as at least one foreground thread is running. If more than one foreground thread is running and the Main() method ends, the process stays active until all foreground threads finish their work.
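
The difference can be tried out with the Thread.IsBackground property. The sketch below is only an illustration; BackgroundThreadDemo and CreateWorker are names I made up, and the 3 second sleep stands in for real work.

```csharp
using System;
using System.Threading;

static class BackgroundThreadDemo
{
    // Creates (but does not start) a worker thread.
    // isBackground decides whether the thread keeps the process alive.
    public static Thread CreateWorker(bool isBackground)
    {
        Thread worker = new Thread(() =>
        {
            Thread.Sleep(3000); // simulated work
            Console.WriteLine("Worker finished");
        });
        worker.IsBackground = isBackground;
        return worker;
    }

    static void Main()
    {
        Thread worker = CreateWorker(true);
        worker.Start();
        Console.WriteLine("Main is exiting");
        // With a background worker the process ends as soon as Main returns,
        // so "Worker finished" is normally never printed.
        // Pass false instead and the process stays alive for about 3 seconds.
    }
}
```

Threads created with the Thread class are foreground threads by default, while thread pool threads are background threads.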

Thread safe
In a multithreaded environment, if two or more threads access and modify a shared resource at the same time, we get unwanted behavior (a race condition), so we need to make our code thread safe.
We can achieve thread safety in the following ways:
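Lock
static object _lock = new object();
static int Total = 0; // shared counter updated by several threads
public static void AddOneMillion()
{
    for (int i = 1; i <= 1000000; i++)
    {
        // Only one thread at a time can execute the body of the lock.
        lock (_lock)
        {
            Total++;
        }
    }
}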

Monitor
static object _lock = new object();
static int Total = 0; // shared counter updated by several threads
public static void AddOneMillion()
{
    for (int i = 1; i <= 1000000; i++)
    {
        // Acquires the exclusive lock
        Monitor.Enter(_lock);
        try
        {
            Total++;
        }
        finally
        {
            // Releases the exclusive lock, even if an exception is thrown
            Monitor.Exit(_lock);
        }
    }
}
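
To see the effect of the lock, here is a small self-contained sketch that runs the same kind of loop on two threads at once. CounterDemo, AddMany and RunTwoThreads are hypothetical names used only for this illustration.

```csharp
using System;
using System.Threading;

static class CounterDemo
{
    static readonly object _lock = new object();
    static int _total;

    // Increments the shared counter 'count' times, one locked step at a time.
    static void AddMany(int count)
    {
        for (int i = 0; i < count; i++)
        {
            lock (_lock)
            {
                _total++;
            }
        }
    }

    // Runs AddMany on two threads in parallel and returns the final total.
    public static int RunTwoThreads(int countPerThread)
    {
        _total = 0;
        Thread t1 = new Thread(() => AddMany(countPerThread));
        Thread t2 = new Thread(() => AddMany(countPerThread));
        t1.Start();
        t2.Start();
        t1.Join(); // wait for both threads before reading the result
        t2.Join();
        return _total;
    }

    static void Main()
    {
        Console.WriteLine(RunTwoThreads(1000000)); // prints 2000000
    }
}
```

If you remove the lock, the printed total is usually lower than 2000000, because the two threads overwrite each other's updates. For a single counter like this one, Interlocked.Increment is a lighter alternative to a lock.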

Deadlock
A deadlock can occur in the following scenario. Let's say we have 2 threads
a) Thread 1
b) Thread 2
and 2 resources
a) Resource 1
b) Resource 2

Thread 1 has already acquired a lock on Resource 1 and wants to acquire a lock on Resource 2. At the same time, Thread 2 has already acquired a lock on Resource 2 and wants to acquire a lock on Resource 1. Neither thread ever gives up its lock, hence a deadlock.
There are several techniques to avoid and resolve deadlocks:
1. Acquiring locks in a specific, defined order
2. Mutex class
A Mutex object allows only one thread into a controlled section, forcing other threads that attempt to gain access to that section to wait until the first thread has exited it.

3. Monitor.TryEnter() method, which gives up after a timeout instead of waiting forever
4. Semaphore
The Semaphore class works similarly to the Monitor and Mutex classes but lets you set a limit on how many threads have access to a critical section at once. It's often described as a nightclub (the semaphore) where the visitors (threads) stand in a queue outside, waiting for someone to leave in order to gain entrance.
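
As an illustration of technique 3, the sketch below tries to take two locks with Monitor.TryEnter and a timeout, and backs off if either lock is unavailable instead of waiting forever. TryEnterDemo and TryUseBoth are names I made up, and the timeout values are arbitrary.

```csharp
using System;
using System.Threading;

static class TryEnterDemo
{
    // Tries to lock both resources. Returns false (holding nothing)
    // if either lock cannot be taken within the timeout.
    public static bool TryUseBoth(object first, object second, int timeoutMs)
    {
        if (!Monitor.TryEnter(first, timeoutMs))
        {
            return false;
        }
        try
        {
            if (!Monitor.TryEnter(second, timeoutMs))
            {
                return false; // the finally below still releases 'first'
            }
            try
            {
                // ... work with both resources here ...
                return true;
            }
            finally
            {
                Monitor.Exit(second);
            }
        }
        finally
        {
            Monitor.Exit(first);
        }
    }

    static void Main()
    {
        object resource1 = new object();
        object resource2 = new object();
        // True when no other thread holds either lock.
        Console.WriteLine(TryUseBoth(resource1, resource2, 100));
    }
}
```

A caller that gets false back can retry later. Because a thread that fails releases everything it holds, the circular wait described above cannot last.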

Thread Pooling
Thread pooling is the process of creating a collection of threads during the initialization of a multithreaded application, and then reusing those threads for new tasks as and when required, instead of creating new threads.
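
In .NET you normally do not build such a pool yourself; the built-in ThreadPool class can be used instead. Here is a minimal sketch, in which ThreadPoolDemo and RunWorkItems are hypothetical names for illustration.

```csharp
using System;
using System.Threading;

static class ThreadPoolDemo
{
    // Queues 'count' small work items to the shared thread pool,
    // waits until all of them have run, and returns how many completed.
    public static int RunWorkItems(int count)
    {
        int completed = 0;
        CountdownEvent done = new CountdownEvent(count);
        for (int i = 0; i < count; i++)
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                Interlocked.Increment(ref completed); // thread-safe increment
                done.Signal();                        // one work item finished
            });
        }
        done.Wait(); // blocks until Signal has been called 'count' times
        return completed;
    }

    static void Main()
    {
        Console.WriteLine(RunWorkItems(10)); // prints 10
    }
}
```

The ten work items are served by a handful of reused pool threads, not by ten newly created threads.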

Example 1:
(Continuing the hamburger restaurant example above.) In the multithreaded model we simply add another clerk. In this case, when two customers come into the restaurant, each customer is served by a different clerk. At first glance you might think that each customer now gets a hamburger after 80 seconds, but this is not the case. Remember the critical resources? Both clerks take the order at the same time, but only one of the clerks can use the grill (while the other waits for the grill to be available). The diagram for two clerks serving two customers looks like this:

The first customer still gets the hamburger after 80 seconds but the second customer gets his after 110 seconds, 50 seconds faster than in the single threaded case. From this example we can deduce two things:
  1. If more than two customers come we need more clerks. Clearly, there is a limit to the number of clerks we can have (and that limit is probably not very high). More clerks mean a crowded kitchen, and at some point they start bumping into each other and even taking the wrong meatballs from the grill. This happens in software too. The more threads you have, the more time your computer spends on switching between them and the more time you spend on debugging various race conditions.
  2. No matter how many clerks we hire we still have only one grill and one oven so at some point adding clerks will have no added value (in fact it will just cause more problems, see item 1). This happens in software too. So what do we do? We buy more grills… errr… RAM, CPUs, storage, bandwidth etc... Our handling capacity is directly linked to the hardware upgrades we are making.
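
The 80 and 110 second figures can be reproduced with a small calculation. The sketch below assumes one clerk per customer, a single grill and a single oven, and customers who use the devices in arrival order. KitchenSchedule and ServeTimes are names I made up for the illustration.

```csharp
using System;

static class KitchenSchedule
{
    // Returns the second at which each customer is served when every customer
    // has a dedicated clerk but the kitchen has only one grill and one oven.
    public static int[] ServeTimes(int customers)
    {
        int grillFreeAt = 0;
        int ovenFreeAt = 0;
        int[] served = new int[customers];
        for (int c = 0; c < customers; c++)
        {
            int t = 10;                        // take the order
            t = Math.Max(t, grillFreeAt) + 30; // wait for the grill, then roast
            grillFreeAt = t;
            t = Math.Max(t, ovenFreeAt) + 30;  // wait for the oven, then warm the buns
            ovenFreeAt = t;
            served[c] = t + 10;                // assemble and serve
        }
        return served;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ServeTimes(2))); // prints 80, 110
    }
}
```

Calling ServeTimes with more customers shows item 2 in numbers: every extra customer finishes 30 seconds after the previous one, no matter how many clerks there are, because the grill and the oven are the bottleneck.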
Advantages of multithreading:
1. To maintain a responsive user interface
2. To make efficient use of processor time while waiting for I/O operations to complete.

Disadvantages of multithreading:
1. On a single-core/processor machine threading can affect performance negatively as there is overhead involved with context-switching.
2. You have to write more lines of code to accomplish the same task.
3. Multithreaded applications are difficult to write, understand, debug and maintain.

Please Note: Only use multithreading when the advantages of doing so outweigh the disadvantages.



Parallel Programming
Parallel programming is not a synonym for multithreaded programming. It covers a wider spectrum and refers to the ability to have multiple tasks going on concurrently. Concurrent does not necessarily mean multithreaded; it can mean grid computing on many machines. In fact, even the term multithreaded is often misunderstood. Threads on a CPU are a bit different from threads managed by the CLR using the thread pool. In that case, threads are more than likely not truly simultaneous but more akin to time slicing, since the number of active threads will usually exceed the number of cores on the machine. Even today, a single core executes only one thread at any one instant.
As .NET developers, we are provided the luxury of a rich toolset that lets our applications give their users the perception that many things are occurring at the same time. This appearance is more important than the deception it implies. An application’s ability to show the user that many things are happening at the same time is often more important than its true measure of speed. In fact, with multithreading it is often the case that the overhead of thread management affects the performance of the code in a negative way.
The issue of processor load is connected to parallel programming in the sense that parallel programming aims to keep all computational elements as busy as possible. But simply keeping the CPU busy does not mean that you are doing parallel programming.
Lastly, parallel programming extends well beyond multithreading and can take place among processes running on the same machine or on different machines.
To take advantage of multiple cores from our software, ultimately threads have to be used. Because of this fact, some developers fall into the trap of equating multithreading with parallelism. That is not accurate.

You can have multithreading on a single core machine, but you can only have parallelism on a multi core machine (or multi-proc, but I treat them the same). The quick test: If on a single core machine you are using threads and it makes perfect sense for your scenario, then you are not "doing parallelism", you are just doing multithreading. If that same code runs on a multi-core machine, any overall speedups that you may observe are accidental – you did not "think parallelism".

Task and Task Parallel Library (TPL)
Microsoft introduced the Task Parallel Library (TPL) in .NET Framework 4. It combines features of Thread (notification) and ThreadPool (automatic management of threads). A Task uses a TaskScheduler, which schedules the work and by default runs it on the ThreadPool. It is the most advanced and managed approach to achieve concurrency and has the following features:
  • Task continuation (ContinueWith())
  • Task<T>, a generic type used for returning results
  • Wait(), which blocks the caller until the task has finished (for example, to wait for a result)
  • The option to run a long-running job on a new thread rather than on the ThreadPool
The Base Class Library (BCL) also provides parallel processing through Parallel.For, Parallel.ForEach and PLINQ, which make concurrency more approachable and easy.
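
As a rough illustration of Parallel.For and PLINQ, the sketch below adds the numbers 1 to n in two ways. ParallelSumDemo and its two methods are hypothetical names, and summing integers is only a stand-in for real CPU-bound work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelSumDemo
{
    // Parallel.For with thread-local subtotals: each worker sums its own
    // share of the range and merges into 'total' only once at the end.
    public static long SumWithParallelFor(int n)
    {
        long total = 0;
        Parallel.For<long>(1, n + 1,
            () => 0L,                                          // initial subtotal per worker
            (i, loopState, subtotal) => subtotal + i,          // accumulate without locking
            subtotal => Interlocked.Add(ref total, subtotal)); // thread-safe merge
        return total;
    }

    // PLINQ: the same sum expressed as a parallel query.
    public static long SumWithPlinq(int n)
    {
        return Enumerable.Range(1, n).AsParallel().Sum(i => (long)i);
    }

    static void Main()
    {
        Console.WriteLine(SumWithParallelFor(1000000)); // prints 500000500000
        Console.WriteLine(SumWithPlinq(1000000));       // prints 500000500000
    }
}
```

In both cases the library decides how to split the range across the available cores; the code only describes the work.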
The Task Parallel Library (TPL) is based on the concept of a task, which represents an asynchronous operation. In some ways, a task resembles a thread or ThreadPool work item, but at a higher level of abstraction. The term task parallelism refers to one or more independent tasks running concurrently. Tasks provide two primary benefits:
  • More efficient and more scalable use of system resources. Behind the scenes, tasks are queued to the ThreadPool, which has been enhanced with algorithms that determine and adjust to the number of threads and that provide load balancing to maximize throughput. This makes tasks relatively lightweight, and you can create many of them to enable fine-grained parallelism.
  • More programmatic control than is possible with a thread or work item. Not only do you get the execution isolation that comes with threading, you get functionality that makes programming threads a lot easier.
Finally, the Task class from the Task Parallel Library offers the best of both worlds. Like the ThreadPool, a task does not create its own OS thread. Instead, tasks are executed by a TaskScheduler; the default scheduler simply runs on the ThreadPool.

To use tasks, you’ll first need to add the following using statement:
using System.Threading.Tasks;
//The most direct way
Task.Factory.StartNew(() => {Console.WriteLine("Hello Task library!"); });
//Using Action
Task task = new Task(new Action(PrintMessage));
task.Start();
//where PrintMessage is a method:
private void PrintMessage()
{
    Console.WriteLine("Hello Task library!");
}
//Using a delegate
Task task = new Task(delegate { PrintMessage(); });
task.Start();

//Lambda and named method
Task task = new Task( () => PrintMessage() );
task.Start();
//Lambda and anonymous method
Task task = new Task( () => { PrintMessage(); } );
task.Start();

//Using Task.Run in .NET4.5
public async Task DoWork()
{
    await Task.Run(() => PrintMessage());
}

//Using Task.FromResult in .NET4.5 to wrap an already computed value in a completed Task
public async Task DoWork()
{
    int res = await Task.FromResult<int>(GetSum(4, 5));  
}
private int GetSum(int a, int b){
    return a + b;
}
You cannot start a task that has already completed. If you need to run the same task you’ll need to initialize it again.
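
A small sketch shows what happens if you try anyway: calling Start() a second time throws an InvalidOperationException. RestartDemo and SecondStartThrows are names I made up for the illustration.

```csharp
using System;
using System.Threading.Tasks;

static class RestartDemo
{
    // Returns true if starting an already completed task is rejected.
    public static bool SecondStartThrows()
    {
        Task task = new Task(() => Console.WriteLine("Hello Task library!"));
        task.Start();
        task.Wait(); // the task has now run to completion

        try
        {
            task.Start(); // not allowed: a task can be started only once
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(SecondStartThrows()); // prints True
    }
}
```

To run the same work again, create a new Task from the same delegate.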

Cooperative cancellation: tasks cooperate to end their life cycle. Tasks support cooperative cancellation if you want to cancel them midway, and it is very simple. In our application, click on the button with the caption "Tasks with Cancellation" to kick off the summation of the first 1 million natural numbers. This time, all ten tasks that we invoke to compute partial sums support cancellation. Here is the signature of that method. Look at the third parameter of the function.


private static long AddNumbersBetweenLimitsWithCancellation(long lowerLimitInclusive, long upperLimitInclusive, CancellationToken token)

Every time, before adding the next number, each function checks the IsCancellationRequested flag of the cancellation token that was passed to it when it started. Since this summation is a long-running task, you might want to cancel it midway. To do so, click on the button with the caption "Cancel Tasks". This sets the IsCancellationRequested property of the token to true, which causes the functions performing partial summation to return immediately instead of continuing any further.
//some code has been removed for brevity.
       //please refer attached source code for complete reference.
       CancellationTokenSource ts = new CancellationTokenSource();
 
        private void btnCancelTasks_Click(object sender, EventArgs e)
        {
            ts.Cancel();
        }
 
        private void btnTasksWithCancellation_Click(object sender, EventArgs e)
        {
            long degreeofParallelism = 10;
            long lowerbound = 0;
            long upperBound = 0;
            List<Task<long>> tasks = new List<Task<long>>();
            long countOfNumbersToBeAddedByOneTask = 100000; //1 lakh
 
            for (int spawnedThreadNumber = 1; spawnedThreadNumber <= degreeofParallelism; spawnedThreadNumber++)
            {
                lowerbound = ++upperBound;
                upperBound = countOfNumbersToBeAddedByOneTask * spawnedThreadNumber;
                //copying the values to be passed to task in local variables to avoid closure variable
                //issue. You can safely ignore this concept for now to avoid a detour. For now you
                //can assume I've done bad programming by creating two new local variables unnecessarily.
                var lowerLimit = lowerbound;
                var upperLimit = upperBound;
 
                tasks.Add(Task.Run(() => AddNumbersBetweenLimitsWithCancellation(lowerLimit, upperLimit, ts.Token)));
 
            }
 
            Task.WhenAll(tasks).ContinueWith(task => CreateFinalSumWithCancellationHandling(tasks));
        }
 
 
        private static long AddNumbersBetweenLimitsWithCancellation(long lowerLimitInclusive, long upperLimitInclusive, CancellationToken token)
        {
            long sumTotal = 0;
            for (long i = lowerLimitInclusive; i <= upperLimitInclusive; i++)
            {
                //deliberately added a sleep statement to emulate a long running task
                //this will give the user a chance to cancel the partial summation tasks in the middle when they are not yet complete.
                Thread.Sleep(1000);
                if (token.IsCancellationRequested)
                {
                    sumTotal = -1;//set some invalid value so that calling function can detect that method was cancelled mid way.
                    break;
                }
                sumTotal += i;
            }
 
            return sumTotal;
        }
 
        private static void CreateFinalSumWithCancellationHandling(List<Task<long>> tasks)
        {
            long grandTotal = 0;
            foreach (var task in tasks)
            {
                if (task.Result < 0)
                {
                    MessageBox.Show("Task was cancelled mid way. Sum operation couldn't complete.");
                    return;
                }
                grandTotal += task.Result;
            }
            var finalValue = tasks.Sum(task => task.Result);
            //Did you require a context switch to UI worker thread here before showing the
            //MessageBox control which is a UI element. What would have you done if you were NOT using TPL.
            MessageBox.Show("Sum is : " + finalValue);
        }
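
The code above signals cancellation by returning -1. A common alternative, shown here only as a sketch and not taken from the sample application, is to call token.ThrowIfCancellationRequested() so that the task itself ends in the Canceled state. CancellationDemo and its methods are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationDemo
{
    // Sums the range, checking the token before every addition.
    static long SumWithCancellation(long lower, long upper, CancellationToken token)
    {
        long sum = 0;
        for (long i = lower; i <= upper; i++)
        {
            token.ThrowIfCancellationRequested(); // throws OperationCanceledException
            sum += i;
        }
        return sum;
    }

    // Runs the sum as a task and reports whether it ended up canceled.
    public static bool WasCanceled(bool cancelBeforeStart)
    {
        using (CancellationTokenSource source = new CancellationTokenSource())
        {
            if (cancelBeforeStart)
            {
                source.Cancel();
            }
            Task<long> task = Task.Run(
                () => SumWithCancellation(1, 1000, source.Token), source.Token);
            try
            {
                task.Wait();
            }
            catch (AggregateException)
            {
                // Wait() wraps the cancellation exception in an AggregateException.
            }
            return task.IsCanceled;
        }
    }

    static void Main()
    {
        Console.WriteLine(WasCanceled(true));  // prints True
        Console.WriteLine(WasCanceled(false)); // prints False
    }
}
```

With this approach the caller checks task.IsCanceled instead of reserving a special return value such as -1.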

Task scheduler:
It is responsible for scheduling your tasks. In essence, it's an abstraction that handles the low-level work of queuing tasks onto threads.
The .NET Framework provides you with two task schedulers. The default task scheduler is based on the .NET Framework 4 thread pool, which provides work-stealing for load balancing, thread injection/retirement for maximum throughput, and overall good performance. It should be sufficient for most scenarios. The other task scheduler executes tasks on the synchronization context of a specified target.
You can use the TaskScheduler.FromCurrentSynchronizationContext method to specify that a task should be scheduled to run on a particular thread. This is useful in frameworks such as Windows Forms and Windows Presentation Foundation where access to user interface objects is often restricted to code that is running on the same thread on which the UI object was created. For more information, see How to: Schedule Work on the User Interface (UI) Thread.

        private readonly TaskScheduler uiContextTaskScheduler;

        #region Constructor and Finalizers
        public MainForm()
        {
            InitializeComponent();
            uiContextTaskScheduler = TaskScheduler.FromCurrentSynchronizationContext();
        }
        #endregion

        private static long AddNumbersBetweenLimits(long lowerLimitInclusive, long upperLimitInclusive)
        {
            long sumTotal = 0;
            for (long i = lowerLimitInclusive; i <= upperLimitInclusive; i++)
            {
                sumTotal += i;
            }

            return sumTotal;
        }
      
        private void btnUpdateUiTpl_Click(object sender, EventArgs e)
        {
            ClearResultLabel();
            long degreeofParallelism = 10;
            long lowerbound = 0;
            long upperBound = 0;
            List<Task<long>> tasks = new List<Task<long>>();
            long countOfNumbersToBeAddedByOneTask = 100000; //1 lakh
            for (int spawnedThreadNumber = 1; spawnedThreadNumber <= degreeofParallelism; spawnedThreadNumber++)
            {
                lowerbound = ++upperBound;
                upperBound = countOfNumbersToBeAddedByOneTask * spawnedThreadNumber;
                //copying the values to be passed to task in local variables to avoid closure variable
                //issue. You can safely ignore this concept for now to avoid a detour. For now you
                //can assume that I've done a bad programming by creating two new local variables unnecessarily.
                var lowerLimit = lowerbound;
                var upperLimit = upperBound;

                tasks.Add(Task.Run(() => AddNumbersBetweenLimits(lowerLimit, upperLimit)));
            }
            Task.WhenAll(tasks).ContinueWith(ContinuationAction, tasks, uiContextTaskScheduler);
        }

        private void ContinuationAction(Task task, object o)
        {
            var partialSumTasks = (List<Task<long>>) o;
            var finalValue = partialSumTasks.Sum(eachtask => eachtask.Result);
            lblTotal.Text = "Sum is : " + finalValue;
        }


Asynchronous programming
Asynchronous programming is a means by which a unit of work runs separately from the main application thread and notifies the calling thread of its completion, failure or progress.
Asynchronous programming means writing code that allows several things to happen at the same time without "blocking", that is, without waiting for other things to complete.
In .NET, asynchronous programming is achieved with the async and await keywords.
Async
This keyword is used to qualify a function as an asynchronous function. In other words, if we specify the async keyword in front of a function, then we can call this function asynchronously. Have a look at the syntax of an asynchronous method.
public async void CallProcess()
{
}
An async method has the following characteristics:
  • An async method must have the async keyword in its method header, before the return type.
  • This modifier doesn’t do anything more than signal that the method contains one or more await expressions.
  • It normally contains one or more await expressions. These expressions represent tasks that can be done asynchronously.
  • It must have one of the following three return types (later C# versions also allow other task-like types such as ValueTask<T>).
    − void: if the calling method just wants the async method to execute, but doesn’t need any further interaction with it. In practice, async void is mainly used for event handlers, because the caller cannot await it or observe its exceptions.
    − Task: if the calling method doesn’t need a return value from the async method, but needs to be able to check on the async method’s state
    − Task<T>: if the calling method is to receive a value of type T back from the call, the return type of the async method must be Task<T>
  • An async method can have any number of formal parameters of any type, but they cannot be out or ref parameters.
  • The name of an async method should end with the suffix Async.
  • Other than methods, lambda expressions and anonymous methods can also be async.
Await
The await expression specifies a task to be done asynchronously. It suspends execution of the enclosing async method until the task completes, without blocking the calling thread.
Syntax
await task
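
Putting async and await together, a minimal sketch could look like the following. AsyncDemo and GetSumAsync are names I made up, and Task.Delay stands in for a real non-blocking operation such as a network call.

```csharp
using System;
using System.Threading.Tasks;

static class AsyncDemo
{
    // Returns Task<int>, so the caller can await the result.
    public static async Task<int> GetSumAsync(int a, int b)
    {
        // Execution of this method is suspended here, but no thread is
        // blocked while the delay is running.
        await Task.Delay(100);
        return a + b;
    }

    static void Main()
    {
        // Blocking on the result is acceptable in a console Main method;
        // in UI or server code you would use "await" here instead.
        int sum = GetSumAsync(4, 5).GetAwaiter().GetResult();
        Console.WriteLine(sum); // prints 9
    }
}
```

The method name ends with Async and the return type is Task<int>, following the characteristics listed above.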

Example: 
(Continuing the restaurant example)
Here comes the asynchronous model.
In the asynchronous model, we make a small change to our kitchen equipment. Instead of hiring more clerks we have only the one clerk but he doesn’t wait at the critical resources. Instead, the grill and the oven have an input box and an output box each. When the clerk wants to use the grill, for example, he puts the meatball in the input box and goes away to do anything else. When the meatball is ready (after 30 seconds) it drops into the output box and the grill rings a bell. The clerk can pick up the meatball and continue the recipe. The diagram for serving two customers in this model looks like this:

Note the colors: White – the clerk is actually doing some work, Grey – the clerk is not doing anything, Yellow – the machines are doing the actual work.
We can see that the second customer still gets his hamburger after 110 seconds with only one clerk. We can also see that during these 110 seconds the clerk was occupied for 40 seconds and the rest of the time he was waiting for additional customers to come in (we didn’t have that slack time in the multithreaded model).
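
The asynchronous kitchen can be approximated in code. In the sketch below each resource (clerk, grill, oven) is a SemaphoreSlim that admits one burger at a time, and Task.Delay plays the role of the machines doing the work. All names are hypothetical, the "unit" parameter scales one second of the story down to a few milliseconds, and the measured time is only approximate because it depends on timer accuracy.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class AsyncKitchen
{
    // One clerk, one grill, one oven: each handles one burger at a time.
    static readonly SemaphoreSlim Clerk = new SemaphoreSlim(1, 1);
    static readonly SemaphoreSlim Grill = new SemaphoreSlim(1, 1);
    static readonly SemaphoreSlim Oven = new SemaphoreSlim(1, 1);

    // Occupies a resource for the given time without blocking a thread.
    static async Task UseAsync(SemaphoreSlim resource, int milliseconds)
    {
        await resource.WaitAsync();
        try
        {
            await Task.Delay(milliseconds);
        }
        finally
        {
            resource.Release();
        }
    }

    // 'unit' is the number of milliseconds that represents one second of the story.
    static async Task ServeCustomerAsync(int unit)
    {
        await UseAsync(Clerk, 10 * unit); // clerk takes the order
        await UseAsync(Grill, 30 * unit); // grill works, clerk is free
        await UseAsync(Oven, 30 * unit);  // oven works, clerk is free
        await UseAsync(Clerk, 10 * unit); // clerk assembles and serves
    }

    // Serves two customers who arrive together and returns the elapsed milliseconds.
    public static async Task<long> ServeTwoCustomersAsync(int unit)
    {
        Stopwatch watch = Stopwatch.StartNew();
        await Task.WhenAll(ServeCustomerAsync(unit), ServeCustomerAsync(unit));
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        long elapsed = ServeTwoCustomersAsync(10).GetAwaiter().GetResult();
        Console.WriteLine("Both customers served after about " + elapsed + " ms");
    }
}
```

With unit = 10 the run takes a little over 1100 ms, which corresponds to the 110 seconds of the example, well below the 1600 ms (160 seconds) that serving the two customers one after the other would need.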

Task vs Thread differences in C#

When we execute things on multiple threads, it’s not guaranteed that the threads are separated across multiple processors.
A Task is a lightweight object for managing a parallelizable unit of work. It can be used whenever you want to execute something in parallel. Parallel means the work is spread across multiple processors to maximize computational speed. Tasks are tuned for leveraging multicore processors.
Task provides the following powerful features over Thread:
  1. Tasks use the CLR thread pool internally, so they do not have the overhead associated with creating a dedicated thread using the Thread class. This also reduces the context switching time among multiple threads.
  2. A task can return a result. There is no direct mechanism to return a result from a thread.
  3. You can wait on a set of tasks without a signaling construct.
  4. You can chain tasks together to execute one after the other.
  5. You can establish a parent/child relationship when one task is started from another task.
  6. A child task's exception can propagate to the parent task.
  7. Tasks support cancellation through the use of cancellation tokens.
  8. Asynchronous implementation is easy with tasks, using the async and await keywords.
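
Points 2 and 4 can be seen side by side in a small sketch. TaskVersusThread and its methods are hypothetical names used only for this comparison.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TaskVersusThread
{
    static int Square(int x)
    {
        return x * x;
    }

    // With a thread, the result has to be handed back through shared state.
    public static int SquareOnThread(int x)
    {
        int result = 0;
        Thread thread = new Thread(() => { result = Square(x); });
        thread.Start();
        thread.Join(); // wait for the thread before reading 'result'
        return result;
    }

    // With a task, the result is part of Task<int>, and tasks can be chained.
    public static int SquareThenDoubleOnTask(int x)
    {
        Task<int> squared = Task.Run(() => Square(x));
        Task<int> doubled = squared.ContinueWith(previous => previous.Result * 2);
        return doubled.Result; // blocks until the whole chain has finished
    }

    static void Main()
    {
        Console.WriteLine(SquareOnThread(4));         // prints 16
        Console.WriteLine(SquareThenDoubleOnTask(4)); // prints 32
    }
}
```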

