Wednesday, August 5, 2009

Threads

What Is a Thread?
A thread is a semi-process that has its own stack and executes a given piece of code. Unlike a real process, a thread normally shares its memory with other threads, whereas each process usually has its own separate memory area.

Thread Group
A Thread Group is a set of threads all executing inside the same process. They all share the same memory, and thus can access the same global variables, same heap memory, same set of file descriptors, etc. All these threads execute in parallel.
The advantage of using a thread group instead of a normal serial program is that several operations may be carried out in parallel, and thus events can be handled immediately as they arrive (for example, if we have one thread handling a user interface, and another thread handling database queries, we can execute a heavy query requested by the user, and still respond to user input while the query is being executed).
Also, in a thread group, context switching between threads is much faster than context switching between processes in a process group (context switching means that the system switches from running one thread or process to running another). In addition, communication between two threads is usually faster and easier to implement than communication between two processes.

Disadvantages
On the other hand, because threads in a group all use the same memory space, if one of them corrupts the contents of its memory, other threads might suffer as well. With processes, the operating system normally protects processes from one another, and thus if one corrupts its own memory space, other processes won't suffer.
Multithreading also comes with a resource and CPU cost in allocating and switching threads if used excessively. In particular, when heavy disk I/O is involved, it can be faster to have just one or two worker threads performing tasks in sequence than to have a multitude of threads each executing a task at the same time.

Threads in C#
A C# program starts in a single thread created automatically by the CLR and operating system (the "main" thread), and is made multi-threaded by creating additional threads. Here's a simple example:
First import the following namespaces:
using System;
using System.Threading;
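
A minimal sketch of such a program follows. The method name WriteY and the loop bound of 1000 are illustrative assumptions rather than part of the original listing; the loops are bounded only so that the program terminates.

```csharp
using System;
using System.Threading;

public static class ThreadDemo
{
    public static void Main()
    {
        Thread t = new Thread(WriteY);   // t will run WriteY once it is started
        t.Start();

        // Meanwhile, the main thread prints x.
        for (int i = 0; i < 1000; i++) Console.Write("x");

        t.Join();                        // wait for t to end before exiting
        Console.WriteLine();
    }

    // Runs on the new thread and prints y.
    static void WriteY()
    {
        for (int i = 0; i < 1000; i++) Console.Write("y");
    }
}
```

The exact interleaving of x and y differs from run to run, because it depends on how the scheduler allots time to the two threads.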

The main thread creates a new thread t on which it runs a method that repeatedly prints the character y. Simultaneously, the main thread repeatedly prints the character x. A thread has an IsAlive property that returns true after its Start() method has been called, up until the thread ends. A thread, once ended, cannot be re-started.



How Threading Works
Multithreading is managed internally by a thread scheduler, a function the CLR typically delegates to the operating system. A thread scheduler ensures that all active threads are allocated appropriate execution time, and that threads that are waiting or blocked (for instance, on an exclusive lock or on user input) do not consume CPU time.
On a single-processor computer, a thread scheduler performs time-slicing, rapidly switching execution between each of the active threads. This results in "choppy" behavior, such as in the very first example, where each block of a repeating x or y character corresponds to a time-slice allocated to the thread. Under Windows XP, a time-slice is typically in the tens-of-milliseconds region, chosen to be much larger than the CPU overhead of actually switching context between one thread and another (which is typically in the few-microseconds region).
On a multi-processor computer, multithreading is implemented with a mixture of time-slicing and genuine concurrency – where different threads run code simultaneously on different CPUs. It's almost certain there will still be some time-slicing, because of the operating system's need to service its own threads – as well as those of other applications.
A thread is said to be preempted when its execution is interrupted due to an external factor such as time-slicing. In most situations, a thread has no control over when and where it's preempted.

Passing Data to ThreadStart
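The ThreadStart delegate takes no arguments, so it cannot be used to hand data to the new thread directly. For that, the .NET Framework defines another version of the delegate called ParameterizedThreadStart, which accepts a single object argument. Consider the example below: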
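
A minimal sketch of such an example, assuming a method Go that prints a greeting in upper or lower case depending on its argument (the class name, the parameter name upperCase and the greeting text are illustrative):

```csharp
using System;
using System.Threading;

public static class PassingDataDemo
{
    public static void Main()
    {
        Thread t = new Thread(new ParameterizedThreadStart(Go));
        t.Start(true);   // the argument is boxed and handed to Go as an object
        t.Join();
    }

    // ParameterizedThreadStart requires exactly one parameter of type object.
    public static void Go(object upperCase)
    {
        bool upper = (bool)upperCase;   // cast back to the real type
        Console.WriteLine(upper ? "HELLO!" : "hello!");
    }
}
```

Because the parameter is typed as object, the cast inside Go is checked only at run time; passing anything other than a bool would throw an InvalidCastException.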

Here the method Go is called with the argument true.

Naming Threads
A thread can be named via its Name property. This is of great benefit in debugging: as well as letting you Console.WriteLine a thread’s name, it lets Microsoft Visual Studio pick up the name and display it in the Debug Location toolbar. A thread’s name can be set at any time, but only once: subsequent attempts to change it will throw an exception.
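
As a small illustration, the sketch below names a thread before starting it and reads the name back from inside that thread. The class name, the method name Report and the thread name "worker" are assumptions made for the example.

```csharp
using System;
using System.Threading;

public static class NamingDemo
{
    public static void Main()
    {
        Thread worker = new Thread(Report);
        worker.Name = "worker";   // name the thread before starting it
        worker.Start();
        worker.Join();
    }

    // Prints the name of whichever thread runs this method.
    static void Report()
    {
        Console.WriteLine("Hello from " + Thread.CurrentThread.Name);
    }
}
```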

Foreground and Background Threads
By default, threads are foreground threads, meaning they keep the application alive for as long as any one of them is running. C# also supports background threads, which don’t keep the application alive on their own – terminating immediately once all foreground threads have ended.
A thread's IsBackground property controls its background status.
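
A minimal sketch of the difference, assuming a worker that would otherwise sleep forever (the class and method names are illustrative):

```csharp
using System;
using System.Threading;

public static class BackgroundDemo
{
    // Starts a thread that never finishes on its own.
    public static Thread StartWorker(bool background)
    {
        Thread worker = new Thread(() => Thread.Sleep(Timeout.Infinite));
        worker.IsBackground = background;   // must be decided by the creator of the thread
        worker.Start();
        return worker;
    }

    public static void Main()
    {
        Thread worker = StartWorker(true);
        Console.WriteLine(worker.IsBackground);
        // Main returns here. The only remaining thread is a background thread,
        // so the process exits instead of waiting for it.
    }
}
```

Calling StartWorker(false) instead would leave a foreground thread running, and the application would stay alive after Main returns.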

Thread Priority
A thread’s Priority property determines how much execution time it gets relative to other active threads in the same process, on the following scale:

enum ThreadPriority { Lowest, BelowNormal, Normal, AboveNormal, Highest }

Locking and Thread Safety
Locking enforces exclusive access, and is used to ensure that only one thread can enter particular sections of code at a time. For example, consider the following class:
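
A sketch of such a class, reconstructed from the description that follows. The names Go, val1 and val2 come from the text; the class name ThreadUnsafe and the initial field values are assumptions.

```csharp
using System;

public class ThreadUnsafe
{
    static int val1 = 1, val2 = 1;

    public static void Go()
    {
        // Another thread may set val2 to zero between this check
        // and the division on the same line.
        if (val2 != 0) Console.WriteLine(val1 / val2);
        val2 = 0;
    }
}
```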

This is not thread-safe: if Go was called by two threads simultaneously it would be possible to get a division by zero error – because val2 could be set to zero in one thread right as the other thread was in between executing the if statement and Console.WriteLine.
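
One way to fix this is to wrap the body of Go in a lock statement. The sketch below assumes a private synchronizing object named locker; the class name ThreadSafe and the Main method that starts four threads are illustrative additions.

```csharp
using System;
using System.Threading;

public class ThreadSafe
{
    static readonly object locker = new object();
    static int val1 = 1, val2 = 1;

    public static void Go()
    {
        lock (locker)   // only one thread at a time can be inside this block
        {
            if (val2 != 0) Console.WriteLine(val1 / val2);
            val2 = 0;
        }
    }

    public static void Main()
    {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(Go);
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
    }
}
```

Whichever thread takes the lock first prints the quotient and zeroes val2; the others then find val2 equal to zero and skip the division.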

Only one thread can lock the synchronizing object (in this case locker) at a time, and any contending threads are blocked until the lock is released. If more than one thread contends for the lock, they are queued on a “ready queue” and are generally granted the lock on a first-come, first-served basis as it becomes available, although strict ordering is not guaranteed. Exclusive locks are sometimes said to enforce serialized access to whatever's protected by the lock, because one thread's access cannot overlap with that of another. In this case, we're protecting the logic inside the Go method, as well as the fields val1 and val2.

Thread State
Below is the thread state diagram. One can query a thread's execution status via its ThreadState property.
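
As a small illustration of querying the property, the sketch below prints a thread's state before it is started and after it has ended. The class name and the 100 ms sleep are arbitrary choices for the example.

```csharp
using System;
using System.Threading;

public static class ThreadStateDemo
{
    public static void Main()
    {
        Thread t = new Thread(() => Thread.Sleep(100));
        Console.WriteLine(t.ThreadState);   // Unstarted: Start has not been called yet
        t.Start();
        t.Join();                           // wait for the thread to end
        Console.WriteLine(t.ThreadState);   // Stopped: the thread has finished
    }
}
```

ThreadState is a flags enumeration, so a running thread can report combined values (for example, Background together with WaitSleepJoin).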

Tuesday, July 7, 2009

UDP Socket Programming in C#

This article is about UDP socket programming. You will find lots of articles and blogs on TCP sockets, but not many informative ones on UDP sockets. We use UDP when resources are limited, when we need to support a large number of clients, or in fire-and-forget scenarios. UDP sockets are pretty simple, but writing efficient UDP socket code isn't. In this article, I will discuss the advantages and disadvantages of using UDP. You will also see sample application code.

UDP or TCP?
TCP maintains a session, and guarantees reliability. UDP provides no such features. With UDP, no connection is maintained. If a packet is lost using UDP, the application must detect and remedy the situation.
Using UDP a single packet can be sent to multiple machines at once using multicast and broadcast. Multicast can be used to send a message even if we do not know the IP address of the target machine.
Since TCP guarantees that data will be processed by the server in the order it was sent, packet loss on a TCP connection stops the processing of any data until the lost packet is received successfully. For some applications this behavior is not acceptable, but others can proceed without the missing packet. For example, the loss of one packet in a broadcast video should not cause a delay because the application should just play the next frame.
When packet loss is detected, TCP slows down the rate of outgoing data on a connection. This can result in slow transmission rates on networks with high packet loss.
On the other hand, TCP has distinct advantages in simplicity of use and implementation. Many security threats are resolved in the TCP stack. Also, firewalls cannot easily identify and manage UDP traffic.

Let's go through the code implementing a simple UDP receive and send.
-------------------------------------------------------------------------------
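//Requires: using System; using System.Net; using System.Net.Sockets;
//serverSocket, byteData and epSender are assumed to be fields of the
//enclosing class, so that OnReceive (shown later) can reach them
serverSocket = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);

//Listen on any local IP address of the machine, on port number 1000
IPEndPoint ipeServer = new IPEndPoint(IPAddress.Any, 1000);

//Bind this address to the server
serverSocket.Bind(ipeServer);

IPEndPoint ipeSender = new IPEndPoint(IPAddress.Any, 0);

//The epSender identifies the incoming clients
epSender = (EndPoint) ipeSender;

//Buffer for the incoming datagram (the size of 1024 bytes is an arbitrary choice)
byteData = new byte[1024];

//Start receiving data
serverSocket.BeginReceiveFrom(byteData, 0, byteData.Length,
    SocketFlags.None, ref epSender,
    new AsyncCallback(OnReceive), byteData);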

-------------------------------------------------------------------------

With IPAddress.Any, we specify that the server listens on all of the machine's network interfaces, so it can accept client requests sent to any of its IP addresses. To listen on one particular IP address only, we can use IPAddress.Parse ("IP here"). The Bind function then binds the serverSocket to the specified IP address and port (this is the port on which the server is listening). The epSender gives the endpoint of the client from which data is received. Note that epSender is passed by reference to the BeginReceiveFrom method.

With BeginReceiveFrom, we start receiving data asynchronously. The OnReceive method gets called when a UDP message is received. Look at the definition of the OnReceive method below:
-------------------------------------------------------------------------------
void OnReceive(IAsyncResult result)
{
    EndPoint remoteEndPoint = new IPEndPoint(IPAddress.Any, 0);
    try
    {
        //Complete the pending receive; remoteEndPoint is set to the sender's endpoint
        int bytesRead = serverSocket.EndReceiveFrom(result, ref remoteEndPoint);
        byteData = (byte[])result.AsyncState;

        if (bytesRead > 0)
        {
            //Start receiving data again
            serverSocket.BeginReceiveFrom(byteData, 0, byteData.Length,
                SocketFlags.None, ref epSender,
                new AsyncCallback(OnReceive), byteData);
        }
    }
    catch (SocketException e)
    {
        Console.WriteLine("Error: {0} {1}", e.ErrorCode, e.Message);
    }
}

-------------------------------------------------------------------------------
We passed byteData as the last parameter of the BeginReceiveFrom method. This byte array is then available through the AsyncState property of the IAsyncResult. We can pass any object as this state parameter of BeginReceiveFrom and retrieve it from the IAsyncResult in the callback.

We can handle the retrieved byte array and use it for processing on the server side. Once all processing is done we can send either a multicast message or send a response to each individual client which sent a request.

Sending a multicast message:
-------------------------------------------------------------------------------
//The group must be a multicast address (224.0.0.0 to 239.255.255.255)
IPAddress multicastGroup = IPAddress.Parse("IP here");

const int ProtocolPort = 3001;

Socket sendSocket = new Socket(AddressFamily.InterNetwork,
    SocketType.Dgram, ProtocolType.Udp);

EndPoint sendEndPoint = new IPEndPoint(multicastGroup, ProtocolPort);

//buffer holds the message bytes; bufferUsed is the number of bytes to send
sendSocket.SendTo(buffer, bufferUsed, SocketFlags.None, sendEndPoint);

-------------------------------------------------------------------------------

Sending response to each individual client:
-------------------------------------------------------------------------------
Socket requestSocket = new Socket(AddressFamily.InterNetwork,
    SocketType.Dgram, ProtocolType.Udp);

//remoteEndPoint is the sender's endpoint obtained in OnReceive
EndPoint requestEndPointDestination = new IPEndPoint(((IPEndPoint)remoteEndPoint).Address, request.responsePort);

requestSocket.SendTo(byteData, byteData.Length,
    SocketFlags.None, requestEndPointDestination);

-------------------------------------------------------------------------------
Notice that when we send a response to an individual client, we are able to retrieve the IP address of the client and the port on which it is listening. We send the response to the same IP address and port. We can specify a different port as required.
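
To see the send and receive calls working together, here is a minimal self-contained sketch that uses the blocking SendTo and ReceiveFrom methods over the loopback interface. The class and method names, the 1024-byte buffer and the two-second timeout are illustrative choices, not part of the application above.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class UdpLoopbackDemo
{
    // Sends one datagram to a local receiver and returns what the receiver read.
    public static string RoundTrip(string message)
    {
        using (Socket receiver = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
        using (Socket sender = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
        {
            // Port 0 lets the operating system pick a free port.
            receiver.Bind(new IPEndPoint(IPAddress.Loopback, 0));
            // Give up after two seconds instead of blocking forever if the datagram is lost.
            receiver.ReceiveTimeout = 2000;

            byte[] payload = Encoding.UTF8.GetBytes(message);
            sender.SendTo(payload, payload.Length, SocketFlags.None, receiver.LocalEndPoint);

            byte[] buffer = new byte[1024];
            EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
            // ReceiveFrom fills in remote with the sender's address and port.
            int bytesRead = receiver.ReceiveFrom(buffer, ref remote);
            return Encoding.UTF8.GetString(buffer, 0, bytesRead);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("hello over UDP"));
    }
}
```

Because UDP gives no delivery guarantee, the receive timeout is what keeps this sketch from hanging; a real application would decide for itself whether to retry or give up.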

The above code demonstrated how to receive and send UDP messages. I would also like to highlight that we might face a lot of performance problems with UDP. Please keep the points below in mind if performance is a criterion:
- Turn off the firewall on the machine on which the UDP application is running.
- An async method called BeginSendTo is provided in sockets. Use this method appropriately.
- The synchronous SendTo method gives better performance.
- Spawning threads for various reasons eats up a lot of time. Avoid using threads wherever possible.
- Use an appropriate buffer size.

Saturday, June 27, 2009

Cloud Computing

Cloud computing is a general term for anything that involves delivering hosted services over the Internet. The name was inspired by the cloud symbol which is often used to represent the Internet in flow charts and diagrams. Cloud computing comes into the picture when you think about a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software.

A cloud service has three distinct characteristics: it is sold on demand, typically by the minute or the hour; a user can have as much or as little of a service as they want at any given time; and the service is fully managed by the provider (the consumer needs nothing but a personal computer and Internet access). Significant improvements in distributed computing, improved access to high-speed Internet and a weak economy have accelerated interest in cloud computing.

A cloud can be private or public. A public cloud sells services to anyone on the Internet. (Currently, Amazon Web Services is the largest public cloud provider.) A private cloud is a network or a data center that supplies services to a limited number of people. When a service provider uses public cloud resources to create their private cloud, the result is called a virtual private cloud. The goal of cloud computing is to provide easy, scalable access to computing resources and IT services.

Cloud computing services are broadly divided into three categories: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS).

Infrastructure-as-a-Service like Amazon Web Services provides virtual server instances with unique IP addresses and blocks of storage on demand. Customers use the provider's API to start, stop, access and configure their virtual servers and storage. Cloud computing allows a company to pay for only as much capacity as is needed. This pay-for-what-you-use model resembles the way electricity, fuel and water are consumed and is sometimes referred to as utility computing.

Platform-as-a-Service in the cloud is defined as a set of software and product development tools hosted on the provider's infrastructure. Developers create applications on the provider's platform over the Internet. Force.com and Google App Engine are examples of PaaS. Currently, some providers do not allow software created by their customers to be moved off the provider's platform.

In the Software-as-a-Service cloud model, the vendor supplies the hardware infrastructure and the software product, and interacts with the user through a front-end portal. SaaS is a very broad market. Services can be anything from Web-based email to inventory control and database processing. Because the service provider hosts both the application and the data, the end user is free to use the service from anywhere.

Here are some advantages and disadvantages of cloud computing:
Cloud computing is a type of on-demand hosting service on the Internet. It increases efficiency, is scalable, and lowers expenses. But the monetary savings may be misleading to consumers and businesses who do not fully understand the potential risks involved.

With a pay-as-you-go type structure, users are only charged for the amount of traffic, bandwidth, and memory used. Online businesses become more efficient by only utilizing the storage and space needed, while also being assured capacity for any usage increases. Cloud computing has attracted diverse customers, from popular social networking sites such as Twitter and Facebook, to educational websites of Arizona State and Northwestern University.

Although many of the risks of cloud hosting are the same as those of dedicated servers, cloud computing carries inherent risks of its own: personally identifiable information can be distorted, the specific location of data is unknown, and any issues are especially difficult to investigate because customers share their hosting space.

Because of the high risks involved in cloud computing, a financial services group has taken the initiative of offering broad technology and cyber risk insurance to lessen the exposures of data loss, network interruptions, and technology failure in general.

Friday, June 26, 2009

Bing's Bang!


Microsoft’s new search engine, Bing, is up and live. Searching on Google or Yahoo can sometimes be a frustrating experience, with time-consuming browsing through millions of results to find exactly what you want. Most of us stop after the second page anyway, but the question remains: can Bing deliver better results? Will the Bing user interface win more users?

The most fascinating things about Bing are its name and its user interface. These, along with the $80 million in marketing that Microsoft is investing, will without doubt attract brand new visitors to this search engine, and get some who had previously given up on Live Search to take another look. Results with Bing have good relevancy, and the new features it has to offer might get a few of them hooked. But if you’re expecting Bing to be a Google-killer, change your expectations.

Below is a summary of some new features:

Bing, like Live Search before it, has a variety of specialized search engines. The homepage is crisp and clear and the images change daily.
The result links show a short summary or excerpt from the page on mouse hover.



How does Bing perform? The first impression is that it's impressively fast. Basic searching is not a problem.

The related searches pane works well when searching for vast topics.

The links at the top are much like Google's, with options to search for images, videos and maps. We also have direct links to MSN and Windows Live.



A very cool feature in the maps section is the driving directions category, which gives us step-by-step directions from one place to another.



The Extras link at the top right corner takes us to blogs or lets us specify our search preferences.

However, some very clever touches are evident: the Images section is much more advanced than Google's. It also lets you filter your images by size, shape, color, style and people.



The video search is also excellent: it lets you filter by length, screen size, resolution and source. You can play back videos from within Bing, and quickly preview and discover the one you're after.



And a few of the negatives are:
On Google, any encyclopaedic search shows wiki results right at the top, but this isn't the case with Bing.

Searching for local businesses only seems to bring up a map, phone number and other information.

Bing doesn't do so well with natural language searches, and it doesn't present you with extended information. "Time in London?", for example, just gives you search results for sites that will tell you the time, rather than telling you directly.
Just typing in weather, on the other hand, does work, which is another win for Bing.

The "More" link at the top is useless. Clicking on it just shows links to Images, Videos, Maps etc. again.

Windows Live is still under the search bar on the MSN homepage, even though Live.com now redirects to Bing.

So as you see, sometimes Google beats Bing and sometimes Bing comes out on top! Make your choice...