Showing posts with label c#. Show all posts
Showing posts with label c#. Show all posts

Friday, February 03, 2012

Fast Conversion of Hex String Into Decimal Number

Today's post will be about performance. More specifically about converting hex string into decimal number faster than using built-in .NET Framework methods.

I will compare performance of three methods used to convert hex into decimal number. Two of those methods are built into .NET Framework.

1) Convert.ToInt32(hexNumberString, 16)
2) int.Parse(hexNumber, NumberStyles.HexNumber);
3) Custom method using pre-populated table. Let us call it TableConvert.

Here's the code for the TableConvert class. This class just illustrates the idea behind pre-populated table - it is not production code.

class TableConvert
  {
      static sbyte[] unhex_table =
      { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
       ,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
       ,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
       , 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,-1,-1,-1,-1,-1,-1
       ,-1,10,11,12,13,14,15,-1,-1,-1,-1,-1,-1,-1,-1,-1
       ,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
       ,-1,10,11,12,13,14,15,-1,-1,-1,-1,-1,-1,-1,-1,-1
       ,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
      };
                                 
      public static int Convert(string hexNumber)
      {
          int decValue = unhex_table[(byte)hexNumber[0]];
          for (int i = 1; i < hexNumber.Length; i++)
          {
              decValue *= 16;
              decValue += unhex_table[(byte)hexNumber[i]];
          }
          return decValue;
      }
  }

The approach uses simple technique of pre-populated table of  hex-to-decimal values and some simple math. To measure performance of these three methods I've wrote small test application that performed conversion in the loop and measured time.

Tests were made using Intel Core i7 2.80 GHz, .NET Framework 4.0. Time measurements were made in ticks using Stopwatch class.

Results:
Hex string: FF01

IterationsTableConvertConvert MethodParse Method
100283563
10000213425935915
1000000191935252252433490

Hex string: 4AB201
IterationsTableConvertConvert MethodParse Method
100262747
10000193027754284
1000000192801269016481308

Conclusions
Not surprisingly Parse method has the worst performance of all methods. While TableConvert method is the fastest. Usually about 1.2 - 1.4 times faster (20%-40%) than Convert Method.

If you ever happen to do a lot of  convert operations of hex string into decimal number and want to perform it as fast as possible - you can use TableConvert method.

Sunday, July 18, 2010

Fastest Way To Retrieve Custom Attributes for a Type Member

In my previous posts (Performance Issues When Comparing Strings in .NET and When string.ToLower() is Evil) string related operations were discussed.

In this post we'll examine performance issues when querying for type member's custom attributes.
Let us define two attributes and a class. Class will have its single method decorated with an attribute. Here's the code:

class FooAttribute : Attribute
{ }

class BarAttribute : FooAttribute
{ }

class Item
{
    [Bar]
    public int Action()
    {
        return 0;
    }
}
Now the question is what is the fastest way to check Action method for Bar custom attribute. Custom attributes can be queried using instance of a type that implements ICustomAttributeProvider interface. In our case we shall use Assembly class and MethodInfo.

The code below queries custom attributes using Assembly class and then using MethodInfo instance. Query operation executes 10000 times and duration is measured using Stopwatch class. Code below also measures time required to check if attribute is applied.

int count = 10000;
Type tBar = typeof(Item);
MethodInfo mInfo = tBar.GetMethod("Action");
//warm up
mInfo.IsDefined(typeof(FooAttribute), true);
object[] attribs = null;
Stopwatch sw = new Stopwatch();

sw.Start();
for (int i = 0; i < count; i++)
{
 attribs = Attribute.GetCustomAttributes(mInfo, typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("Attribute(specific): {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();

sw.Start();
for (int i = 0; i < count; i++)
{
 attribs = mInfo.GetCustomAttributes(typeof(FooAttribute), true);
}
sw.Stop();
Console.WriteLine("MethodInfo: {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();

sw.Start();
for (int i = 0; i < count; i++)
{
 attribs = Attribute.GetCustomAttributes(typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("Attribute(general): {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();
   
sw.Start();
for (int i = 0; i < count; i++)
{
 Attribute.IsDefined(mInfo, typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("Attribute::IdDefined: {0}", sw.ElapsedMilliseconds);
sw.Reset();

sw.Start();
for (int i = 0; i < count; i++)
{
 mInfo.IsDefined(typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("MethodInfo::IdDefined: {0}", sw.ElapsedMilliseconds);
sw.Reset();
Code above produces the output:
Attribute(specific): 137, Found: 1
MethodInfo: 130, Found: 1
Attribute(general): 569, Found: 1
Attribute::IdDefined: 40
MethodInfo::IdDefined: 33
Results indicate that the fastest method is querying custom attributes via MethodInfo class. To generalize the results above we can say that the fastest way to get custom attributes - is to use the closest reflection equivalent of type member. (e.g. Method - MethodInfo, Property - PropertyInfo etc)

Last two results show the time of IsDefined operation. Use this operation in cases when only a check is needed whether attribute is applied to a type member.

Thursday, July 08, 2010

Type inference in generic methods

Did you know that in .NET generic methods have type inference? It can also be named as implicit typing.

Let's see how type inference looks in code. In the sample below there is a class with generic methods

class NonGenericType
    {
        public static int GenericMethod1<TValue>(TValue p1, int p2)
        {
            Console.WriteLine(p1.GetType().Name);
            return default(int);
        }

        public static TValue GenericMethod2<TValue>(int p1, TValue p2)
        {
            Console.WriteLine(p2.GetType().Name);
            return default(TValue);
        }

        public static TReturn GenericMethod3<TValue, TReturn>(int p1, TValue p2)
        {
            Console.WriteLine(p2.GetType().Name);
            Console.WriteLine(typeof(TReturn).Name);
            return default(TReturn);
        }
    }
Here's the traditional way of using the above defined methods:
NonGenericType.GenericMethod1<string>("test", 5);
NonGenericType.GenericMethod2<double>(1, 0.5);
Nothing fancy here, we specify what type to place instead of TValue type parameter.
Type inference gives us the possibility to omit directly specifying type parameters. Instead we just use methods as if they're non generic.
NonGenericType.GenericMethod1("test", 5);
NonGenericType.GenericMethod2(1, 0.5);
Type inference can become handy as it reduces typing, but in my opinion it makes code less readable.
Also type inference cannot "guess" the return type of the method:
NonGenericType.GenericMethod3(1, 0.5);
 //error CS0411: The type arguments for method 'TypeInference.NonGenericType.GenericMethod3<TValue,TReturn>(int, TValue)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
Nice explanation why inference does not work in the scenario above was given by Eric Lippert
Happy coding :)

Friday, June 18, 2010

Thread Safe Collection Iteration Techniques

Under multithreaded environment every operation should be tested and analyzed from the viewpoint of thread-safety. That is check every data structure what will happen if it is accessed/changed from multiple threads

Imagine, we need to iterate over a collection of items and perform some actions over each item of the collection. Since we're talking about threading - iteration should be done in a thread safe way. That is while we are iterating over collection no other thread is allowed to add or remove items from it.
No problemo! you may think - do the iteration under a lock.

But it is not that simple.

Code sample below illustrates two approaches how to do the iteration. Both have pros and cons. More on that after the code sample.

int initialItems = 5;
ICollection<string> coll = new List<string>(initialItems);

for(int i = 1; i <= initialItems; i++)
 coll.Add("item" + i.ToString());
   
//#1 iterating with lock approach
lock(coll)
{
 foreach(string item in coll)
 {
  PerformWorkWithItem(item);    
 }
}
//

//#2 iteration over a copy 
ICollection<string> copyColl = null;
lock(coll)
{
 copyColl = new List<string>(coll);
}

foreach(string item in copyColl)
{
 PerformWorkWithItem(item);    
}
//    

void PerformWorkWithItem(string item)
{
 //
 // perform operations that can take some 
 // considereable amount of time     
 //
}

Welcome back.

Approach #1 uses global lock for iteration. That means that while iterating collection is protected by the lock.
The pros are:
  • simplicity (just put the lock and do the job)
  • Memory efficiency - no new object are constructed
The cons are:
  • if PerformWorkWithItem takes long time to complete or is blocking (i.e. reading data from the network) access to collection is blocked for considerable amount of time
  • action with a collection item is also protected by the lock

Approach #2 uses different technique. It locks access to the collection only to perform a copy (snapshot) of the original collection. Iteration and PerformWorkWithItem action is made over a snapshot and is not protected by the lock.
The pros are:
  • Operations on collection items are done without locking the collection. If PerformWorkWithItem takes long time to complete original collection is not locked as in #1
  • Allows to schedule actions on collection items using separate thread
The cons are
  • If original collection is large enough performing data copy can become inefficient
  • Add complexity. While performing actions on snapshot items of the original collection may have been already changed.

Now that we know pros and cons of these two approaches we can deduce some hints that can help choose appropriate technique.

For instance, if PerformWorkWithItem action is relatively fast and there is no problem for the rest of the application to wait for iteration process then approach #1 is the best.

On the other hand if PerformWorkWithItem can take considerable amount of time and other parts of the application frequently access the collection (i.e. it is not desirable to block access to the collection for a long time) then #2 can do.

P.S. There also exists an approach #3. It utilizes lock-free data structures. But it is a whole new story and a topic for separate post.

Friday, April 16, 2010

Refactoring code with lambda expressions

Without much ado lets go straight to the code that needs to be refactored:

bool SomeMethod(long param)
{
   //
   // some prefix code
   //
   try
   {
      //do specific job here
      return DoSpecificJob(param);
   }
   finally
   {
      //
      // some suffix code
      //
   }
}

Result SomeOtherMethod(string name, int count)
{
   //
   // some prefix code
   //
   try
   {
      return DoOtherSpecificJob(name, count);
   }
   finally
   {
      //
      // some suffix code
      //
   }
}
The question is how we can bring prefix and suffix code from the example above in one place (method) without changing code logic.
The goal is to have these two methods rewritten like this:
bool SomeMethod(long param)
{
   return DoSpecificJob(param);
}

Result SomeOtherMethod(string name, int count)
{
   return DoOtherSpecificJob(name, count);
}
While another method will be created that executes prefix and suffix code.

There are several ways how to do that:
1. Create method that contains prefix and suffix code, accepts Delegate object and params object[] array
object ExecuteCommon(Delegate d, params object[] args)
{
   //
   // prefix code
   //
   try { return d.DynamicInvoke(args); }
   finally
   {
      //
      // suffix code
      //
   }
}

bool SomeMethod_First(long param)
{
   Delegate d = new Func<long, bool>((b) => SomeMethod(b));
   return (bool)ExecuteCommon(d, new object[] {param});
}

Result SomeOtherMethod_First(string name, int count)
{
   Delegate d = new Func<string, int, Result>((n, c) => SomeOtherMethod(n, c));
   return (Result)ExecuteCommon(d, new object[] { name, count });
}
The approach looks nice but has several caveats. The problems here are: boxing (wrapping value types into reference types) and casting.
2. Move repeated code up on the call stack
This approach is possible if SomeMethod and SomeOtherMethod are on the same call stack level or called from the same method.

3. Create a generic method that accepts generic delegate and defines several parameters
TResult ExecuteCommon<T1,TResult>(Func<T1, TResult> func, T1 param1)
{
   //
   // some prefix code
   //
   try { return func(param1); }
   finally
   {
      //
      // some suffix code
      //
   }
}

TResult ExecuteCommon<T1, T2, TResult>(Func<T1, T2, TResult> func, T1 param1, T2 param2)
{
   //
   // some prefix code
   //
   try { return func(param1, param2); }
   finally
   {
      //
      // some suffix code
      //
   }
}

bool SomeMethod_Third(long param)
{
   return ExecuteCommon<long, bool>((p) => SomeMethod(param), param);
}

Result SomeOtherMethod_Third(string name, int count)
{
   return ExecuteCommon<string, int, Result>(
      (n, c) => SomeOtherMethod(n, c), name, count);
}
This method does not use casting and there is no boxing present when value type parameters are specified. However, the caveat is that you need to define multiple methods with variable type parameters count. In my opinion the third approach is the best although we have to define two methods that execute prefix/suffix code.

P.S. Another way of refactoring here is code generation: emitting code on the fly or using template tools like T4 templates in Visual Studio.

kick it on DotNetKicks.com

Thursday, December 17, 2009

Mono: C# compiler bug with property inheritance

The bug appeared quite unexpectedly.
In Visual Studio code sample below compiled fine. But doing the same with Mono C# compiler results in error: Compiler Error CS0546: 'Derived2.Accessor.set': cannot override because 'Base.Accessor' does not have an overridable set accessor.
It must've been a bug with the compiler I thought to myself. Mono bugzilla search confirmed that bug was there.

Here is the code sample that produces error report (workaround is described under it):

abstract class Base
    {
        public virtual string Accessor
        {
            get { return "base"; }
            set { }
        }
    }

    class Derived1 : Base
    {
        public override string Accessor
        {
            get { return base.Accessor; }
        }
    }

    class Derived2 : Derived1
    {
        public override string Accessor
        {
            set { }
        }
    }
Workaround for this error is to add set property to Derived1 class:
public override string Accessor
        {
            get { return base.Accessor; }
            set { base.Accessor = value; }
        }
Happy coding :)

Thursday, May 07, 2009

Check If Local Port Is Available For TCP Socket

From time to time we need to check if specified port is not occupied. It can be some sort of setup action where we install server product and want to assure that tcp listener will start without any problems.

How, to check if port is busy - start listening on it.

Version #1

bool IsBusy(int port)
{
Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream,
ProtocolType.Tcp);
try
{
socket.Bind(new IPEndPoint(IPAddress.Any, port));
socket.Listen(5);
return false;
} catch { return true; }
finaly { if (socket != null) socket.Close(); }
}
If another process is listening on specified address our code will return false. This will make our code think that port was free while it was not. Remember, we are checking port availability. We need exclusive access to the port.

Version #2
bool IsBusy(int port)
{
Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream,
ProtocolType.Tcp);
try
{
socket.SetSocketOption(SocketOptionLevel.Socket,
SocketOptionName.ExclusiveAddressUse, true);

socket.Bind(new IPEndPoint(IPAddress.Any, port));
socket.Listen(5);
return false;
} catch { return true; }
finaly { if (socket != null) socket.Close(); }
}
This version of the code is much better. It tries to bind the endpoint with exclusive access. If some other process is listening on the port or established connection is bound to the port an exception will be thrown.

And, of course, there is another way how to perform the check. We shall use classes from System.Net.NetworkInformation namespace
bool IsBusy(int port)
{
IPGlobalProperties ipGP = IPGlobalProperties.GetIPGlobalProperties();
IPEndPoint[] endpoints = ipGlobalProperties.GetActiveTcpListeners();
if ( endpoints == null || endpoints.Length == 0 ) return false;
for(int i = 0; i < endpoints.Length; i++)
if ( endpoints[i].Port == port )
return true;
return false;
}

Monday, April 27, 2009

Discovering System Endianess


When doing network programming we send bytes to and from peers. These bytes sometimes constitute complex protocols.

Let us assume we have simple message exchange protocol with some header and some data. Like in the picture here.

Our prefix contains the size of the message. When peer receives bytes from network it reads header first (which is fixed length) then decodes information from the header (data size etc) and finally tries to read the specified number of bytes from the network.

Network applications usually have at least two peers. These peers can be hosted on different systems. Say, peer1 is working on Windows OS, while peer2 - is Java app working in Unix environment.
Our protocol contains integer value that holds message's data size. This value is 4-bytes long.

Windows OS works with numbers that are considered to be little endian, that is least significant byte is placed on the lowest address. Vice versa for Java number. This division is known as endianess.

To transfer multi-byte values over network in uniform manner an agreement was established. Multi-byte data is written into network in big endian byte order.

Earlier I have said that Windows is little endian system, do you really believe me?
If you do not - check yourself, here's how you can do this in C#.

First approach:

int number = 0x00000001;
byte[] bytes = BitConverter.GetBytes(number);
bool isBigEndian = bytes[0] == 0x00;
Second approach (for geeks):
int number = 0x00000001;
int* p = &number;
bool isBigEndian = p[0] == 0x00;
And finally the third one:
bool isBigEndian = !BitConverter.IsLittleEndian;
When you want to write self-contained code, the above approaches can be used to determine endianess of the system your code operates on.

P.S. Third method is the best :).

Monday, November 03, 2008

Bit Flags: The Simple Way


Time from time we face the need or (for some of us) an opprotunity to mess with the bit fields. As we all know bytes consist of bits. Bit can have two values "0" and "1".

Using this knowledge we can store several flags under single byte. In one byte of information we can efficiently store eight one bit flags (1 byte has 8 bits). If we use int values, which has four bytes - we can store 32 bit flags (4*8=32).

So, lets define 8 bit flags. We should receive following picture.

00000001 - 1-st flag
00000010 - 2-nd flag
00000100 - 3-rd flag
00001000 - 4-th flag
00010000 - 5-th flag
00100000 - 6-th flag
01000000 - 7-th flag
10000000 - 8-th flag

There are two ways how to define these flags in C#. Here is the first way: convert each binary number into hexadecimal representation (decimal can be also used). The result will be like this:
[Flags]
public enum FirstApproach : byte
{
First = 0x01,
Second = 0x02,
Third = 0x04,
Forth = 0x08,
Fifth = 0x10,
Sixth = 0x20,
Seventh = 0x40,
Eighth = 0x80,
}
We obtained rather compact definition. But if someone cannot convert binary (01000000) into hex (0x40) number very fast - she will have to use some special tool like calc.exe :). I consider the above mentioned way little bit tiresome.

Here is the second approach: we define "base" at first.
[Flags]
public enum DefBase : byte
{
First = 0,
Second = 1,
Third = 2,
Forth = 3,
Fifth = 4,
Sixth = 5,
Seventh = 6,
Eigth = 7,
}
Then, using "base" we can define our flags:
[Flags]
public enum SecondApproach : byte
{
First = 1 << (byte)DefBase.First,
Second = 1 << (byte)DefBase.Second,
Third = 1 << (byte)DefBase.Third,
Forth = 1 << (byte)DefBase.Forth,
Fifth = 1 << (byte)DefBase.Fifth,
Sixth = 1 << (byte)DefBase.Sixth,
Seventh = 1 << (byte)DefBase.Seventh,
Eighth = 1 << (byte)DefBase. Eigth,
}
Using 2-nd approach does not involve conversion from binary to hex. Using this method we can reuse DefBase multiple times to create required bit flags. Defining next bit flag is no more pain.

If application will have a lot of bit flags declarations then it is more usefull to use 2-nd approach as it can save time of not using additional tools.