Friday, March 29, 2013

c# - StreamReader vs File.ReadLines

Performance testing
Machine configuration: Dell E6420, 4 GB RAM, 32-bit, Core i7 (the core count doesn't matter in this case; a parallel test is planned for later)
Input file: 350+ MB tab-delimited .txt file

StreamReader


using System;
using System.Diagnostics;
using System.IO;

namespace FileReaderPerf
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();
            try
            {
                using (StreamReader oReader = new StreamReader(@"c:\File.txt"))
                {
                    string sLine = oReader.ReadLine();
                    while (sLine != null)
                    {
                        Console.WriteLine(sLine);
                        sLine = oReader.ReadLine();
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            stopWatch.Stop();
            TimeSpan ts = stopWatch.Elapsed;

            // Milliseconds can run to three digits, so pad that field to 000.
            string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:000}",
                ts.Hours, ts.Minutes, ts.Seconds,
                ts.Milliseconds);
            Console.WriteLine("RunTime " + elapsedTime);
            Console.Read();
        }
    }
}

StreamReader using a larger buffer size

// Encoding lives in System.Text, so add "using System.Text;" to the imports.
using (StreamReader oReader = new StreamReader(@"c:\File.txt", Encoding.ASCII, false, 8092))
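
In context, that constructor overload drops into the same read loop. The following is a minimal sketch, not the exact program measured in this post: the `CountLines` helper, the small temporary file, and the 64 KB buffer are illustrative assumptions, and lines are counted instead of printed so that console output does not dominate the timing.

```csharp
using System;
using System.IO;
using System.Text;

namespace FileReaderPerf
{
    public static class BufferedRead
    {
        // Hypothetical helper: reads every line using an explicit buffer size
        // and returns the number of lines seen.
        public static int CountLines(string path, int bufferSize)
        {
            int count = 0;
            // Overload: (path, encoding, detectEncodingFromByteOrderMarks, bufferSize)
            using (StreamReader reader = new StreamReader(path, Encoding.ASCII, false, bufferSize))
            {
                while (reader.ReadLine() != null)
                {
                    count++;
                }
            }
            return count;
        }

        public static void Main(string[] args)
        {
            // Assumption: a three-line temp file stands in for the 350 MB input.
            string path = Path.GetTempFileName();
            File.WriteAllLines(path, new[] { "a\tb", "c\td", "e\tf" });
            Console.WriteLine(CountLines(path, 64 * 1024)); // prints 3
            File.Delete(path);
        }
    }
}
```

Varying the `bufferSize` argument between runs is one way to see whether the buffer matters at all for a given file and disk.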

File.ReadLines

using System;
using System.Diagnostics;
using System.IO;

namespace FileReaderPerf
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();
            try
            {
                foreach (string sLine in File.ReadLines(@"c:\File.txt"))
                {
                    Console.WriteLine(sLine);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
            stopWatch.Stop();
            TimeSpan ts = stopWatch.Elapsed;

            // Milliseconds can run to three digits, so pad that field to 000.
            string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:000}",
                ts.Hours, ts.Minutes, ts.Seconds,
                ts.Milliseconds);
            Console.WriteLine("RunTime " + elapsedTime);
            Console.Read();
        }
    }
}


5 comments:

Anonymous said...

A comparison to FileStream and ReadAllLines would be nice too.

Additionally, you shouldn't print every line as you read it. I found in my own benchmarks that the console output makes the timings vary too much. For a better benchmark, print the results only at the end.

Furthermore, Windows is not a real-time OS, so to get a more accurate result you should run each approach 10 or 100 times in a row within a loop and divide the total time by 10 or 100 to get the average time per read.

Sincerely, another developer :)
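
The two suggestions above (no console output inside the timed loop, and averaging over repeated runs) can be sketched as follows. This is an illustration under stated assumptions and not the code used in the post: the class and method names are made up, the run count of 10 is arbitrary, and a small temporary file is generated when no path is passed on the command line.

```csharp
using System;
using System.Diagnostics;
using System.IO;

namespace FileReaderPerf
{
    public static class AveragedBenchmark
    {
        // Reads the whole file without printing; the line count is returned
        // so the loop has an observable result.
        public static int ReadWithStreamReader(string path)
        {
            int count = 0;
            using (StreamReader reader = new StreamReader(path))
            {
                while (reader.ReadLine() != null) count++;
            }
            return count;
        }

        public static int ReadWithReadLines(string path)
        {
            int count = 0;
            foreach (string line in File.ReadLines(path)) count++;
            return count;
        }

        // Runs the reader `runs` times and returns the average milliseconds per run.
        public static double AverageMilliseconds(Func<string, int> reader, string path, int runs)
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < runs; i++)
            {
                reader(path);
            }
            sw.Stop();
            return sw.Elapsed.TotalMilliseconds / runs;
        }

        public static void Main(string[] args)
        {
            // Assumption: the file path is the first argument; otherwise a small
            // temporary file is generated so the sketch runs on its own.
            bool generated = args.Length == 0;
            string path = generated ? Path.GetTempFileName() : args[0];
            if (generated)
            {
                File.WriteAllLines(path, new[] { "a\tb", "c\td", "e\tf" });
            }

            const int runs = 10;
            double streamReaderMs = AverageMilliseconds(ReadWithStreamReader, path, runs);
            double readLinesMs = AverageMilliseconds(ReadWithReadLines, path, runs);

            // Results are printed only once, after all timing is finished.
            Console.WriteLine("StreamReader:   {0:F2} ms per run", streamReaderMs);
            Console.WriteLine("File.ReadLines: {0:F2} ms per run", readLinesMs);

            if (generated) File.Delete(path);
        }
    }
}
```

After the first run the file is likely to be served from the operating system's cache, so the average mostly reflects parsing cost and not disk speed.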

Anonymous said...

Thanks for the 10x/100x suggestion; I will test that one out.

Unknown said...

You probably shouldn't output the lines as this can have a big effect on your results. Also please do a couple of runs to figure out the uncertainty of the measurements.
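
One way to estimate that uncertainty is to time each run separately, discard a few warm-up runs, and report the mean together with the sample standard deviation. The sketch below assumes that definition of uncertainty; the helper names, the 5 warm-up and 20 measured runs, and the generated 1000-line file are all illustrative choices.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

namespace FileReaderPerf
{
    public static class RunStatistics
    {
        // Times `action` once per measured run, after discarding `warmupRuns` runs.
        public static List<double> TimeRuns(Action action, int warmupRuns, int measuredRuns)
        {
            for (int i = 0; i < warmupRuns; i++) action();

            List<double> samples = new List<double>(measuredRuns);
            for (int i = 0; i < measuredRuns; i++)
            {
                Stopwatch sw = Stopwatch.StartNew();
                action();
                sw.Stop();
                samples.Add(sw.Elapsed.TotalMilliseconds);
            }
            return samples;
        }

        public static double Mean(IList<double> samples)
        {
            return samples.Average();
        }

        // Sample standard deviation (divides by n - 1); needs at least two samples.
        public static double SampleStdDev(IList<double> samples)
        {
            double mean = Mean(samples);
            double sumSquares = samples.Sum(x => (x - mean) * (x - mean));
            return Math.Sqrt(sumSquares / (samples.Count - 1));
        }

        public static void Main(string[] args)
        {
            // Assumption: a generated 1000-line file stands in for real data.
            string path = Path.GetTempFileName();
            File.WriteAllLines(path, Enumerable.Range(0, 1000).Select(i => "line\t" + i));

            List<double> samples = TimeRuns(() =>
            {
                foreach (string line in File.ReadLines(path)) { }
            }, 5, 20);

            Console.WriteLine("File.ReadLines: {0:F2} ms (+/- {1:F2} ms)",
                Mean(samples), SampleStdDev(samples));
            File.Delete(path);
        }
    }
}
```

If the spread is comparable to the difference between two approaches, the measurements do not distinguish them.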

Unknown said...

I just did a little testing myself.

Each test did 105 runs (including 5 warmup runs) on the geonames-cities1000 dataset (about 19.4 MB data).

StreamReader: 72.68 ms (+/- 3.28 ms)
File.ReadLines: 73.19 ms (+/- 3.97 ms)

So File.ReadLines is a bit slower than a raw StreamReader, but not significantly so. This is the expected result, since File.ReadLines simply creates a ReadLinesIterator internally, which uses a StreamReader and does some extra checking along the way. If you want my testing code, just let me know and I'll put it up somewhere.
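
To make that relationship concrete, here is a simplified stand-in for a lazy line iterator built on StreamReader. It is not the framework's actual ReadLinesIterator, which does more checking; `ReadLinesSketch` is a hypothetical name, and this version opens the file only once enumeration begins.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

namespace FileReaderPerf
{
    public static class LazyLines
    {
        // Yields one line at a time; the reader is disposed when enumeration
        // completes or the enumerator is disposed early.
        public static IEnumerable<string> ReadLinesSketch(string path)
        {
            using (StreamReader reader = new StreamReader(path))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    yield return line;
                }
            }
        }

        public static void Main(string[] args)
        {
            // Assumption: a two-line temp file is enough to show the behavior.
            string path = Path.GetTempFileName();
            File.WriteAllLines(path, new[] { "first", "second" });

            foreach (string line in ReadLinesSketch(path))
            {
                Console.WriteLine(line);
            }
            File.Delete(path);
        }
    }
}
```

Because each line passes through one extra iterator step on top of the same `ReadLine` call, a small constant overhead relative to the raw loop is plausible.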

Anonymous said...

For those who are interested in micro-optimization techniques, the fastest way in most cases is the following:

using (StreamReader sr = File.OpenText(fileName))
{
    string s = String.Empty;
    while ((s = sr.ReadLine()) != null)
    {
        // we're just testing read speeds
    }
}

Put up against several other techniques, it won out most of the time, including against the BufferedStream.

Here's the article which benchmarks multiple techniques to determine the fastest way.

http://blogs.davelozinski.com/curiousconsultant/csharp-net-fastest-way-to-read-text-files

Definitely worth a look for those interested in the various speed performances on multiple techniques.
