Poor man’s performance profiling

If you have been doing development with Xamarin on Android you have probably noticed the total absence of any profiler, there is not even a decent method to get the high resolution time. Sure, there is Environment.TickCount but the resolution is at best in milliseconds. Pretty useless for performance profiling.

Rarely people use performance profilers on the desktop. Mostly because their applications don’t have complicated stuff and the modern CPUs are hiding most of the non performing code. However mobile devices are far more delicate and have less CPU power, less RAM, less resources in general. Thus slow code is much more notable and visible to end users. There is no better way to profile your code with a performance profiler. Just take a look at RedGate’s ANTS Performance profiler. Unfortunately there are no such tools for Xamarin Android platform even though mobile code should be really optimized.

I present  my solution for getting some useable metrics from your code (the code you have sources for). The code I’ve written was used for MvvmCross performance profiling hence it slightly relies on it (logging, IoC) however it can be easily made standalone if necessary.

Getting high resolution time

Java has java.lang.System.nanoTime() method that returns the highest resolution time available on Android. Xamarin doesn’t provide it through .net, you have to write a wrapper by yourself using java reflection.

public static class JavaLangSystem
{
    static IntPtr class_ref;
    static IntPtr id_nanoTime;
    public readonly static long Avg;

    static JavaLangSystem()
    {
        class_ref = JNIEnv.FindClass("java/lang/System");
        id_nanoTime = JNIEnv.GetStaticMethodID(class_ref, "nanoTime", "()J");

        for (int i = 0; i < 10; i++)
        {
            NanoTime();
        }

        long start = NanoTime();
        for (int i = 0; i < 1000; i++)
        {
            NanoTime();
        }
        Avg = (NanoTime() - start) / 1000;
MvxTrace.TaggedTrace("metrics", "Avg NanoTime reflection is {0}", Avg); } [Register("nanoTime", "()J", "")] public static long NanoTime() { return JNIEnv.CallStaticLongMethod(class_ref, id_nanoTime); } }

Note that I do some calibration in static constructor because of reflection the call to nanoTime() itself consumes some time. I store the average call time into a static variable Avg that I'll later use.

Accessing high resolution timer through IoC

First part is to create an interface

public interface IHighResolution
{
    long Ticks { get; }
    long Avg { get; }
}

then implement it

public class HighResolution: IHighResolution
{

    #region IHighResolution Members

    public long Ticks
    {
        get { return JavaLangSystem.NanoTime(); }
    }

    public long Avg
    {
        get { return JavaLangSystem.Avg; }
    }

    #endregion
}

and finally register it to IoC provider.

protected override IMvxApplication CreateApp()
{
    Mvx.RegisterType<IHighResolution, HighResolution>();
    return new App();
}

I register it in MvxAndroidSetup derived class (MvvmCross initialization) but if you are not using MvvmCross feel free to register it with IoC provider of your choice on some other initialization place.

This separation comes specially handy if you are to measure PCL assemblies where nanoTime() isn’t accessible.

 

Define class that will store a single measurement

[DebuggerDisplay("{Name} in {Ticks}")]
public class PerfMeasurement
{
    public string Name { get; set; }
    public List<PerfMeasurement> Children;
    public long Ticks { get; set; }

    public int AllChildren
    {
        get
        {
            if (Children == null || Children.Count == 0)
                return 0;
            else
            {
                int sum =0;
                foreach (PerfMeasurement m in Children)
                {
                    sum += 1 + m.AllChildren;
                }
                return sum;
            }
        }
    }
}

There are two interesting features here. First is Children field which holds nested measurements. And the other is AllChildren property that recursively counts all children.

The main measurement class

public static class Metrics
{
    private static readonly IHighResolution highresolution;
    private static readonly long twoAvg;
    
    public static List<PerfMeasurement> RootData { get; private set; }
    private static Stack<PerfMeasurement> stack;
    //         private static int counter;
    
    static Metrics()
    {
        RootData = new List<PerfMeasurement>(100000);
        stack = new Stack<PerfMeasurement>();
        highresolution = Mvx.Resolve<IHighResolution>();
        twoAvg = highresolution.Avg * 2;
    }
    
    public static PerfMeasurement Push(string name)
    {
        PerfMeasurement head;
        if (stack.Count == 0)
        {
            head = new PerfMeasurement { Name = name };
            RootData.Add(head);
        }
        else
        {
            PerfMeasurement previous = stack.Peek();
            head = new PerfMeasurement { Name = name };
            if (previous.Children == null)
                previous.Children = new List<PerfMeasurement>();
            previous.Children.Add(head);
        }
        stack.Push(head);
        return head;
    }
    
    public static void Pull()
    {
        stack.Pop();
        if (stack.Count == 0)
        {
            PerfMeasurement m = RootData[RootData.Count - 1];
            OffsetParents(m);
            Print(m);
        }
    }
    
    private static void OffsetParents(PerfMeasurement m)
    {
        m.Ticks -= m.AllChildren * twoAvg;
        if (m.Children != null)
        {
            foreach (PerfMeasurement child in m.Children)
                OffsetParents(child);
        }
    }
    
    public static void Print(PerfMeasurement metric)
    {
        StringBuilder sb = new StringBuilder();
        Print(metric, sb, 0);
        MvxTrace.TaggedTrace("metrics", sb.ToString());
    }
    
    public static void Print(PerfMeasurement metric, StringBuilder sb, int depth)
    {
        string prepend = new string(' ', depth * 4);
        sb.AppendLine(string.Format("{0}{1} took {2:#,##0}", prepend, metric.Name, metric.Ticks));
        if (metric.Children != null && metric.Children.Count > 0)
        {
            foreach (PerfMeasurement child in metric.Children)
            Print(child, sb, depth + 1);
        }
    }
}

This class acts like a Stack for measurement items defined in earlier one. When item at the root level is pulled it outputs the results of the whole hierarchy linked to it into the (MvvmCross) logging mechanism (Print methods). But before doing that it recursively subtracts the nanoTime() invocation of all children (OffsetParents - for each children it assumes nanoTime() took two average invocations – hence the static field twoAvg).

That’s all we need. However there is one step that will make measurement a bit easier and more clean.

 

Making it easier to use Metrics with a helper class

public class MeasurePerf : IDisposable
{
    private static readonly IHighResolution highresolution;

    private long start;
    private PerfMeasurement m;

    static MeasurePerf()
    {
        highresolution = Mvx.Resolve<IHighResolution>();
    }

    public MeasurePerf(string name)
    {
        m = Metrics.Push(name);
        this.start = highresolution.Ticks;
        
    }

    public void Dispose()
    {
        m.Ticks += highresolution.Ticks - start - highresolution.Avg;
        Metrics.Pull();
    }
}

This class is intended as a wrapper around measured code using using keyword. Measurement looks like:

using (new MeasurePerf("Your comment here"))
{
    // your code here
    // nesting measuring is perfectly fine
}

Here is a sample output I’ve been looking at:

profiler_results

I highlighted the root item. All items are showing their total time (children included). Note the negative numbers. That happens because this kind of measurement isn’t accurate at all (remember, I am subtracting twoAvgs). However it gives you at least some insight of what code is slow and which child is guilty.

So, that’s it. Until Xamarin gives us profilers there is no better way AFAIK. Unfortunately. There is one thing you can do, though. You can tell Xamarin that we need profilers. Perhaps through my uservoice suggestion.

Building an Intel Core i7 based computer – Chassis

The last ingredient is the chassis. Likewise PSU it looks like you don’t need to be careful when picking one. Wrong. Chassis are very different and details are important. Let’s go by features in no significant order.

Weight

Cheaper chassis are mostly build from steel (at best the front cover is made by aluminum but they are mostly from steel) while the more expensive are built from aluminum. Which means lower price and more weight. I am not sure how much the material affects cooling – aluminum chassis should perform better. Since my chassis will sit under my desk and I will (hopefully) rarely move it I am not concerned with weight. The cooling difference should be minimal. Thus steel for me, aluminum is an option.

Classic or tool-less

Tool-less design means that you won’t have to toy with screws to mount drives and add-in cards. Instead you’ll easily use custom made plastic gizmos to fix them. Otherwise you’ll have to use a screwdriver (not that fixing a motherboard is still a screwing experience nevertheless). Not a big deal but this is a good feature. I’ll go for this, not at all costs though (I’ll install hardware only once).

Drives mounting direction

Some chassis support drive mounting from side. This makes mounting and even more important, changing, drives much easier. In classic way you have to push the drive towards wire-filled motherboard to get it in or out. But if you have a side mounting option the drive replacement and mounting is much easier. A must.

Size

Mid-tower for me. No need for big tower.

Front side connectors and their placement

I like to have them on the top of the chassis and I am looking for: 2 (at least) x USB and Firewire. e-Sata is a welcome option.

Front side buttons placement

Having power on and reset buttons located at front side is not a very good idea. They are easily involuntarily kicked when the chassis is positioned under the desk (on the floor) and when the access to them is too easy. Which is something you want try to avoid, otherwise you (or your kids) might reset your computer in middle of the work and data will be lost. Pay attention to buttons accessibility to avoid such unpleasant surprises.

Internal ventilation

Once more the keywords here are 12cm fan and a slow rotation to avoid making too much noise. I like to have two of them, one on the front side and the other on the back side. Front side fan should be an intake while the back side one should be an outtake. So the air moves through and cools entire chassis content. Furthermore the front side one should be placed right in the front of hard driver to provide an adequate air flow. If the chassis doesn’t come with built-in fans it should have at least adequate holes.

Design

I don’t care. It will sit under my desk and I won’t look at it often.

PSU location

PSU is usually located at the top of the back side of the chassis. However the recent trend seems to favor putting it on the bottom to improve the temperature reduction inside the PSU itself. I thought this is a nice idea and why not – cooler PSU means lower noise. But later, when I was building the computer I’ve found that it isn’t such a good idea after all for various reasons (details in next post).

That’s it more or less. There are plenty of different chassis out there from many different manufacturers and picking a good one might take quite some time. Usually I buy chassis made by Cooler Master and this time is no different. At my price range they are good enough.

I’ve bought a Cooler Master RC-690-KKN1-GP from AGT.si for 83 Euros. Note: this chassis has PSU located on the bottom of the chassis. The delivery process from AGT took more than a month because the chassis wasn’t in stock and distributer took quite a long time to deliver.

See other relevant posts:

Introduction
CPU cooler
The goal
Power Supply Unit
Memory
Storage
CPU
Graphics card
Motherboard