Trace Reports for Performance Tuning in Appengine

Speeding up a web app is a really hard job, mostly because it’s hard to find the bottleneck. I want to show some tools that Google Appengine gives you for this job. Actually, I’m going to talk about a combination of two tools that work perfectly together.

The first one is Traces (under the Monitoring tab in Cloud Console). It’s a fairly new tool, and I didn’t pay much attention to it earlier, I just played with it a little. I thought it was just another view of your logs, from the point of view of Appengine APIs:

[Screenshot: GAE Traces, full request details]

It shows you details about each request: which API calls were made, how much time the server spent on them, how much they cost, etc. Pretty useful information, by the way.

More interesting is the «Analysis report» for these traces. It can show you overall stats, the latency distribution over some period of time, bottlenecks with sample traces, etc. You can get the logs for the slowest requests and see what’s going on there.

But what makes it really great is how well it works with another feature, «Traffic Splitting». This is an old feature that allows you to run two different versions of an app for different users. It’s mostly used for marketing, where it’s also called “A/B testing”.

But it turns out it’s a great thing for performance tuning too, namely for validating whether your changes actually improve anything. With Traffic Splitting + Traces you can compare two versions of the same app, under similar load, on real data. You select two versions and a URL to analyze, and it builds a really good comparison report that shows how much your new code improved app speed, with overall stats, etc.

Basically, you have the existing code deployed plus a modified version, and you set up a traffic split, 80/20 for example, by clicking “Enable Traffic Splitting” on the Appengine Version tab:

[Screenshot: GAE Traces, adding a traffic split]
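To make the idea of an 80/20 split more concrete, here is a minimal sketch of how a weighted split could assign users to versions. It is only an illustration: `pick_version`, `new_share` and the user IDs are hypothetical names, and this is not how Appengine implements Traffic Splitting internally.

```python
import hashlib

def pick_version(user_id: str, new_share: float = 0.2) -> str:
    """Return 'new' for roughly new_share of users, 'old' for the rest."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "new" if bucket < new_share else "old"

# Hypothetical users, just to check that the split is close to 80/20.
users = [f"user-{i}" for i in range(10_000)]
share = sum(pick_version(u) == "new" for u in users) / len(users)
print(f"share routed to the new version: {share:.3f}")
```

Hashing the user ID keeps the assignment stable, so the same user always lands on the same version, which matters if you want both versions to see comparable traffic.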

Then you have to wait until you get enough requests to analyze: the minimum is 100 requests to the target URLs for each of these versions. It isn’t limited to the timeframe when both versions are deployed. You just need these 100 requests over the lifetime of an app version, so most likely the existing code already has enough data and you only need to gather data for the new version. Once you have enough data, you can run the Analysis Report and get a very nice report:

[Screenshot: GAE Traces, version comparison report]

As you can see from this report, latency improved a lot: for 90% of requests it dropped from 3487 ms to 804 ms, which is about 77% lower.
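The arithmetic behind that number is easy to reproduce. Below is a small sketch, assuming you have raw latency samples for both versions. The function names, the nearest-rank percentile method and the sample lists are my own assumptions for illustration; the report itself may compute percentiles differently.

```python
import math

MIN_REQUESTS = 100  # minimum sample size per version, as mentioned above

def percentile(latencies_ms, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def improvement(old_ms, new_ms):
    """Latency reduction in percent; a negative value means the new version is slower."""
    return (old_ms - new_ms) / old_ms * 100

def compare(old_latencies, new_latencies, pct=90):
    """Return (old percentile, new percentile, improvement %), or None if data is insufficient."""
    if min(len(old_latencies), len(new_latencies)) < MIN_REQUESTS:
        return None
    old_p = percentile(old_latencies, pct)
    new_p = percentile(new_latencies, pct)
    return old_p, new_p, improvement(old_p, new_p)

# The two 90th percentile values from the report above.
print(f"{improvement(3487, 804):.1f}% lower latency")  # 76.9% lower latency
```

Note that `improvement` goes negative when the new version is slower, so the same calculation also catches a regression.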

That gives you real numbers: not just «ok, this extra caching should help», but a real «90% of requests are taking 10% less time» (or maybe «7% more time»?). You know, you cannot improve what you cannot measure.


Igor Artamonov

Professional software developer since 2001, writing code since 1995. Data processing for Cloud, Ethereum & Blockchain