Developing on Google App Engine for Production

If you’re considering App Engine as platform for your next big thing, here is potpourri of observations that you might find worth reading. This is not tutorial, and basic App Engine hands-on experience is required. Stuff here is written from experiences in Python environment, for Java mileage may vary. There is also lots of functionality that is not covered here because I didn’t personally use them or they are otherwise well documented.

Queries and Indexes

Applications can use basically only two queries: Get data entity by key or get data entities by range.  In ranged query key can be property value, or composite of property values. Anything that needs more than one filter property and specific order will need composite index definition.

For key ordered queries App Engine supports self-merge for table, but in real life it doesn’t work always very far as when number of entities grow you may eventually hit the error “NeedIndexError: The built-in indices are not efficient enough for this query and your data. Please add a composite index for this query.”. This means that some values are too sparse to filter data efficiently. e.g. you have 30 000 entities and one of the properties you’re querying is boolean flag that is either True or False to every entity.

App engine uses composite keys for  building indexes for  queries that need specific order. Be careful when combining more than one list property values in composite index,  App Engine will build permutations of all key values and even with modest lists you end up with hundreds of index entries.

For example define model with two list properties and timestamp

class MyModel(db.Model):
   created = db.DateTimeProperty()
   list1 = db.StringListProperty()
   list2 = db.ListProperty(int)

Define composite index for the model

- kind: MyModel
  properties:
  - name: list1
  - name: list2
  - name: created
    direction: desc

Put entity

m = MyModel()
m.list1 = ["cat", "dog", "pig", "angry", "bird"]
m.list2 = [1, 2, 3]
m.put()

This would create following reverse and custom index entries

  • 5 for each item in list1
  • 3 for each item in list2
  • 1 for created
  • 5 * 3 = 15 entries for permutations (cat, 1, created), (cat, 2, created), (cat, 3, created), (dog, 1, created), (dog, 2, created), …

Total 15 + 1 + 3 + 5 = 24 entries. This is not much in the example, but if grows exponentially when number of list entries and indexes increases. 3 lists in index each having 10 values would mean 10^3 = 1000 index entries.

Maximum number of index entries is 5000 per entity, and this is shared with implicit reverse property index and explicit custom indexes. For example if you have listproperty that you use in custom index, it can have at maximum ~2500 values because the implicit reverse index will take 2500 and the custom index rest 2500 totalling 5000.

Remember to set indexed=false in property definition if you don’t need to query against property, this saves both space and CPU.

Query latencies are pretty ok,  ~100ms for few dozen entities and you can use IN-operator to make parallel queries. Just make sure that your ‘IN’ queries  do not return lots of overlapping results as that can hurt performance. Direct get by key latencies are very good. (~20ms). Naturally latency increases linearly if your objects are very large, especially with long listproperties.

Text search is in App Engine roadmap and under development. Meanwhile  you can make simple startswith queries against single property or list of strings. Queries are identical in both cases.
Single property startswith query

class User(db.Model):
   name = db.StringProperty()

users = User.all().filter('name >=', query).filter('name <', query + '\ufffd').fetch()

Listproperty startswith query

class SomeModel(db.Model):
   keywords = db.StringListProperty()

models = SomeModel.all().filter('keywords >=', query).filter('keywords <', query + '\ufffd').fetch()

Note that in latter the sort order may not what you wish for as you must sort always first by first inequality filter property, in this example keywords. Just keep in mind the index growth when you add more properties to the query.

Soft memory limit is < 200MB that is reached easily if you’ve large entities, don’t rely that you can do lots of in-memory sorting. Especially Python memory overhead is pretty big. As rule of thumb you can manipulate ~15000 properties per call. (e.g. 1000 entities each having 15 properties). Each element in listproperty is counted as property.

You’ll see often DeadLineExceedError in the logs, nothing you can do to these except to use highly replicated datastore. Just note that it has much higher CPU cost. Curiously frequency of these errors seem pretty constant and independent of the load. Maybe App Engine gives more priority to more popular apps.

Quotas

Depends lot of your application, but at least in my experience the CPU is limiting factor for most of the use cases. This is mainly because you need to do most of the work when inserting new objects instead of when querying them, so even rarely used queries will cost you in every single insert. Queries needs indexes and storing entities with indexes cost API CPU time. Both your own application execution and the API (DB, etc..) execution time is counted in your quota. Be sure to measure and estimate your costs. Putting entities with very large listproperties that use custom indexes can easily cost 25-60seconds of CPU time per entity.

In case combined CPU time (app + api) grows large enough (> ~1000ms) App Engine will warn you in logs that the call uses high amount of CPU and may run over it’s quota. Curiously it makes this same warning even when you have billing enabled but  it wont’ restrict your app in that case however.

Scalability is Latency

App Engine scalability rules are complex but what mostly matters is your average latency. If you’ve only slow request handler (latency > 500ms) app engine will limit your scalability. It’s not bad to have few slow ones, but make sure that the average is somewhere around ~250ms or less. In worst case App Engine refuses to start new instances and queues new requests to already serving instances thus growing the user perceived request latency. You can observe this from App Engine dashboard log entries showing ‘pending_ms’ times.

Note that cpu time is not same thing as latency, for example these two pieces of code have roughly same CPU cost, but latter has only 1/3 of latency

Slow put

 e1.put()
 e2.put()
 e3.put()

Fast put

db.put([e1, e2, e3])

Slow get

 e1 = db.get(key1)
 e2 = db.get(key2)
 e3 = db.get(key3)

Fast get

ents = db.get([key1, key2, key3])

Parental fetch, aka relation index, aka. parent reference

App Engine DB API does not support partial fetch that can be issue if you have very large listproperties in the entities. It’s possible to achieve something similar by using parent keys. For example if you’ve large number of elements in listproperty, you can make key only query againts that property and fetch only the keys. Then get keys parent value and fetch entity you need.

class Message(db.Model):
  text = db.StringProperty()

class MsgIdx(db.Model)
  recipients = db.ListProperty(db.Key)

msg = Message(text="Hello World")
msg.put()

idx = MsgIndex(key_name='idx', parent=msg)
idx.recipients.append(user1.key())
idx.recipients.append(user2.key())
 ...
idx.put()

Query messages where userX is in recipient list, first get keys

keys = MsgIndex.all(keys_only=True).filter('recipients', userX).fetch(100)

query actual message objects

msg = db.get([k.parent() for k in keys])

In this way you avoid serializing the potentially large recipient list completely.

See Brett Slatkin’s presentation for more details.

Transactions

App Engine DB supports  transactions but it’s not possible to implement global consistency, because transaction can only operate objects in single entity group. For example if you have entity A and B that have no parent keys, you can not operate them both in single transaction. Entity group is all entities with same parent root key, entity without parent key is its own group.

Word of warning, when you use transactions all entities with same parent key are locked for transaction (entity group), in general there should not be more than 1-3 updates per second for single entity group or you’ll get lots of transaction collisions retries that will eat your CPU and increase latency. Collision retries are logged as warnings in App Engine console.

Pre-fetch Referenceproperties

Prefetch referenceproperties before accessing them in sequence.
Bad, will trigger separate DB query for user property each time

class Foo(db.Model):
  user = db.ReferenceProperty(User)

foos = Foo.all().fetch(100)
for f in foo:
  print f.user.name

Good, See prefetch_reprop function implementation here.

foos = Foo.all().fetch(100)
prefetch_refprop(foos,  Foo.user)
for f in foo:
  print f.user.name

This will decrease latency and API CPU time significantly

Debugging and Profiling

Standard Python debugger does not work in App Engine development server, but you can use following wrapper and start dev_appserver.py from command line to get into debugger.

def appe_set_trace():
  import pdb, sys
  debugger = pdb.Pdb(stdin=sys.__stdin__,
  stdout=sys.__stdout__)
  debugger.set_trace(sys._getframe().f_back)

API profiling. Define appengine_config.py in your app and define access stats handler in app.yaml.

def webapp_add_wsgi_middleware(app):
  from google.appengine.ext.appstats import recording
  app = recording.appstats_wsgi_middleware(app)
  return app

- url: /stats.*
  script: $PYTHON_LIB/google/appengine/ext/appstats/ui.py
  login: admin

CPU profiling. Define profiling wrapper that dumps the CPU times to the log.

def real_main():
  # Run the WSGI CGI handler with that application.
  util.run_wsgi_app(application)

def profile_main():
  # This is the main function for profiling
  # We've renamed our main() above to real_main()
  import cProfile, pstats
  prof = cProfile.Profile()
  prof = prof.runctx("real_main()", globals(), locals())
  stream = StringIO.StringIO()
  stats = pstats.Stats(prof, stream=stream)
  stats.sort_stats("cumulative")  # time or cumulative
  stats.print_stats(80)  # 80 = how many to print
  # The rest is optional.
  # stats.print_callees()
  # stats.print_callers()
  logging.info('Profile data:\n%s', stream.getvalue());

if __name__ == '__main__':
  main = profile_main
  #main = real_main
  main()

Task TransientErrors

Task add fails often with transienterror, just retry it once more and you should get rarely failed task adds.

try:
   taskqueue.add(...
except taskqueue.TransientError:
   taskqueue.add(..  # retry once more

Misc

Other things.

  • Static files are not served from application environment, your application can not access them programmatically.
  • Urlfetch service has maximum of 10 sec timeout and can do maximum 10 parallel queries per instance. Queries fail occasionally with application error that is usually caused by server timeout. Queries are done from pretty random source ip’s that are shared by all other  engine apps. You can not override header.
  • Naked domains are not supported (like example.com)
  • Memcache lifetime can very short, mere minutes but if your application is popular App Engine might give more priority. Use multi get and set when ever possible.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: