A Problem I Debugged Yesterday: “Service Degradation”

Over the last few days we saw significant performance degradation in the production environment. Using the tools we rely on to monitor traffic and performance, we tracked down the endpoint responsible and saw that it was making far too many queries to the database. From New Relic (another monitoring tool) we were able to get the query that was generating this noise.
Next, we needed to find the buggy code responsible for the inflated number of database queries. To debug this, we counted the queries made to the database when that endpoint was hit. To do that, we added the following code at the start and end of the API endpoint.
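from django.conf import settings
from django.db import connection

settings.DEBUG = True  # connection.queries is only populated when DEBUG is True

Model.objects.count()  # example query; in practice the endpoint's own code runs here
print(len(connection.queries))  # number of queries recorded so far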

I loaded my local environment with a lot of data and hit the endpoint. Almost 2000 queries were made to the database, which confirmed that we could reproduce the issue locally. My guess was that the endpoint had a buggy piece of code that was querying the database in a loop.

The next challenge was to find that code. My guess was that it was somewhere in the serializer, but to confirm this I printed the length of connection.queries after every other line of the endpoint. The terminal output pointed to a single line: before it executed, only 5 queries had been made to the database, and just after it the count had jumped to 1800.

As I said, I expected the problem to be in the serializer, but when I printed the number of queries in the serializer's constructor I was shocked: by that point 1800 queries had already been made to the database.

I used the Python debugger to trace the request's path from the get_serializer_class function in the endpoint to the serializer constructor. After printing the query count on every line along the way, I found the buggy line in the get_serializer_context function.

In that function we used .only() in a Django query, followed by a for loop. The query fetched only three fields (id, uid and name) as model.objects.only("id", "uid", "name"), but inside the loop we read further attributes from each object: city and postal_code.

So we added city and postal_code to the query above and replaced .only() with .values_list(),
like model.objects.values_list("id", "uid", "name", "city", "postal_code", named=True).

The idea is that .only() still returns a queryset of model objects, which leaves a gate open: any attribute can be read from those objects, and fields that were not listed in .only() are deferred and loaded from the database on first access. Reading city and postal_code in the loop was therefore issuing database queries on every iteration. Converting the query to values_list() or values() closes that gate: the rows are plain tuples or dictionaries, so to read more values we have to add the fields to the query itself.

Django loads deferred fields lazily, so each time we read a field that was left out of .only(), a hit to the database is made:

obj = Model.objects.only("id", "uid", "name").get(id=1)
print(obj.city)  # deferred field: this is one hit to the db
print(obj.postal_code)  # deferred field: this is another hit to the db

When we tested again after the change, the query count had dropped from 1800 to 25. So when a loop only needs to read data, prefer values() or values_list() over only(), or at least make sure every field the loop reads is listed in only().
