Django ORM group by

Filtering in Django on the values of foreign key or many-to-many relations has, as a by-product, the possibility of returning duplicates of the object you are actually filtering on. A simple way to avoid this is to group by the primary key of the model. Django, however, does not expose GROUP BY directly in the ORM. There is, however, a way to make Django add it: by slightly abusing annotation it is possible to add just such a clause.

from django.db.models import Count
obj1.objects.filter(obj2__value=1).annotate(Count('pk'))

This will add the group by on the obj1 primary key and avoid duplicates in the result set.
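To see why the join produces duplicates and why grouping by the primary key removes them, here is a minimal sketch in plain SQL using Python's built-in sqlite3 module. The table layout (obj1, obj2, obj1_id, value) is a hypothetical stand-in for the two models, and the SQL is only an approximation of what the ORM would generate, not its actual output.

```python
import sqlite3

# In-memory database with two hypothetical tables standing in for the models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE obj1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE obj2 (
        id INTEGER PRIMARY KEY,
        obj1_id INTEGER REFERENCES obj1(id),
        value INTEGER
    );
    INSERT INTO obj1 (id, name) VALUES (1, 'first'), (2, 'second');
    -- Two related rows with value = 1 point at the same obj1 row.
    INSERT INTO obj2 (obj1_id, value) VALUES (1, 1), (1, 1), (2, 5);
""")

# Filtering across the relation means joining, so obj1 row 1 matches twice.
join_sql = """
    SELECT obj1.id FROM obj1
    JOIN obj2 ON obj2.obj1_id = obj1.id
    WHERE obj2.value = 1
"""
plain_rows = conn.execute(join_sql).fetchall()

# Grouping by the primary key collapses the duplicates into one row.
grouped_rows = conn.execute(join_sql + " GROUP BY obj1.id").fetchall()

print(plain_rows)
print(grouped_rows)
```

The first query returns the same primary key once per matching related row; the second returns it once.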

Threading local and Django

It is rare for me to find something in Python that does not work as I expect it to. Generally speaking, the way I think seems to match the way Python does things. That's a great advantage to have when working with the language. However, when Python does not act as expected it is hard for me to find out what is actually going on, because I have to try to think in ways alien to me. Today I met just such an issue with threading.

I am currently working on a project which involves a lot of data aggregation. Recently the decision was made to make the application international. Part of the process was creating international databases and splitting the aggregated data across them. Most examples of using multiple databases use either the app or the specific model to determine which database to use. My use case was more difficult, as it involved selecting the database based on the actual data. If the data source was, for instance, from Spain, the Spanish database should be used. The data was streaming in all mangled up, so it was not possible to use the source as a way to distinguish the data. Furthermore, the data was not coming in through a request (where middleware can be used) but was synced in from a remote database via a cron job that then proceeds to aggregate the data.

To set up the database routing I created a router class. The problem is that there is no way to actually pass data to the routing class. At the moment the hints parameter contains only the instance, if it exists, so for a newly created model object it is empty. My idea was to try to use threading.local() to communicate between the aggregation function and the db routing class.

Why I'm not using using

First, however, let me explain why I have not chosen to use using. The problem is that when aggregating, the application creates a few different objects, depending on the data. These model objects are created through proxy analyser classes. Passing the database manually with using would not only involve a lot of code, it would also make debugging difficult.

What didn't work

My first instinct was to think that, in each file where I needed to access the data, I just needed to add the following lines:

import threading
local_storage = threading.local()

And that local storage would be consistent across the whole thread. Unfortunately, it either wasn't, or the code was running on two separate threads, which I don't think it was.

This might be a good time for a disclaimer. I am by no means a threading expert. What I say may be really off base. All I know is what I observed, and what did and did not work for me.

I added this code to both the db routing class and the aggregator. The aggregator would add an attribute to local_storage and the db router would check for it to determine the routing. This attempt failed. After some debugging I found that the object created in the aggregator and the one created in the db router were, in fact, not the same. I figured that if I defined it in one of the two and imported it into the other, then surely it would be the same object. It was not. I think it probably has something to do with how import actually works, but I'm not sure.

I was starting to get frustrated. Googling around didn't really turn up anything significant. I was all but ready to give up on threading.local when I found the django-tools ThreadLocal middleware on GitHub. It was using the mechanism I had in mind to make the request available everywhere. I was quite sure that this code worked (because everything you read on the internet is true, right?). So what was I doing wrong?
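One part of the first attempt can be pinned down with certainty: every call to threading.local() returns a new, independent object, so two modules that each call it never share attributes, even on the same thread. The sketch below demonstrates that, and also shows that attributes set on one thread are invisible to another. The names storage_a, storage_b and database are made up for illustration, and this does not explain why the imported variant failed as well.

```python
import threading

# Two separate calls, as when two modules each run `threading.local()`.
storage_a = threading.local()  # e.g. the one created in the aggregator
storage_b = threading.local()  # e.g. the one created in the db router

storage_a.database = "spain"

# Same thread, but different objects: the attribute is not shared.
a_has_attr = hasattr(storage_a, "database")
b_has_attr = hasattr(storage_b, "database")

# Even the same object looks empty when viewed from another thread.
seen_in_worker = []

def worker():
    seen_in_worker.append(hasattr(storage_a, "database"))

thread = threading.Thread(target=worker)
thread.start()
thread.join()

print(a_has_attr, b_has_attr, seen_in_worker)
```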

What did work

The difference seemed to be in what was actually imported. The middleware module defined the local storage, and to access it, the middleware module was imported and the local storage was then accessed via functions in that module. I did not really see the difference but figured it was worth a shot. I added a new module that defined the local storage and added getters and setters. To my surprise it actually worked. I have no idea why this method worked while the other failed. I am guessing, as I said, that it has something to do with how threading and import work, and with what is passed by reference and what is passed by value. One day I will have to dig deeper into this, but for now this will do.

Conclusion

The threading.local() object offers a thread-safe way to pass data between different parts of a Django application when normal parameter passing is not possible. For it to work properly you need to create a proxy module with a getter and setter (and a deleter) and then import that module into each module that needs access.
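A minimal sketch of such a proxy module, together with a router that reads from it, could look like the following. The names set_current_db, get_current_db, clear_current_db and DataRouter are hypothetical and not taken from the original project. The router methods follow Django's documented db_for_read(model, **hints) and db_for_write(model, **hints) signatures, and returning None tells Django the router has no opinion. In a real project the storage functions would live in their own module and the router would import that module.

```python
import threading

# Created exactly once, when this module is first imported.
_storage = threading.local()


def set_current_db(alias):
    """Remember which database alias the current thread should use."""
    _storage.db_alias = alias


def get_current_db(default=None):
    """Return the alias set on this thread, or `default` if none is set."""
    return getattr(_storage, "db_alias", default)


def clear_current_db():
    """Forget the alias for this thread (the 'deleter')."""
    if hasattr(_storage, "db_alias"):
        del _storage.db_alias


class DataRouter(object):
    """Hypothetical router that routes on the thread-local alias."""

    def db_for_read(self, model, **hints):
        return get_current_db()

    def db_for_write(self, model, **hints):
        return get_current_db()


# Usage, as the aggregator might do around a batch of Spanish data:
set_current_db("spain")
alias_while_set = DataRouter().db_for_write(None)
clear_current_db()
alias_after_clear = DataRouter().db_for_read(None)
```

Clearing the alias after each batch keeps a stale value from leaking into the next piece of work on the same thread.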

Week Lessons 3

Mercurial Bookmarks

If one bookmark is a direct descendant of another bookmark, it is not possible to use hg merge between the two, since Mercurial does not have a fast-forward option like git. The solution is simply to update. Do note that it seems impossible to update to a bookmark that is halfway through the tree. There is a great Stack Overflow question and answer about this.

Django-compress

The page gave a 500 error, and the logs showed a SuspiciousOperation error: access was denied to a file.

The problem was the CSS compressor, and it occurred when more than one file was compressed. The cause was a bad MEDIA_ROOT setting.

Mixed content

An HTTPS iframe inside an HTTP parent does not qualify as mixed content, but this is good to know (see the Mozilla blog post).

Check in js if in iFrame

window.top === window.self  // true when the page is NOT inside an iframe

Django timestamp field

See code

Including your django site in a script

Out of the box, Django comes with a command line tool that is pretty useful. Just reading the getting started guide will introduce you to it. One of the options it has is to open an interactive Python shell that will allow you to interact with your Django app. If a certain task is done regularly, there is also the possibility of writing your own management commands. It's actually quite easy to do and can be incredibly useful, as it can also be combined with Fabric to automate a lot of work.

That said, there are those rare occasions when the need arises to run an independent Python script that uses some part of the Django code. Since Django is just Python code, in a few simple steps you can be hacking away at your custom script. What is needed is to set the settings module and add the right directories to the Python path. Assuming your app is called my_app and that all the Django apps are in a folder called apps, the following code should do the trick:

import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), 'apps')))
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

And done. It's now possible to just import apps and other Django packages as in a normal Django app. What did this do? The first two lines are clear. The third line adds the parent folder to the Python path. This will allow the use of my_app as a package. The fourth line adds the apps folder to the Python path. This will allow imports with just the app names. Finally, in line 5 the Django settings module is set. Of course, if your settings file is elsewhere, this needs to change to where it is.
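The same steps can be wrapped in a small helper, sketched here under a few assumptions. The function name bootstrap and its parameters are hypothetical. The run_setup flag is an addition that is not part of the original snippet: on Django 1.7 and later, a standalone script also has to call django.setup() before it touches models, which older versions did not need.

```python
import os
import sys


def bootstrap(script_file, settings_module="settings", run_setup=False):
    """Put the project on the Python path and point Django at its settings.

    `script_file` is normally the calling script's own __file__.
    """
    here = os.path.dirname(os.path.abspath(script_file))
    paths = [
        os.path.abspath(os.path.join(here, "..")),  # parent folder: my_app as a package
        os.path.abspath(os.path.join(here, "apps")),  # apps folder: import apps by name
    ]
    for path in paths:
        if path not in sys.path:
            sys.path.append(path)

    os.environ["DJANGO_SETTINGS_MODULE"] = settings_module

    if run_setup:
        # Only needed (and only available) on Django 1.7 and later.
        import django
        django.setup()

    return paths


# Example call with a made-up script location; a real script would pass __file__.
added_paths = bootstrap(os.path.join("my_app", "my_script.py"))
```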

Testing 404 in django

Django’s debugging system is very useful. It helps you narrow down where things are going wrong. For instance, if you get a 404, the debugging template will show you all the URL matching rules. This is very useful for tracking down why an unwarranted 404 was encountered. It's somewhat less useful if you want to test your 404.html template. In this case you will need to turn off debugging.

One side effect of setting DEBUG to False is that Django’s dev server stops serving static files. I was, unfortunately, not aware of this and it took me some time to link the two together. There are two options to solve this:

  1. Use a web server on the development system.
  2. Run the dev server with --insecure.

Option 1 is probably the better option, especially if you use the same web server as in production, since that brings the development setup closer to production. The second option is much easier. It's up to you.

Installing PostgreSQL and Sql-Ledger for Django apps on Snow Leopard

At DOP we started working on a Django front end for Sql-Ledger, with the idea of combining it later on with our ticketing system. Getting it going on my MacBook proved to be more challenging than expected. Most of the problems were easily solvable, but finding the solution proved to be tricky. Therefore I have decided to create this simple guide, showing the steps I took to get everything going.

Step 1: PostgreSQL

You can download a binary installer from the PostgreSQL site. It will install PostgreSQL to your /Library folder. This should take care of the PostgreSQL part.

Step 2: Sql-Ledger

Ledger is written in Perl, and requires Perl to run. Luckily Snow Leopard comes standard with Perl. What it does not come standard with is the Perl binding for PostgreSQL. The following 3 commands will take care of that:

sudo perl -MCPAN -e "install +YAML"
sudo perl -MCPAN -e "install DBI"
sudo perl -MCPAN -e "install DBD::Pg"

YAML is not necessary but it can be helpful, and lacking YAML might give you problems with DBI. The other two commands install the Perl binding for PostgreSQL.

Download Sql-Ledger from here. As of this writing, the latest version is 2.8.31. Unzip and untar the archive and follow the instructions in the readme file. I did not manage to build using the automated script and had to do everything by hand. There is also this page on the Ledger site with somewhat oldish instructions for OS X. Once Ledger is tested and working, it's time to get the Python binding.

Step 3: Python binding

The Python binding is for PostgreSQL. You can get the binding library, called Psycopg2 (don't ask me why), from here. Download and unpack it. Installing is done by running:

python setup.py build
sudo python setup.py install

If you are not able to build, it might be a problem with your path. Python is trying to link to the PostgreSQL bin files. Try adding their location to your path and then build again.

And you're done. Enjoy (or not) working with Django-Sql-Ledger.

Adventures in Django

For a while my Django project was set aside for other things. Last week I got around to working on it again. I have made a lot of progress and learned quite a lot, so I figured it was time for an update post.

First and foremost, a beta version of the site is already online. I don't know if I wrote about what it actually does, so I'll start with that. It's a web application developed for something my friends and I were missing. As a group we go on weekends away quite often, mostly climbing, mountain biking, snowboarding or hiking. At the end of every such weekend someone always needs to sit down and calculate exactly how much each person paid, and so calculate who needs to pay back whom. I figured it was time to do away with that, and I made "tripcalc", which is meant to solve it. Everyone who paid for something just fills it in, and the web app does the math and sends emails to people to let them know who they need to pay, or who they are getting money from.

There were a lot of design decisions I made which I am not sure were right or not, but that is for another post. This one is about Django, jQuery and deployment. Django has really been a wonderful experience for me. There is rarely a day on which I don't go "Wow, that is just great. It can actually do that?!". There are a lot of wonderful things in that framework. So with that said, let me get started.

Django and static files

After a long break I have resumed my side project in django. Last night I came upon a problem. When testing a page outside django, the page was rendered correctly. When it was rendered through django the javascript files were not located.

At this point I have to admit I was a bit foolish and forgot to look at the web server's log to see if the files were correctly served. That cost me an hour and a half of trying to figure out why the JavaScript functions were not found. After smartening up, I found out that the JavaScript files themselves were not found. It seemed that Django kept looking for them in the wrong location. Using the example here I eventually got it all working.

It comes down to this: in the settings.py file there is a reference to the media URL and to the file system path to the media directory. Adding the following code to urls.py

if settings.DEBUG:
    urlpatterns += patterns('',
        (r'^tripcalc/media/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.MEDIA_ROOT}),
    )

will make sure that, while Django is in debug mode, the static files are served by Django. For production sites it is better to let a dedicated web server (such as Apache) serve these static files.

And that took care of it. I can now continue with the development. It also enabled me to load the CSS file, so the styling is shown as well. Hurray.

Installing Django on Dreamhost

Dreamhost appears on the Django site as one of the Django-friendly hosting services. Unfortunately, Dreamhost does not officially support Django. It does not have mod_python installed, so Django is instead deployed using FastCGI. Hopefully mod_python will be added sometime in the future. I have found a few good guides that explain how to set up Django on a Dreamhost account:

  • Jeff Croft has a good guide on his blog
  • Gordon Tillman also has a good informative page
  • The Dreamhost wiki also has a guide

Between the three of them you can probably find all the information needed for installing Django for use on a Dreamhost account. I will not repeat what they explain but will instead add some notes from my own experience.

Python

Dreamhost, at the moment of writing, is running Python 2.4. Luckily it is possible for you to install Python locally. I highly recommend it, as it will enable you to set up the Python environment exactly the way you want it, and it will make it easier to upgrade to future versions of Python.

cd ~/soft
wget http://www.python.org/ftp/python/2.5.2/Python-2.5.2.tgz
tar xvfz Python-2.5.2.tgz
cd Python-2.5.2
./configure --prefix ~/install/dir --enable-shared
make
make install

Here ~/install/dir is the directory you want Python installed in. I followed Dreamhost's Unix account setup guide and installed it under run. I recommend you do too, as it is easier to have full control over your /usr/local.

Adding setuptools also makes future installs easier:

cd ~/soft
wget http://peak.telecommunity.com/dist/ez_setup.py
~/path/to/yourpython/python ez_setup.py

This will add the easy_install script, which will simplify adding packages to your own Python install. The final step is adding the new MySQLdb package:

cd ~/soft
svn co https://mysql-python.svn.sourceforge.net/svnroot/mysql-python/trunk/MySQLdb MySQLdb
easy_install MySQLdb

This assumes easy_install is in your PATH. If it is not, it needs to be added. You're now ready to install Django using your very own Python installation. The guides cover that in enough detail, so I will move on to two problems I met after the installation.

Post Installation Problems

I met with two problems that frustrated me for a few hours. To save you, the future installer, some pain, here they are in case you experience something similar.

syncdb after admin activation: This should have been very obvious to me, but for some reason it escaped me. After enabling the admin page you must run django-admin.py syncdb in your project's home directory. What happened was that I ran it before I enabled the admin application. This led to the creation of the needed tables in the MySQL database, but no tables for the admin application. After enabling the admin application, more tables need to be created to accommodate the new application's data. The errors I got gave me the impression that there was an error in the MySQLdb egg, so I reinstalled it, then tried to find some workaround. Eventually I realized that I was just missing the admin tables.

.htaccess: This was the real mind bender. I kept getting an internal server error saying that it had reached the maximum allowed internal redirects. It was obviously a configuration error, so I compared my .htaccess with the ones in the guides and it looked the same. So I looked elsewhere, but I kept coming back to the conclusion that it had to be in the .htaccess. Yet no matter how often I looked at it, I couldn't find what was wrong. I need to point out that I am no expert when it comes to Apache, and maybe if I knew more about it this would have been simple, but I didn't, so I got some gray hairs before I realized I was missing a space between the - and the [L]. A bloody white space! I was feeling furious and incredibly stupid at the same time.

Django is now up and running, and I like it so far. It sure is a lot nicer to work with than JSP and servlets. Enjoy your Python.