Derrick Petzold

Open source enthusiast and developer


Comparison of IN, GROUP BY and COUNT using Hibernate, Django and SQLAlchemy

Views: 3,221 Comments: 0

Posted July 7, 2010 by derrick


The other day I wrote about how to do a IN and GROUP BY query using Java's de facto ORM, Hibernate. I thought it would be interesting to see how other ORMs handled the same query. This is the query I want to generate:
SELECT COUNT(*),state FROM download_request WHERE id IN (<id list>) GROUP BY state;
Below is the code, output and SQL generated for the three ORMs.

Hibernate

class HibernateDAO implements ApplicationDAO {
public Map getStateCounts(final Collection ids) {
  HibernateSession hibernateSession = new HibernateSession();
  Session session = hibernateSession.getSession();
  Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
	.add(Restrictions.in("id", ids));
  ProjectionList projectionList = Projections.projectionList();
  projectionList.add(Projections.groupProperty("state"));
  projectionList.add(Projections.rowCount());
  criteria.setProjection(projectionList);
  List results = criteria.list();
  Map stateMap = new HashMap();
  for(Object[] obj: results) {
      	DownloadState downloadState = (DownloadState)obj[0];
       	stateMap.put(downloadState.getDescription().toLowerCase(), (Integer)obj[1]);
  }
  hibernateSession.closeSession();
  return stateMap;
}
public static void main(String args[]) {
    HibernateDAO downloadRequestDAO = new HibernateDAO();	
    Collection ids = new ArrayList();
    for (int i = 1000;  i < 1010; i++ )
        ids.add(i);
    Map stateCounts =  downloadRequestDAO.getStateCounts(ids);
    for (String state: stateCounts.keySet()) {
        System.out.println(state + ": " + stateCounts.get(state));
    }
}
}

Output

failed: 5
downloaded: 1
completed: 4

SQL

select this_.state as y0_, count(*) as y1_ from download_request this_ 
where this_.id in (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) 
group by this_.state

Django

counts = models.DownloadRequest.objects.filter(
    id__in=range(1000, 1010),
).values('state').annotate(Count('state'))
for count in counts:
    print count

Output

{'state': u'FAILED', 'state__count': 5}
{'state': u'COMPLETED', 'state__count': 4}
{'state': u'DOWNLOADED', 'state__count': 1}

SQL

SELECT `download_request`.`state`, COUNT(`download_request`.`state`) 
AS `state__count` FROM `download_request` 
WHERE `download_request`.`id` IN (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) 
GROUP BY `download_request`.`state` ORDER BY NULL

SQLAlchmey

query = session.query(
    func.count(DownloadRequest.state), DownloadRequest.state,
).filter(
    DownloadRequest.id.in_(range(1000,1010)),
).group_by(DownloadRequest.state)
for count in query.all():
    print count

Output

(4L, 'COMPLETED')
(1L, 'DOWNLOADED')
(5L, 'FAILED')

SQL

SELECT count(download_request.state) AS count_1, download_request.state 
AS download_request_state FROM download_request 
WHERE download_request.id IN (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) 
GROUP BY download_request.state
As you can see SQLAlchemy is the most similar to SQL, django's is the briefest and Hibernate (obviously) is the most Java-like. Of the three I'd say I like SQLAlchemy the best as it is the most similar to SQL and me being from an SQL background it is the most natural. However all three get the job done and it is always great to have options.
Download the source.

IN, GROUP BY and COUNT with Hibernate

Views: 15,809 Comments: 0
Tags:

Posted July 2, 2010 by derrick


I needed to make the following SQL query with Hibernate
SELECT COUNT(*), state FROM download_request WHERE id IN (<id list>) GROUP BY state;
and being new to Hibernate it came out a lot differently than how I thought it would. To perform the IN query a Criteria query needs to be created
Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
    .add(Restrictions.in("id", ids))
For the count and order by a Projection needs to be added to the criteria
ProjectionList projectionList = Projections.projectionList();
projectionList.add(Projections.groupProperty("state"));
projectionList.add(Projections.rowCount());
criteria.setProjection(projectionList);
This is the resulting code
public Map getStateCounts(final Collection ids) {
  HibernateSession hibernateSession = new HibernateSession();
  Session session = hibernateSession.getSession();
  Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
	.add(Restrictions.in("id", ids));
  ProjectionList projectionList = Projections.projectionList();
  projectionList.add(Projections.groupProperty("state"));
  projectionList.add(Projections.rowCount());
  criteria.setProjection(projectionList);
  List results = criteria.list();
  Map stateMap = new HashMap();
  for(Object[] obj: results) {
      	DownloadState downloadState = (DownloadState)obj[0];
       	stateMap.put(downloadState.getDescription().toLowerCase(), (Integer)obj[1]);
  }
  hibernateSession.closeSession();
  return stateMap;
}
Something completely different from what I expected. That's what I love about solving problems sometimes the solution is something you might never expect.
Download the source.

derrickpetzold.com

My home on the web. Uses the Django web framework, uwsgi as the WSGI server, nginx as the media server and load balancer, pygments for the syntax highlighting. Author and designer.

crowdtube.tv

crowdtube.tv
CrowdTube.tv streams trending videos to your browser like a never ending tv show. This was Cory Shaw's awesome idea. I was responsible for the backend development.

dmusic.bz

dMusic.bz is Pandora clone written using Django and JavaScript. It uses last.fm to find the music relationships. Author and designer.

ilovephotos.com

ilovephotos.com
ilovephotos.com is a photo sharing and tagging website. Facial detection was run on the photos so the bounding boxes were already drawn around the faces eliminating a step in a tedious process. I authored views and the process to run facial detection on the photos and upload them to S3.

kindfish.com

kindfish.com
Kindfish.com was BlueLava's first photo site. It was event based and slideshows would be created from the albums. I authored the views and process to generate the slideshows from the photos on EC2.

Below is a snippet of my resume. Click here to view the full version in pdf, its proper format.

captcha