I needed to collect the output from ffmpeg for some profiling. It proved more challenging than I anticipated because ffmpeg writes its progress data to stderr unflushed, which makes it unreadable with ordinary blocking stdio reads. To get the data, the stderr file descriptor has to be set to O_NONBLOCK using fcntl. Here is the resulting Python code.
import fcntl
import os
import re
import select
import shlex
import subprocess
import time

def encode(filename, callback=None):
    cmd = 'ffmpeg -i "%s" -acodec libfaac -ab 128kb ' + \
          '-vcodec mpeg4 -b 1200kb -mbd 2 -flags +4mv ' + \
          '-trellis 2 -cmp 2 -subcmp 2 -s 320x180 "%s.mp4"'
    pipe = subprocess.Popen(
        shlex.split(cmd % (filename, os.path.splitext(filename)[0])),
        stderr=subprocess.PIPE,
        close_fds=True
    )
    # Switch ffmpeg's stderr to non-blocking so read() returns what is available
    fcntl.fcntl(
        pipe.stderr.fileno(),
        fcntl.F_SETFL,
        fcntl.fcntl(pipe.stderr.fileno(), fcntl.F_GETFL) | os.O_NONBLOCK,
    )
    # frame= 29 fps= 0 q=2.6 size= 114kB time=0.79 bitrate=1181.0kbits/s
    reo = re.compile(r"""\S+\s+(?P<frame>\d+) # frame
        \s\S+\s+(?P<fps>\d+) # fps
        \sq=(?P<q>\S+) # q
        \s\S+\s+(?P<size>\S+) # size
        \stime=(?P<time>\S+) # time
        \sbitrate=(?P<bitrate>[\d\.]+) # bitrate
        """, re.X)
    while True:
        # Wait until stderr has data (or has been closed)
        readx = select.select([pipe.stderr.fileno()], [], [])[0]
        if readx:
            chunk = pipe.stderr.read()
            if chunk == '':
                break  # EOF: ffmpeg has exited
            m = reo.match(chunk)
            if m and callback:
                callback(m.groupdict())
        time.sleep(.1)
The complete script is located here.
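To try the non-blocking read without having ffmpeg installed, here is a minimal Python 3 sketch of the same idea. It is not the script from this post: the read_stderr helper, the timeout and the FAKE_FFMPEG child process (a Python one-liner that writes a single progress line to stderr) are all stand-ins I made up for illustration.

```python
import fcntl
import os
import re
import select
import subprocess
import sys

# Same fields as the sample progress line:
# frame= 29 fps= 0 q=2.6 size= 114kB time=0.79 bitrate=1181.0kbits/s
PROGRESS = re.compile(r"""\S+\s+(?P<frame>\d+)  # frame
    \s\S+\s+(?P<fps>\d+)                         # fps
    \sq=(?P<q>\S+)                               # q
    \s\S+\s+(?P<size>\S+)                        # size
    \stime=(?P<time>\S+)                         # time
    \sbitrate=(?P<bitrate>[\d\.]+)               # bitrate
    """, re.X)


def read_stderr(argv, timeout=5.0):
    """Run argv and return its stderr as text, read without blocking."""
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE)
    fd = proc.stderr.fileno()
    # Add O_NONBLOCK to whatever flags the descriptor already has.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            proc.kill()  # nothing arrived within the timeout; give up
            break
        try:
            chunk = os.read(fd, 4096)
        except BlockingIOError:
            continue  # spurious wakeup, no data yet
        if not chunk:
            break  # EOF: the child closed stderr
        chunks.append(chunk)
    proc.stderr.close()
    proc.wait()
    return b"".join(chunks).decode("utf-8", "replace")


# Hypothetical stand-in for ffmpeg: writes one progress line ending in \r.
FAKE_FFMPEG = [
    sys.executable, "-c",
    "import sys; sys.stderr.write("
    "'frame= 29 fps= 0 q=2.6 size= 114kB time=0.79 bitrate=1181.0kbits/s\\r')",
]

text = read_stderr(FAKE_FFMPEG)
match = PROGRESS.match(text)
progress = match.groupdict() if match else None
```

Reading with os.read on the raw descriptor sidesteps any buffering in the file object, which seemed the simplest choice for a sketch; a real ffmpeg run would emit many such lines separated by carriage returns, so each chunk may need splitting before matching.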
I have been really loving Python lately, but after reading this post I thought it would be a good idea to check out Erlang. I have heard its concurrency and network support are out of this world, and in a knowledge-based industry extra knowledge never hurts. This simple code snippet shows how much Erlang differs from traditional procedural languages.
average(X) -> sum(X) / len(X).

sum([H|T]) -> H + sum(T);
sum([]) -> 0.

len([_|T]) -> 1 + len(T);
len([]) -> 0.
average takes a list X and calls sum and len on it. Both of those are recursive functions that split the list into the first element H and the remainder T. Variables must start with a capital letter, and the ‘_’ denotes that the value is not used. Notice that no temporary variables were used in this example. Talk about putting the “f” in functional. I can’t wait to get to the concurrent stuff.
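For comparison, here is roughly how the same head-and-tail recursion might look in Python. It is only a sketch: the names total and length are my own (picked to avoid shadowing the built-in sum and len), and Python has no pattern matching on function clauses, so the empty-list case becomes an explicit if.

```python
def total(items):
    """Sum a list recursively: head plus the total of the tail."""
    if not items:          # corresponds to sum([]) -> 0.
        return 0
    head, tail = items[0], items[1:]
    return head + total(tail)


def length(items):
    """Count elements recursively; the head's value is never looked at."""
    if not items:          # corresponds to len([]) -> 0.
        return 0
    return 1 + length(items[1:])


def average(items):
    # True division under Python 3, like Erlang's / operator.
    return total(items) / length(items)
```

Calling average([1, 2, 3, 4]) under Python 3 gives 2.5. Unlike the Erlang version, each items[1:] copies the rest of the list, so this is for illustration rather than for long lists.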
Here is a Django authentication backend I wrote using Facebook’s amazingly simple Graph API. It logs the user in using their Facebook credentials so your site doesn’t have to worry about creating user profiles, validating, etc. See
http://developers.facebook.com/docs/authentication/
http://developers.facebook.com/docs/authentication/permissions
http://developers.facebook.com/docs/api
http://github.com/facebook/python-sdk/blob/master/examples/oauth/facebookoauth.py
Define the facebook tokens in settings.py and replace
I just wanted a simple wrapper around syslog. The Python logging module is good, but it was too heavyweight for what I needed. Here is a simple logging class for syslog. It has an optional decorator to provide the function name to syslog, which I find useful for debugging.
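The class itself is not reproduced on this page, so what follows is only a rough sketch of what such a wrapper could look like, not the original code. The names SimpleSyslog, log, named and the emit parameter are all hypothetical; the only real API used is syslog.syslog(priority, message) and the LOG_* constants from the standard library.

```python
import functools
import syslog


class SimpleSyslog(object):
    """Thin wrapper around syslog.syslog (all names here are hypothetical)."""

    def __init__(self, facility=syslog.LOG_LOCAL2, emit=syslog.syslog):
        self.facility = facility
        self.emit = emit     # callable taking (priority, message)
        self._names = []     # stack of decorated function names

    def log(self, message, priority=syslog.LOG_INFO):
        # Prefix the message with the innermost decorated function, if any.
        if self._names:
            message = '%s: %s' % (self._names[-1], message)
        self.emit(self.facility | priority, message)

    def named(self, func):
        """Decorator: messages logged while func runs carry its name."""
        @functools.wraps(func)
        def caller(*args, **kwargs):
            self._names.append(func.__name__)
            try:
                return func(*args, **kwargs)
            finally:
                self._names.pop()
        return caller


# Demo with a list standing in for syslog so the result can be inspected;
# drop the emit argument to send messages to the real syslog instead.
captured = []
log = SimpleSyslog(emit=lambda priority, message: captured.append((priority, message)))


@log.named
def resize(width):
    log.log('width=%d' % width)
    return width // 2


result = resize(640)
```

Keeping the names on a stack means nested decorated functions each get their own prefix, at the cost of not being thread-safe, which seemed an acceptable trade for a debugging aid.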
When profiling it can be useful to log the amount of time that is spent in a function. With Python that is super easy to do with decorators.
#!/usr/bin/python
import time
import syslog

def logtime(func):
    def caller(*args, **kwargs):
        stime = time.time()
        ret = func(*args, **kwargs)
        syslog.syslog(
            syslog.LOG_LOCAL2 | syslog.LOG_INFO,
            '%s=%s\n' % (func.__name__, time.time() - stime))
        return ret
    return caller

@logtime
def test_func(arg1, arg2=None):
    print arg1, arg2
    time.sleep(1)

if __name__ == '__main__':
    test_func(1, 2)
logtime will log the time spent in the function to syslog.
Jul 14 15:05:01 olomai python: test_func=1.00114893913
The other day I wrote about how to do an IN and GROUP BY query using Java’s de facto ORM, Hibernate. I thought it would be interesting to see how other ORMs handled the same query. This is the query I want to generate:
SELECT COUNT(*),state FROM download_request WHERE id IN (<id list>) GROUP BY state;
Below is the code, output and SQL generated for the three ORMs.
Hibernate
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.hibernate.Criteria;
import org.hibernate.Session;
import org.hibernate.criterion.ProjectionList;
import org.hibernate.criterion.Projections;
import org.hibernate.criterion.Restrictions;

class HibernateDAO implements ApplicationDAO {
    public Map<String, Integer> getStateCounts(final Collection<Integer> ids) {
        HibernateSession hibernateSession = new HibernateSession();
        Session session = hibernateSession.getSession();
        // WHERE id IN (...)
        Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
            .add(Restrictions.in("id", ids));
        // GROUP BY state, COUNT(*)
        ProjectionList projectionList = Projections.projectionList();
        projectionList.add(Projections.groupProperty("state"));
        projectionList.add(Projections.rowCount());
        criteria.setProjection(projectionList);
        // Each row is {state, count}
        List<Object[]> results = criteria.list();
        Map<String, Integer> stateMap = new HashMap<String, Integer>();
        for (Object[] obj : results) {
            DownloadState downloadState = (DownloadState) obj[0];
            stateMap.put(downloadState.getDescription().toLowerCase(), (Integer) obj[1]);
        }
        hibernateSession.closeSession();
        return stateMap;
    }

    public static void main(String args[]) {
        HibernateDAO downloadRequestDAO = new HibernateDAO();
        Collection<Integer> ids = new ArrayList<Integer>();
        for (int i = 1000; i < 1010; i++)
            ids.add(i);
        Map<String, Integer> stateCounts = downloadRequestDAO.getStateCounts(ids);
        for (String state : stateCounts.keySet()) {
            System.out.println(state + ": " + stateCounts.get(state));
        }
    }
}
Output
failed: 5
downloaded: 1
completed: 4
SQL
select this_.state as y0_, count(*) as y1_ from download_request this_ where this_.id in (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) group by this_.state
Django
from django.db.models import Count

counts = models.DownloadRequest.objects.filter(
    id__in=range(1000, 1010),
).values('state').annotate(Count('state'))
for count in counts:
    print count
Output
{'state': u'FAILED', 'state__count': 5}
{'state': u'COMPLETED', 'state__count': 4}
{'state': u'DOWNLOADED', 'state__count': 1}
SQL
SELECT `download_request`.`state`, COUNT(`download_request`.`state`) AS `state__count` FROM `download_request` WHERE `download_request`.`id` IN (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) GROUP BY `download_request`.`state` ORDER BY NULL
SQLAlchemy
from sqlalchemy import func

query = session.query(
    func.count(DownloadRequest.state), DownloadRequest.state,
).filter(
    DownloadRequest.id.in_(range(1000, 1010)),
).group_by(DownloadRequest.state)
for count in query.all():
    print count
Output
(4L, 'COMPLETED')
(1L, 'DOWNLOADED')
(5L, 'FAILED')
SQL
SELECT count(download_request.state) AS count_1, download_request.state AS download_request_state FROM download_request WHERE download_request.id IN (1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009) GROUP BY download_request.state
As you can see, SQLAlchemy is the most similar to SQL, Django’s is the briefest and Hibernate (obviously) is the most Java-like. Of the three I’d say I like SQLAlchemy the best: coming from an SQL background, I find it the most natural. However, all three get the job done and it is always great to have options.
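If you want to poke at the target query without setting up any of the three ORMs, a sketch using only Python's built-in sqlite3 module is enough. The in-memory table and its ten rows are made up to mirror the counts in the outputs above, and state_counts is a hypothetical helper, not part of the original project.

```python
import sqlite3


def state_counts(conn, ids):
    """Run the IN + GROUP BY query and return {state: count}."""
    ids = list(ids)
    # One "?" placeholder per id keeps the values out of the SQL string.
    placeholders = ",".join("?" for _ in ids)
    sql = ("SELECT COUNT(*), state FROM download_request "
           "WHERE id IN (%s) GROUP BY state" % placeholders)
    return dict((state, count) for count, state in conn.execute(sql, ids))


# Hypothetical data: ids 1000..1009 with 5 FAILED, 4 COMPLETED, 1 DOWNLOADED.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE download_request (id INTEGER PRIMARY KEY, state TEXT)")
states = ["FAILED"] * 5 + ["COMPLETED"] * 4 + ["DOWNLOADED"]
conn.executemany(
    "INSERT INTO download_request (id, state) VALUES (?, ?)",
    zip(range(1000, 1010), states),
)

counts = state_counts(conn, range(1000, 1010))
```

The placeholder list has to be built by hand here because DB-API parameters cannot expand a whole list, which is precisely the chore that Restrictions.in, id__in and in_() take care of.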
I needed to make the following SQL query with Hibernate
SELECT COUNT(*),state FROM download_request WHERE id IN (<id list>) GROUP BY state;
and, being new to Hibernate, it came out a lot differently than I thought it would. To perform the IN query, a Criteria query needs to be created:
Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
    .add(Restrictions.in("id", ids));
For the count and GROUP BY, a Projection needs to be added to the criteria:
ProjectionList projectionList = Projections.projectionList();
projectionList.add(Projections.groupProperty("state"));
projectionList.add(Projections.rowCount());
criteria.setProjection(projectionList);
This is the resulting code
public Map getStateCounts(final Collection ids) {
    HibernateSession hibernateSession = new HibernateSession();
    Session session = hibernateSession.getSession();
    Criteria criteria = session.createCriteria(DownloadRequestEntity.class)
        .add(Restrictions.in("id", ids));
    ProjectionList projectionList = Projections.projectionList();
    projectionList.add(Projections.groupProperty("state"));
    projectionList.add(Projections.rowCount());
    criteria.setProjection(projectionList);
    List
Something completely different from what I expected. That’s what I love about solving problems: sometimes the solution is something you might never expect.
This is an init script to run spawn-fcgi and PHP on Ubuntu. It’s adapted from Aaron Schaefer’s excellent post on how to run WordPress on nginx, the configuration this site runs on.
To install it, follow the instructions below:
git clone git://gist.github.com/510245.git fastcgi-php.gist
vim fastcgi-php.gist/fastcgi-php # Update with your pathnames
chmod 755 fastcgi-php.gist/fastcgi-php
mv fastcgi-php.gist/fastcgi-php /etc/init.d/
update-rc.d fastcgi-php defaults
/etc/init.d/fastcgi-php start
rm -rf fastcgi-php.gist
