16 November 2009 ~ 17 Comments

Using C, convert a dynamically-allocated int array to a comma-separated string as cleanly as possible

EDIT: There are no “dynamic arrays”, so to speak, in C. What I meant was “dynamically-allocated”. I’ve updated the wording to reflect this.

EDIT 2: Someone on Reddit pointed out that my Python example doesn’t actually work, since I have an array of ints rather than strings. I’ve updated the code example so it works.

I’m much less experienced in C than I am in higher-level languages. At Cisco, we use C, and I sometimes run into something that would be easy to do in Java or Python, but very difficult to do in C. Now is one of those times.

I have a dynamically-allocated array of unsigned integers which I need to convert to a comma-separated string for logging. While the integers are not likely to be very large, they could in principle be anywhere from 0 to 4,294,967,295 (the full range of a 32-bit unsigned integer). In Python, that’s one short line.

my_str = ','.join([str(num) for num in my_list])
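As a quick check of the one-liner, here is a minimal sketch using made-up sample values (the list contents are an assumption, chosen to cover both ends of the range mentioned above):

```python
# Hypothetical sample data covering both ends of the unsigned range.
my_list = [0, 42, 4294967295]

# str.join() accepts only strings, so each int is converted first.
my_str = ','.join(str(num) for num in my_list)
print(my_str)  # 0,42,4294967295
```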

How elegantly can people do this in C? I came up with a way, but it’s gross. If anyone knows a nice way to do it, please enlighten me.


05 May 2009 ~ 6 Comments

Non-painful email on Django development servers

I’ve been actively learning and using Django since August 2008, and I’ve loved almost every bit of it. There are plenty of places to read all about the virtues of Django, so I’ll leave that out for now.

One thing that’s always bugged me about web development in general is the sending of emails. I do development on my local computer (with a badly set up Apache / MySQL / PHP / Python / whatever else stack), and I’ve never felt like dealing with the headache of setting up a mail server. This means, when I add something that’s supposed to send an email (like an activation email after registration), I have to get very hacky to test and debug it (making sure the email text is being produced correctly, making sure it’s being sent to and from the right people, etc.).

This was one of the few web development pains that I thought Django didn’t solve. Whenever I’d test a bit of code that was supposed to send email, I’d get a “Connection refused” error page (meaning my computer has no mail server to send the email with). I would usually add in a bit of printf debugging to make sure the subject and body had the correct text, but beyond that, I’d usually wait to test the email portions until I uploaded to a server that could send email (usually the production server, unfortunately).

Yesterday, I bumped into a little section in the Django documentation that explains how to get around this. As usual, Python has all the solutions. First, set this code in your settings.py file:

EMAIL_HOST = 'localhost'
EMAIL_PORT = 1025 # replace this with some free port number on your machine

Then, assuming you’re on a Unix system (I’m on a Mac), run the following on the command line to start a “dumb” Python mailserver:

python -m smtpd -n -c DebuggingServer localhost:1025

Make sure to replace 1025 with whatever you filled in for EMAIL_PORT.

Now, try running the email-sending code in your Python application. Voila! No error pages (or at least, none related to email), and the full text of the email (headers and all) appears in whatever command line prompt you ran the dumb mailserver on. This allows you to see the senders, recipients, subject, and body of the email being sent out, all without getting hacky or sending to an email account you own.
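If you want to check the setup without going through Django at all, a rough sketch like the following should work. It uses Python 3's standard smtplib rather than Django's mail functions, and the addresses, subject, and function names are placeholders I made up for illustration:

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Build a small plain-text message (the addresses are placeholders)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Debugging server test"
    msg.set_content("If this shows up in the terminal, the setup works.")
    return msg


def send_to_debug_server(msg, host="localhost", port=1025):
    """Hand the message to the local debugging server (assumed to be running)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


message = build_test_message("from@example.com", "to@example.com")
# Uncomment once the debugging server is listening on the port above:
# send_to_debug_server(message)
```

The port here has to match whatever you put in EMAIL_PORT; if nothing is listening there, the send call fails with the same “Connection refused” error.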

Taking this a step further, I created a small bash script called “dumbmail” in /usr/local/bin that looks like the following:

#!/usr/bin/env bash
# Use the first argument as the port, defaulting to 1025 when none is given.
port="${1:-1025}"

echo "Starting dumb mail server on localhost:$port"
python -m smtpd -n -c DebuggingServer "localhost:$port"

Now, when I’m testing a Django application and I get to a section that is going to send an email, I just run “dumbmail” (or “dumbmail some_number” if I need to use a different port, for some reason I can’t imagine), and I’m ready to go.

Hope this helps people. The documentation was always there – I just never noticed that part until yesterday.


17 March 2009 ~ 3 Comments

Benchmarking Python decimal vs. float

I’m writing a web app that includes, among other things, a good amount of (rational) non-integer numbers. Whenever I’m in this situation, and I’m using a language that supports Decimals (as opposed to just floats and doubles), I always wonder which one I should use.

If you’re a programmer, you understand the difference and the dilemma. Floats/doubles are very fast, as all computers (built within the last 15 years) have hardware specifically made to deal with them. However, they’re not perfectly accurate; because of their binary representation, numbers that we use a lot (like 1/10, or 0.1) cause the same problem in base 2 that 1/3 (0.33333…) causes in base 10: they have no finite representation and must be rounded.

Decimals, on the other hand, are slow. They are handled entirely in software, and thus take hundreds of instructions to do things that would take fewer than 10 with floats/doubles. The upside is that they represent decimal values exactly; 0.1 is 0.1 is 0.1.
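To make the accuracy difference concrete, here is a small sketch (assuming Python 3) that compares the same sum done with floats and with Decimals:

```python
from decimal import Decimal

# Binary floats cannot store 0.1 exactly, so small errors show up in sums.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)   # False
print(repr(float_sum))    # 0.30000000000000004

# Decimals built from strings hold the exact decimal digits.
decimal_sum = Decimal('0.1') + Decimal('0.2')
print(decimal_sum == Decimal('0.3'))  # True
```

Note that Decimals are still rounded for values with no finite decimal expansion, such as 1/3; the result is cut off at the context precision, which is 28 significant digits by default.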

So the question becomes twofold:

  1. Do I really need my numbers to be perfectly accurate?
  2. How much slower are decimals than floats/doubles?

In my case, the accuracy would be nice, but not completely necessary. And thus, the latter question becomes important. I’m not writing a large application, and I don’t expect it to get too popular too quickly, so if the slowdown is only moderate, I’ll take the accuracy.

To learn what the slowdown was, I wrote two quick Python test programs:


# Decimal test
 
from decimal import Decimal
 
a = 0
for i in range(0, 20000):
    a = Decimal('%d.%d' % (i, i))
    print(a)


# Float test
 
from decimal import Decimal # kept this in on the float version
                            # so they'd have the same overhead
 
a = 0
for i in range(0, 20000):
    a = float('%d.%d' % (i, i))
    print(a)

When I ran each of these with /usr/bin/time (which I just learned about a couple weeks ago, and has replaced counting seconds on my fingers as my favorite benchmarking tool), the decimal version took an average of about 1.5 seconds (over 10 runs), while the float version took an average of 0.5. Just to make sure no overhead was getting in the way, I upped the limit to 40000 and ran it again. Decimal took 3.0 seconds, float 1.0. I can now confidently say that Python floats are about 3x the speed of Python decimals.
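One caveat with the tests above is that printing 20,000 lines is part of what gets timed. A possible alternative, sketched here with the standard timeit module, times only the construction step. The function names and the choice of five repeats are my own assumptions, and I make no claim about what ratio it will report on any given machine:

```python
import timeit
from decimal import Decimal


def build_decimals(n):
    """Construct n Decimals from strings, without printing them."""
    for i in range(n):
        Decimal('%d.%d' % (i, i))


def build_floats(n):
    """Construct n floats from the same strings, without printing them."""
    for i in range(n):
        float('%d.%d' % (i, i))


def best_time(func, n=20000, repeats=5):
    """Return the fastest of several runs, which is usually the least noisy."""
    return min(timeit.repeat(lambda: func(n), number=1, repeat=repeats))


decimal_seconds = best_time(build_decimals)
float_seconds = best_time(build_floats)
print('decimal: %.4f s' % decimal_seconds)
print('float:   %.4f s' % float_seconds)
```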

Or are they? While this tests the creation and printing of decimals and floats, it doesn’t test mathematical operations. So, I wrote two more tests. I’m going to be doing a lot of division on these numbers, and that’s typically the most expensive of the basic arithmetic operations to compute, so I made sure to do it in the tests (along with some integer subtraction to build the divisor).


# Decimal version
 
from decimal import Decimal
 
a = 0
for i in range(2, 20002):
    a = Decimal('%d.%d' % (i, i)) / Decimal('%d.%d' % (i - 1, i - 1))
    print(a)


# Float version
 
from decimal import Decimal
 
a = 0
for i in range(2, 20002):
    a = float('%d.%d' % (i, i)) / float('%d.%d' % (i - 1, i - 1))
    print(a)

This time, the float version averaged about 0.6 seconds (1.15 with 40,000 iterations instead of 20,000), while the decimal version averaged over 11 seconds (23 with 40,000 iterations instead of 20,000). So while Python float creation and printing are merely 3x as fast as Python decimal creation and printing, Python float division is almost 20x as fast as Python decimal division.
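The division tests above also include string formatting, parsing, and printing on every iteration. To look at division on its own, a sketch along these lines could be used; the operand values and the iteration count are arbitrary choices of mine, not part of the original tests:

```python
import timeit

ITERATIONS = 100000  # arbitrary; raise it for steadier numbers

# Operands are created once in the setup, so only "a / b" is timed.
decimal_div_seconds = timeit.timeit(
    "a / b",
    setup="from decimal import Decimal; "
          "a = Decimal('20001.20001'); b = Decimal('20000.20000')",
    number=ITERATIONS,
)
float_div_seconds = timeit.timeit(
    "a / b",
    setup="a = 20001.20001; b = 20000.2",
    number=ITERATIONS,
)

print('decimal division: %.4f s' % decimal_div_seconds)
print('float division:   %.4f s' % float_div_seconds)
```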

So what did I choose? Decimals. In the context of these tests, the decimal slowdown may seem significant, but if I finished my app using decimals and profiled it, I can almost guarantee (based on the speeds here) that the bottleneck would not be decimal division performance. If I were running an app that was handling hundreds of simultaneous requests, I might consider switching (I might also spring for better hardware, but that’s a different topic). However, for my purpose, 1/20th the speed of floats is more than fast enough.

P.S. As my very late discovery of /usr/bin/time should suggest, I’m extremely new to benchmarking. If anyone has any suggestions for me, or criticisms of my method, please leave your thoughts. This is something I’d like to get better at.


09 December 2008 ~ 0 Comments

Great Analogy for Entrenched Social Norms

I’m stealing this analogy from a blog post by core Django contributor James Bennett. I think it’s brilliant:

There’s an old joke, so old that I don’t even know for certain where it originated, that’s often used to explain why big corporations do things the way they do. It involves some monkeys, a cage, a banana and a fire hose.

You build a nice big room-sized cage, and in one end of it you put five monkeys. In the other end you put the banana. Then you stand by with the fire hose. Sooner or later one of the monkeys is going to go after the banana, and when it does you turn on the fire hose and spray the other monkeys with it. Replace the banana if needed, then repeat the process. Monkeys are pretty smart, so they’ll figure this out pretty quickly: “If anybody goes for the banana, the rest of us get the hose.” Soon they’ll attack any member of their group who tries to go to the banana.

Once this happens, you take one monkey out of the cage and bring in a new one. The new monkey will come in, try to make friends, then probably go for the banana. And the other monkeys, knowing what this means, will attack him to stop you from using the hose on them. Eventually the new monkey will get the message, and will even start joining in on the attack if somebody else goes for the banana. Once this happens, take another of the original monkeys out of the cage and bring in another new monkey.

After repeating this a few times, there will come a moment when none of the monkeys in the cage have ever been sprayed by the fire hose; in fact, they’ll never even have seen the hose. But they’ll attack any monkey who goes to get the banana. If the monkeys could speak English, and if you could ask them why they attack anyone who goes for the banana, their answer would almost certainly be: “Well, I don’t really know, but that’s how we’ve always done things around here.”
