Monday, August 24, 2009

Hey, everybody! Let's do a quick comparison of SimPy and PDQ for a single queue!

On the side I have slowly been looking into SimPy for queue simulation and I finally got around to doing a comparison of a single queue written up with SimPy and with PDQ.

The scenario is very simple. A single queue with a service time of 0.5 seconds with one visit for a service demand total of 0.5 seconds.

The PDQ solution is very simple to write up and only took a few minutes:




library("pdq");

lambda <- seq(1, 20, 1);
lambda <- lambda / 10;

for (rate in lambda) {
Init("Single Queue");

CreateOpen("WorkLoad", rate);

CreateNode("MyQueue", CEN, FCFS);

SetDemand("MyQueue", "WorkLoad" , 0.5);

SetWUnit("Trans");
SetTUnit("Second");

Solve(CANON);
#Report();
print(sprintf("%f, %f, %f",
rate,
GetResidenceTime("MyQueue","WorkLoad", TRANS),
GetUtilization("MyQueue", "WorkLoad", TRANS)));
};


The SimPy solution took a bit longer for me to write up and to make sure I was doing things correctly. I thought it was neat that I was able to setup a lambda into the queue that was greater than the queue was capable of processing, in this case, the lambda max is 2 requests per second as is given by the reciprocal of 0.5 seconds, 1/0.5 == 2.

PDQ gives a response time of "Inf" for 2.0 requests per second where as the SimPy solution will crunch away, but never get above 2 requests per second and will result with unbounded queueing as we'd expect.

I still need to work on my SimPy solution to add in more metrics collection, but it's a start. After I wrap my brain around metrics collection I'm going to write a solution of one of the previous problems that I worked with related to Jackson networks and the three tier E-biz solution.

One of the interesting things about SimPy is that I had to create a large number of worker "threads" to hit the queue over a long period of time to get to the correct lambda that I had specified. If I only created one worker "thread" there was no queueing taking place and wait time was always zero, as you'd normally expect. I played around a lot with the number of worker "threads" and found that for my open network that between 10,000 to 100,000 workers worked out just fine and that 10,000 seconds was an adequate time frame for the solution.

One of the first things I had to handle with the large number of worker "threads" was to control the inter arrival rate so that the requests per second of all the workers would be the correct value. After that, it was just playing around with the code to find a good number of simulation time and workers.




#!/usr/bin/env python

from SimPy.Simulation import *
from random import Random, expovariate, uniform
import time

class G:
Rnd = random.Random(12345)
MyQueue = Resource(1)
QMon = Monitor()
ServiceTime = 0.50

class WorkLoad(Process):

TotalCalls = 0L
TotalResidence = 0L
TotalWait = 0L
TotalService = 0L

def __init__(self):
Process.__init__(self)
self.StartUpTime = 0.0

def Run(self, MaxWorkLoad, Lambda):
InterArrivalRate = (1.0/Lambda)
while 1:
WaitTime = G.Rnd.expovariate(1.0/(InterArrivalRate*MaxWorkLoad))
yield hold, self, WaitTime
self.StartUpTime = now()
yield request, self, G.MyQueue
WorkLoad.TotalWait += now() - self.StartUpTime # W
yield hold, self, G.ServiceTime # S
G.QMon.observe(len(G.MyQueue.waitQ));
WorkLoad.TotalResidence += now() - self.StartUpTime # R = W + S
WorkLoad.TotalCalls += 1
yield release, self, G.MyQueue

def main():

MaxWorkLoad = 1000
Lambda = float(sys.argv[1])
MaxSimTime = 1000.00

WorkLoad.TotalCalls = 0L
WorkLoad.TotalResidence = 0L
WorkLoad.TotalWait = 0L
WorkLoad.TotalService = 0L

initialize()

print "MaxWorkLoad = ", MaxWorkLoad
print "MaxSimTime = ", MaxSimTime

for I in range(MaxWorkLoad):
W = WorkLoad()
activate(W, W.Run(MaxWorkLoad, Lambda))

TotalWorkLoad = I + 1

simulate(until=MaxSimTime)

print "Ideal Throughput : ", Lambda
print "Simulated Seconds : ", MaxSimTime
print "TotalWorkLoads : ", TotalWorkLoad
print "Number of calls : ", WorkLoad.TotalCalls
print "Throughput : ", WorkLoad.TotalCalls/MaxSimTime
print "Total Wait Time : ", WorkLoad.TotalWait
print "Total Residence Time : ", WorkLoad.TotalResidence
print "Mean Residence Time : ", WorkLoad.TotalResidence/WorkLoad.TotalCalls
print "Mean Wait Time : ", WorkLoad.TotalWait/WorkLoad.TotalCalls
print "Mean Service Time : ", (WorkLoad.TotalResidence/WorkLoad.TotalCalls) -
(WorkLoad.TotalWait/WorkLoad.TotalCalls)
print "Total Utilization : ", ((G.ServiceTime * WorkLoad.TotalCalls) / MaxSimTime) * 100 , " %"
print "Mean waitQ : ", G.QMon.mean()

if __name__ == '__main__': main()


I still have a lot of learning to do with SimPy, that's for sure. It's been almost a decade since I've messed with Python and I am really rusty. Time to take care of that.

Here is the comparison of metrics for utilization and response time for the single queue with both solutions in handy, dandy line graph form:

Utilization:



No surprises here as both lines are almost the same.

Response Time:



This is the big surprise for me. As utilization increases the response times between SimPy and PDQ start to diverge. At low utilizations there almost the same but look at around 30% utilization (0.6 requests per sec) and the divergence starts. I knew that there would be some differences between SimPy and PDQ but I didn't expect this much of a difference. I need to go over the SimPy solution and make sure I didn't fat finger anything as this is my first SimPy solution for a queue.

Queue Waiting:



Like the Response Time graph, we show that PDQ is gonkulating more queuing as a result of higher expected response times than the SimPy model.

While PDQ will respond with "Inf" for 2.0 requests per second and error out with anything greater than 2 req/sec the SimPy solution response time is asymptotic to 2 requests per second. I can push the SimPy solution is 1.9999 requests per second but never hit exactly 2.

Here is an example of the SimPy solution with 2.1 requests per second:




auswipe@auswipe-desktop:~/Desktop/pbe/simpy$ ./q.py 2.1
MaxWorkLoad = 10000
MaxSimTime = 10000.0
Ideal Throughput : 2.1
Simulated Seconds : 10000.0
TotalWorkLoads : 10000
Number of calls : 19976
Throughput : 1.9976
Total Wait Time : 1996958.71394
Total Residence Time : 2006728.44111
Mean Residence Time : 100.45697042
Mean Wait Time : 99.9678971738
Mean Service Time : 0.489073246233
Total Utilization : 99.88 %
Mean WaitQ : 18.6269640142


I've still got a long way to go with SimPy but I thought this was a decent first attempt with a simple problem.

No comments: