Thursday, January 28, 2010

Using R to Plot a Universal Scalability Law Curve Automagically

The past two days I was benchmarking a system that might be used to stub out back-end systems, and I was very interested in generating a USL (Universal Scalability Law) curve from the benchmark data I was producing.

In previous posts I had shown how to use R to generate the kappa and sigma coefficients for use with the USL equation.

I start with the data I've compiled on the system: the number of concurrent threads and the measured throughput:



p,x
1,1
3,2.998108449
5,4.995428752
7,6.928278689
10,7.539249685
13,11.50488651
15,11.79476671
20,15.65936318
30,13.61349306
40,16.50567465
50,14.83716898


The number of threads is represented by "p" and the throughput is represented by "x".
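For reference, the model being fitted is Gunther's USL. The throughput is first normalized to relative capacity, C(p) = X(p)/X(1), and the curve is

    C(p) = p / (1 + sigma*(p-1) + kappa*p*(p-1))

where sigma is the contention coefficient and kappa is the coherency (crosstalk) coefficient.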

I wrote the following R function to automagically plot the data points from the above file, draw the fitted USL curve through them, and highlight the theoretical maximum point of the USL:



# Example usage:
#   plotUSL("c:/benchmark/benchmark.csv", "Benchmark data with USL curve")
# CSV file must have two columns with a header of "p,x"
# Example:
#   p,x
#   1,1
#   2,1.5
#   3,2

plotUSL <- function(dataFile, graphTitle) {
  uslData <- read.csv(dataFile, header=TRUE)

  # Normalize throughput to relative capacity: C = X(p) / X(1)
  uslData$c <- uslData$x / uslData$x[1]

  # Fit the USL coefficients with constrained nonlinear least squares
  usl <- nls(c ~ p/(1 + sigma*(p-1) + kappa*p*(p-1)),
             uslData,
             algorithm="port",
             start=c(sigma=0.0, kappa=0.0),
             lower=c(0, 0))
  sigma <- coef(usl)["sigma"]
  kappa <- coef(usl)["kappa"]

  # Project the fitted curve well past the measured data points
  p <- 1:round(1.75 * max(uslData$p))
  Relative_Capacity <- p/(1 + sigma*(p-1) + kappa*p*(p-1))

  plot(p, Relative_Capacity, type="l",
       ylim=c(0, round(max(Relative_Capacity, uslData$c) * 1.1)))
  points(uslData$p, uslData$c, pch=20)

  # Highlight the theoretical maximum of the fitted curve
  maxIndex <- which.max(Relative_Capacity)
  points(p[maxIndex], Relative_Capacity[maxIndex], col=2, pch=13, cex=2.0)

  title(main=graphTitle, col.main="black", font.main=4, cex.main=1.5)
  title(sub=sprintf("USL max at p = %d with Relative_Capacity = %f",
                    p[maxIndex], Relative_Capacity[maxIndex]),
        col.sub="black", font.sub=4, cex.sub=1.2)
}



And voilà, we get quick and dirty results after invoking the function!

> plotUSL("g:/usl/benchmark.csv", "Benchmark data with USL curve")



This function could easily be modified to accept a third parameter and save off the image to a JPG, PNG or even PDF. But I leave that as an exercise for the reader.
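For instance, here is a minimal sketch of one way to do it. The wrapper name plotUSLToFile is hypothetical, and it picks the graphics device from the output file's extension:

# Hypothetical wrapper: render the USL plot to a file, selecting the
# graphics device (png, jpg, or pdf) from the file extension.
plotUSLToFile <- function(dataFile, graphTitle, outFile) {
  ext <- tolower(tools::file_ext(outFile))
  switch(ext,
         png = png(outFile, width=800, height=600),
         jpg = jpeg(outFile, width=800, height=600),
         pdf = pdf(outFile),
         stop("unsupported extension: ", ext))
  plotUSL(dataFile, graphTitle)
  dev.off()
}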

Sunday, December 20, 2009

11001001

What a strange trip it has been.

Two weeks ago I got a call from an IT Mercenary outfit: my former employer wanted to bring me in for a face-to-face. This isn't the first time that my former employer and I have sniffed each other out, but it is the first time that something came of it.

It appears that on 28 Dec I will be going back to my former employer doing the exact same gig (for now) as a contractor via the above-mentioned IT Mercenary firm. It's not a bad rate and it is W2 with a benefits option. The only loss is vacation/sick time. However, if I work 60 hours in a week I get paid for 60 hours. I'll have to wait and see how that goes.

So, I've been away for roughly 19 months. What have I really done in that time?

Agile Project Management

Does Agile work? Yes and no, from what I've seen first hand. It can work if management allows it to work. I've also seen Agile used as an abstraction for sales. I worked for a firm that shouted Agile from the rooftops but did not allow its own projects to run in an agile fashion. Agile was squashed in favor of tyrannical project management with tacit upper management approval, developers' opinions and experience be damned.

Alternative Load Testing Frameworks

I got a lot of good work in with Microsoft's load testing tool. Microsoft really needs a better naming scheme for its testing tools. When you say "LoadRunner" you don't have to explain it too much. Most people know that LoadRunner is for performance testing. Sure, they might not know how it works, but they know it has something to do with performance testing. The Microsoft solution? Visual Studio 2008 Team System Test Edition. That's a tad verbose. Sure, you know it's Visual Studio, that it's the Team System, and hey, it's the Test Edition. But still, hardly anybody really knows what the hell it does or how it does it. "Visual Studio 2008 Team System Test Edition Coded Web Test" is just too verbose. Try doing some job searches on indeed.com for "Visual Studio 2008 Team System Test Edition Coded Web Test" and see how many hits you get versus the generic "LoadRunner."

I also got to mess around a lot with JMeter. I found JMeter to be a handy little tool. I felt a little constrained by the GUI nature of JMeter, but still, it gets the job done and has more protocols available than the Microsoft tool. I preferred the Microsoft tool since I could develop virtual users in C# versus JMeter's use of GUI elements and BeanShell coding for extraction and correlation.

While not a performance testing tool, I did get to play with QTP a bit. I'd like to get to know QTP better. It seems to be a real handy tool. I'm also excited by the UI testing tool that will be part of VS2010.

Deployment Tools

I got to mess around with tools like NAnt, MSBuild and WiX. The XML files generally get very tiresome, and I'm still a perl kind of guy at heart, but of the three I prefer NAnt. I got to extend NAnt with objects created for MSBuild via a wrapper, which was nifty and solved a problem.

Project Planning

For a project that I worked on for the Veterans Administration I gutted and re-wrote the performance engineering plan. The end document was over 90 pages of generic performance testing documentation.

Capacity Analysis Tools

Again for the Veterans Administration, I did a comparison of different capacity analysis tools and techniques, and of ways to combine the efforts of performance testing, capacity analysis, and trending. I found the books by Neil Gunther to be most excellent, along with the PDQ API. While TeamQuest Model is popular, I found the interface to be extremely clunky. It reminded me of VB4/PowerBuilder-era interfaces.

In the past I've done plenty of empirical capacity analysis based upon my load testing results and observations of production systems but this was the first time that I learned the math behind heuristic queuing analysis and the various tools related to queuing theory.

Of all the things that I learned on my little adventure I think this ranks near the top in terms of importance and application to future endeavors. It really is quite handy and goes hand-in-hand with performance testing. For my last position I recommended the Python version of PDQ, PyDQ. I selected PyDQ because I like the clean duck typing syntax of Python coupled with the available IDEs for Windows and Linux.
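To give a flavor of it, here is a minimal PyDQ sketch, assuming the standard PDQ function names; the workload numbers are made up for illustration:

#!/usr/bin/env python

import pdq

# Hypothetical open queueing model: requests arrive at 10 per second
# at a single CPU queue with a 50 millisecond service demand.
pdq.Init("Single queue example")
pdq.CreateOpen("requests", 10.0)            # arrival rate in requests/sec
pdq.CreateNode("CPU", pdq.CEN, pdq.FCFS)    # a FCFS queueing center
pdq.SetDemand("CPU", "requests", 0.050)     # service demand in seconds
pdq.Solve(pdq.CANON)
pdq.Report()                                # utilization, queue length, response time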

The Joys of Virtualization

I used Hyper-V and VirtualBox on a lot of projects. Virtualization just plain rocks. Prototyping complete systems is just incredibly cool. I found the snapshot system of Hyper-V to be better performing than VirtualBox's, but VirtualBox is much more flexible since it isn't bound to Windows 2008. When I was testing deployment scripts with NAnt, Hyper-V was invaluable. Test, check results, fix, rollback and test again. Fabulous! Both Hyper-V R2 and VirtualBox 3.1 support live migration of VMs, which is pretty cool. I really hope that Oracle doesn't kill off VirtualBox. VirtualBox just rocks.

I used VirtualBox to build a JMS prototype system with multiple clients processing messages. For another learning experience I had built an OpenSolaris server exporting an iSCSI target. I learned that ZFS is one of the greatest file systems that I have ever used. Really cool stuff.

Python

For some of the capacity analysis tools I needed to learn Python, and I liked what I found. I had done some Python development in the past (early 2000s) but never followed up, as perl handled what I needed. Python led to learning Django and a desire to further explore MVC with different frameworks. I also played around a lot with different Python development platforms, like Eclipse with PyDev and NetBeans with its Python support. Pretty darn cool.

Java

For the VA project I did a bit of JMS and Swing development in Java. I had never done any Java development before, so I started learning it from the ground up. I'm now continuing to learn Java through continuing education courses at the local community college. While I may not be utilizing Java right now, it's a good thing to learn. I would have liked to have done more JMS development but alas, that was put on the back burner in favor of the Swing development when my project was coming to an end. I've never been much of a GUI developer and still don't have any major desire to do front ends. In my heart I'm a command line kinda guy.

I covered metrics collection and tuning of J2EE apps but I've only truly done J2SE development. The VA project was supposed to expose me to WebLogic but unfortunately that never came to fruition. I've already forgotten all the Oracle training that I crammed into my brain the first few weeks of being on the project.

R

I had the opportunity to do some R development to aid in number crunching, and I found R to be very handy. If I'm doing a lot of statistical number crunching, I'm now more likely to do it in R than to write perl routines to get the same results. I especially like R's built-in graphing capabilities, both in quality and in the ability to generate JPG and PDF output. I started out learning R as part of PDQ-R, but ultimately for the project I was working on I selected PyDQ, the Python implementation of PDQ, for its robustness and the widely available Python IDEs on both Windows and Linux.

Skill Development

In the November time frame I had a job interview, and while I did well on the CS portion of the interview (math, algorithm analysis, etc.) I bombed the actual coding portion. The coding portion of the interview was using C# to solve a problem using interfaces. I've implemented interfaces in the past, but this solution used interfaces in a way that I never had. That told me it was time to start developing my coding skill set. I've been taking some Java classes at the local community college and plan to continue with the advanced Java class along with Hibernate and Spring. Ruby on Rails will also be covered.

Working from Home

I worked from home for 11 months and I've decided that working from home sucks. I am not a homebody and get cabin fever easily. Face-to-face communication is definitely better than phone conference meetings with 15 people, though I found Skype to be real handy for voice conferencing with 1-800 numbers. Working from home wasn't good for my waistline either. I'm looking forward to getting back into a regular work routine, making use of the facility gym with my old trainer, and utilizing the running track again once I've dropped a few pounds. Sure, I got up to a 405 pound machine bench press (useful for throwing developers over cube walls) but I've let the old BF% get too high. Bah!

Working on Government Projects

I've worked on government projects before, specifically USAF and DoD stuff, but this was the first time that I worked on a VA project. Oy! What a mess. Remember the messed-up logistics systems from when you were in the US Army, and twist the entropy knob to eleven. Yeah, it was really that bad. The sad thing was that the project I was working on was supposed to help the veterans that put their lives on the line for their country, and this was the best that could be done? I shudder to think of Guvment Health Care being run like the project that I was on.

Books

Here is a short list of books that I found to be handy:

Analyzing Computer System Performance with Perl::PDQ
Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services
Performance by Design: Computer Capacity Planning By Example
The Art of Capacity Planning: Scaling Web Resources
The Art of Application Performance Testing: Help for Programmers and Quality Assurance
Pro Java EE 5 Performance Management and Optimization

What's next? Who knows? Can't wait to find out.

Monday, November 30, 2009

Time to map out all the soup kitchens.

Well, I finally packed up my work laptop and sent it back to my employer. As of tomorrow I will be unemployed. But I do have some phone interviews set up, so that isn't too bad for a December. Need to file for unemployment tomorrow.

Thursday, November 5, 2009

VirtualBox is amazing

So I went to the Microsoft event yesterday and got a Win 7 Ultimate license.

A few years back I had done the same: I got a Vista Ultimate key at the event and got hold of a copy of the Vista Ultimate 64-bit ISO. When I tried the key that was distributed with the 32-bit DVD from the event against that ISO, it didn't work. As a student I was able to get a discounted license for Vista Ultimate, and for some reason the key that I got from the Microsoft event worked with the ISO that I had purchased. Weird.

I thought maybe that would happen again as the handout DVD was only for the 32-bit Win 7 Ultimate. I got a copy of a Win 7 Ultimate 64-bit ISO and decided that I would test out the install and see if the key would work correctly.

My main rig, an i7 920 machine with 12 gigs of RAM, was busy performing a backup, so I decided to fire up VirtualBox on my now-aging Acer Ferrari 5000, installed Win 7 Ultimate 64-bit into a VM, and verified that the key worked just fine (yay!).

It wasn't until this morning that I realized that I had installed a Win 7 Ultimate 64-bit virtual machine on a laptop running 32-bit Vista. And it worked.

Wow.

That just rocks.

Friday, October 9, 2009

I'm dreaming of a laid-off Christmas...

Got a call. Looks like the program that I am on is not going to be funded past mid-November. I suspect that if I'm laid off in mid-November, two weeks before Thanksgiving and the holiday season, not much is going to happen job-wise, just like last year.

At least this time I have some heads-up warning and can plan accordingly, unlike my previous stint at VT (may JC and RZ rot in the lower bowels of Hell). So, I don't think it'll be so bad this time around.

During this hiatus I think I'll give The Grinder a closer look, as I haven't had a chance to use it on anything yet and I've been doing a lot more Python coding as of late (PyDQ and SimPy stuff). An updated Grinder plug-in has been created for Eclipse, and hopefully it works out better than my previous attempt at using GrinderStone.

Good times!

Monday, August 31, 2009

Look, everybody! It's a SimPy three tier eBiz simulation! But now with extra statistics! Wowee!

Added some more statistics to the output, such as waiting queue length and component utilization, by using SimPy Monitors. I couldn't easily use the Monitors for the page response times due to the split-up nature of the calls to multiple nodes, but for .waitQ lengths and node utilizations they worked out perfectly.



#!/usr/bin/env python

import sys
from SimPy.Simulation import *
from random import Random

class Metrics:

    metrics = dict()

    def Add(self, metricName, frameNumber, value):
        if self.metrics.has_key(metricName):
            if self.metrics[metricName].has_key(frameNumber):
                self.metrics[metricName][frameNumber].append(value)
            else:
                self.metrics[metricName][frameNumber] = list()
                self.metrics[metricName][frameNumber].append(value)
        else:
            self.metrics[metricName] = dict()
            self.metrics[metricName][frameNumber] = list()
            self.metrics[metricName][frameNumber].append(value)

    def Keys(self):
        return self.metrics.keys()

    def Mean(self, metricName):
        # A page's response time is approximated by the sum of its component
        # residence times within a frame, averaged over all frames.
        if not self.metrics.has_key(metricName):
            return 0  # Need to learn python throwing exceptions
        total = 0.0
        for frame in self.metrics[metricName].keys():
            for value in self.metrics[metricName][frame]:
                total += value
        return total / len(self.metrics[metricName])

class G:

    numWS = 1
    numAS = 1
    numDS = 1

    Rnd = Random(12345)

    PageNames = ["Entry", "Home", "Search", "View", "Login", "Create", "Bid", "Exit"]

    Entry  = 0
    Home   = 1
    Search = 2
    View   = 3
    Login  = 4
    Create = 5
    Bid    = 6
    Exit   = 7

    WS = 0
    AS = 1
    DS = 2

    CPU  = 0
    DISK = 1

    WS_CPU  = 0
    WS_DISK = 1
    AS_CPU  = 2
    AS_DISK = 3
    DS_CPU  = 4
    DS_DISK = 5

    metrics = Metrics()

    #            e  h  s  v  l  c  b  e
    HitCount = [0, 0, 0, 0, 0, 0, 0, 0]

    Resources = [[Resource(1), Resource(1)],   # WS CPU and DISK
                 [Resource(1), Resource(1)],   # AS CPU and DISK
                 [Resource(1), Resource(1)]]   # DS CPU and DISK

    QMon = [[Monitor(), Monitor()],   # WS CPU and DISK
            [Monitor(), Monitor()],   # AS CPU and DISK
            [Monitor(), Monitor()]]   # DS CPU and DISK

    SMon = [[Monitor(), Monitor()],   # WS CPU and DISK
            [Monitor(), Monitor()],   # AS CPU and DISK
            [Monitor(), Monitor()]]   # DS CPU and DISK

    #                  Enter  Home   Search View   Login  Create Bid    Exit
    ServiceDemand = [ [0.000, 0.008, 0.009, 0.011, 0.060, 0.012, 0.015, 0.000],   # WS_CPU
                      [0.000, 0.030, 0.010, 0.010, 0.010, 0.010, 0.010, 0.000],   # WS_DISK
                      [0.000, 0.000, 0.030, 0.035, 0.025, 0.045, 0.040, 0.000],   # AS_CPU
                      [0.000, 0.000, 0.008, 0.080, 0.009, 0.011, 0.012, 0.000],   # AS_DISK
                      [0.000, 0.000, 0.010, 0.009, 0.015, 0.070, 0.045, 0.000],   # DS_CPU
                      [0.000, 0.000, 0.035, 0.018, 0.050, 0.080, 0.090, 0.000] ]  # DS_DISK

    # Type B shopper
    #                     0     1     2     3     4     5     6     7
    TransitionMatrix = [ [0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],   # 0
                         [0.00, 0.00, 0.70, 0.00, 0.10, 0.00, 0.00, 0.20],   # 1
                         [0.00, 0.00, 0.45, 0.15, 0.10, 0.00, 0.00, 0.30],   # 2
                         [0.00, 0.00, 0.00, 0.00, 0.40, 0.00, 0.00, 0.60],   # 3
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.30, 0.55, 0.15],   # 4
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],   # 5
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],   # 6
                         [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00] ]  # 7

class RandomPath:

    def RowSum(self, Vector):
        rowSum = 0.0
        for i in range(len(Vector)):
            rowSum += Vector[i]
        return rowSum

    def NextPage(self, T, i):
        # Weighted random draw of the next page from row i of the
        # transition matrix.
        rowSum = self.RowSum(T[i])
        randomValue = G.Rnd.uniform(0, rowSum)

        sumT = 0.0
        for j in range(len(T[i])):
            sumT += T[i][j]
            if randomValue < sumT:
                break
        return j

class ExecuteFrame(Process):
    # Models one component (a CPU or DISK on one tier) servicing its share
    # of a page hit.

    def __init__(self, frameNumber, resource, QMon, SMon, serviceDemand, nodeName, pageName):
        Process.__init__(self)
        self.frame = frameNumber
        self.resource = resource
        self.serviceDemand = serviceDemand
        self.nodeName = nodeName
        self.pageName = pageName
        self.QMon = QMon
        self.SMon = SMon

    def execute(self):
        StartUpTime = now()

        yield request, self, self.resource
        yield hold, self, self.serviceDemand

        self.QMon.observe(len(self.resource.waitQ))
        self.SMon.observe(self.serviceDemand)

        yield release, self, self.resource

        R = now() - StartUpTime
        G.metrics.Add(self.pageName, self.frame, R)

class CallPage(Process):
    # Fires off an ExecuteFrame process for every component with a nonzero
    # service demand for the requested page.

    def __init__(self, frameNumber, node, pageName):
        Process.__init__(self)
        self.frame = frameNumber
        self.StartUpTime = 0.0
        self.currentPage = node
        self.pageName = pageName

    def execute(self):

        if self.currentPage != G.Exit:

            print >> sys.stderr, "Working on Frame # ", self.frame, " @ ", now(), " for page ", self.pageName

            self.StartUpTime = now()

            if G.ServiceDemand[G.WS_CPU][self.currentPage] > 0.0:
                wsCPU = ExecuteFrame(self.frame,
                                     G.Resources[G.WS][G.CPU],
                                     G.QMon[G.WS][G.CPU],
                                     G.SMon[G.WS][G.CPU],
                                     G.ServiceDemand[G.WS_CPU][self.currentPage]/G.numWS,
                                     "wsCPU", self.pageName)
                activate(wsCPU, wsCPU.execute())

            if G.ServiceDemand[G.WS_DISK][self.currentPage] > 0.0:
                wsDISK = ExecuteFrame(self.frame,
                                      G.Resources[G.WS][G.DISK],
                                      G.QMon[G.WS][G.DISK],
                                      G.SMon[G.WS][G.DISK],
                                      G.ServiceDemand[G.WS_DISK][self.currentPage]/G.numWS,
                                      "wsDISK", self.pageName)
                activate(wsDISK, wsDISK.execute())

            if G.ServiceDemand[G.AS_CPU][self.currentPage] > 0.0:
                asCPU = ExecuteFrame(self.frame,
                                     G.Resources[G.AS][G.CPU],
                                     G.QMon[G.AS][G.CPU],
                                     G.SMon[G.AS][G.CPU],
                                     G.ServiceDemand[G.AS_CPU][self.currentPage]/G.numAS,
                                     "asCPU", self.pageName)
                activate(asCPU, asCPU.execute())

            if G.ServiceDemand[G.AS_DISK][self.currentPage] > 0.0:
                asDISK = ExecuteFrame(self.frame,
                                      G.Resources[G.AS][G.DISK],
                                      G.QMon[G.AS][G.DISK],
                                      G.SMon[G.AS][G.DISK],
                                      G.ServiceDemand[G.AS_DISK][self.currentPage]/G.numAS,
                                      "asDISK", self.pageName)
                activate(asDISK, asDISK.execute())

            if G.ServiceDemand[G.DS_CPU][self.currentPage] > 0.0:
                dsCPU = ExecuteFrame(self.frame,
                                     G.Resources[G.DS][G.CPU],
                                     G.QMon[G.DS][G.CPU],
                                     G.SMon[G.DS][G.CPU],
                                     G.ServiceDemand[G.DS_CPU][self.currentPage]/G.numDS,
                                     "dsCPU", self.pageName)
                activate(dsCPU, dsCPU.execute())

            if G.ServiceDemand[G.DS_DISK][self.currentPage] > 0.0:
                dsDISK = ExecuteFrame(self.frame,
                                      G.Resources[G.DS][G.DISK],
                                      G.QMon[G.DS][G.DISK],
                                      G.SMon[G.DS][G.DISK],
                                      G.ServiceDemand[G.DS_DISK][self.currentPage]/G.numDS,
                                      "dsDISK", self.pageName)
                activate(dsDISK, dsDISK.execute())

            G.HitCount[self.currentPage] += 1

        yield hold, self, 0.00001

class Generator(Process):
    # A single generator walks the page transition matrix and spawns one
    # CallPage process per page hit, with exponential interarrival times.

    def __init__(self, rate, maxT):
        Process.__init__(self)
        self.name = "Generator"
        self.rate = rate
        self.maxT = maxT
        self.g = Random(11335577)
        self.i = 0
        self.currentPage = G.Home

    def execute(self):
        while (now() < self.maxT):
            self.i += 1
            p = CallPage(self.i, self.currentPage, G.PageNames[self.currentPage])
            activate(p, p.execute())
            yield hold, self, self.g.expovariate(self.rate)
            randomPath = RandomPath()

            if self.currentPage == G.Exit:
                self.currentPage = G.Home
            else:
                self.currentPage = randomPath.NextPage(G.TransitionMatrix, self.currentPage)

def main():

    Lambda = 4.026*float(sys.argv[1])
    maxSimTime = float(sys.argv[2])

    initialize()
    g = Generator(Lambda, maxSimTime)
    activate(g, g.execute())

    simulate(until=maxSimTime)

    print >> sys.stderr, "Simulated Seconds : ", maxSimTime

    print >> sys.stderr, "Page Hits :"
    for i in range(len(G.PageNames)):
        print >> sys.stderr, "\t", G.PageNames[i], " = ", G.HitCount[i]

    print >> sys.stderr, "Throughput : "
    for i in range(len(G.PageNames)):
        print >> sys.stderr, "\t", G.PageNames[i], " = ", G.HitCount[i]/maxSimTime

    print >> sys.stderr, "Mean Response Times:"
    for i in G.metrics.Keys():
        print >> sys.stderr, "\t", i, " = ", G.metrics.Mean(i)

    print >> sys.stderr, "Component Waiting Queues:"
    print >> sys.stderr, "\tWeb Server CPU          : ", G.QMon[G.WS][G.CPU].mean()
    print >> sys.stderr, "\tWeb Server DISK         : ", G.QMon[G.WS][G.DISK].mean()
    print >> sys.stderr, "\tApplication Server CPU  : ", G.QMon[G.AS][G.CPU].mean()
    print >> sys.stderr, "\tApplication Server DISK : ", G.QMon[G.AS][G.DISK].mean()
    print >> sys.stderr, "\tDatabase Server CPU     : ", G.QMon[G.DS][G.CPU].mean()
    print >> sys.stderr, "\tDatabase Server DISK    : ", G.QMon[G.DS][G.DISK].mean()

    # Utilization = (mean service demand * number of visits) / total time
    print >> sys.stderr, "Component Utilization:"
    print >> sys.stderr, "\tWeb Server CPU          : ", ((G.SMon[G.WS][G.CPU].mean()*len(G.QMon[G.WS][G.CPU]))/maxSimTime)*100
    print >> sys.stderr, "\tWeb Server DISK         : ", ((G.SMon[G.WS][G.DISK].mean()*len(G.QMon[G.WS][G.DISK]))/maxSimTime)*100
    print >> sys.stderr, "\tApplication Server CPU  : ", ((G.SMon[G.AS][G.CPU].mean()*len(G.QMon[G.AS][G.CPU]))/maxSimTime)*100
    print >> sys.stderr, "\tApplication Server DISK : ", ((G.SMon[G.AS][G.DISK].mean()*len(G.QMon[G.AS][G.DISK]))/maxSimTime)*100
    print >> sys.stderr, "\tDatabase Server CPU     : ", ((G.SMon[G.DS][G.CPU].mean()*len(G.QMon[G.DS][G.CPU]))/maxSimTime)*100
    print >> sys.stderr, "\tDatabase Server DISK    : ", ((G.SMon[G.DS][G.DISK].mean()*len(G.QMon[G.DS][G.DISK]))/maxSimTime)*100

    print >> sys.stderr, "Total Component Hits :"
    print >> sys.stderr, "\tWeb Server CPU          : ", len(G.QMon[G.WS][G.CPU])
    print >> sys.stderr, "\tWeb Server DISK         : ", len(G.QMon[G.WS][G.DISK])
    print >> sys.stderr, "\tApplication Server CPU  : ", len(G.QMon[G.AS][G.CPU])
    print >> sys.stderr, "\tApplication Server DISK : ", len(G.QMon[G.AS][G.DISK])
    print >> sys.stderr, "\tDatabase Server CPU     : ", len(G.QMon[G.DS][G.CPU])
    print >> sys.stderr, "\tDatabase Server DISK    : ", len(G.QMon[G.DS][G.DISK])

    print >> sys.stderr, "Total Component Throughput :"
    print >> sys.stderr, "\tWeb Server CPU          : ", len(G.QMon[G.WS][G.CPU])/maxSimTime
    print >> sys.stderr, "\tWeb Server DISK         : ", len(G.QMon[G.WS][G.DISK])/maxSimTime
    print >> sys.stderr, "\tApplication Server CPU  : ", len(G.QMon[G.AS][G.CPU])/maxSimTime
    print >> sys.stderr, "\tApplication Server DISK : ", len(G.QMon[G.AS][G.DISK])/maxSimTime
    print >> sys.stderr, "\tDatabase Server CPU     : ", len(G.QMon[G.DS][G.CPU])/maxSimTime
    print >> sys.stderr, "\tDatabase Server DISK    : ", len(G.QMon[G.DS][G.DISK])/maxSimTime

    print >> sys.stderr, "Mean Component Svc Demand :"
    print >> sys.stderr, "\tWeb Server CPU          : ", G.SMon[G.WS][G.CPU].mean()
    print >> sys.stderr, "\tWeb Server DISK         : ", G.SMon[G.WS][G.DISK].mean()
    print >> sys.stderr, "\tApplication Server CPU  : ", G.SMon[G.AS][G.CPU].mean()
    print >> sys.stderr, "\tApplication Server DISK : ", G.SMon[G.AS][G.DISK].mean()
    print >> sys.stderr, "\tDatabase Server CPU     : ", G.SMon[G.DS][G.CPU].mean()
    print >> sys.stderr, "\tDatabase Server DISK    : ", G.SMon[G.DS][G.DISK].mean()

    # One CSV row to stdout for downstream analysis
    print G.HitCount[G.Home]/maxSimTime, ",", G.metrics.Mean("Home"), ",", G.metrics.Mean("View"), ",", G.metrics.Mean("Search"), ",", G.metrics.Mean("Login"), ",", G.metrics.Mean("Create"), ",", G.metrics.Mean("Bid")

if __name__ == '__main__':
    main()
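To run it, assuming the script is saved as ebiz_sim.py (the name is arbitrary): the first argument is a multiplier on the base arrival rate of 4.026 pages per second, and the second is the number of simulated seconds. The summary report goes to stderr and a CSV row of throughput and mean page response times goes to stdout:

$ python ebiz_sim.py 1 1000 2> report.txt > results.csv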

Sunday, August 30, 2009

Improved single queue SimPy code

In this blog entry I had a comparison of PDQ and SimPy for a single queue. The code that I wrote was pretty cheesy and was based upon a very early example from Norm Matloff's Introduction to Discrete-Event Simulation and the SimPy Language.

After writing the SimPy code to solve for the three tier eBiz solution I wanted to go back and correct some of my initial code with the single queue.

Below is what I feel is better code than my original single queue solution.




#!/usr/bin/env python

import sys
import time
from SimPy.Simulation import *
from random import Random

class G:
    # Rnd = Random(time.mktime(time.localtime()))
    Rnd = Random(12345)
    MyQueue = Resource(1)
    QMon = Monitor()
    ServiceTime = 0.50
    TotalCalls = 0L
    TotalResidence = 0L
    TotalWait = 0L
    TotalService = 0L

class WorkLoad(Process):

    def __init__(self):
        Process.__init__(self)
        self.StartUpTime = 0.0

    def Run(self):
        self.StartUpTime = now()
        yield request, self, G.MyQueue
        G.TotalWait += now() - self.StartUpTime
        yield hold, self, G.ServiceTime
        G.QMon.observe(len(G.MyQueue.waitQ))
        G.TotalResidence += now() - self.StartUpTime
        yield release, self, G.MyQueue
        G.TotalCalls += 1

class Generator(Process):
    # A single generator process spawns all of the workload processes
    # with exponentially distributed interarrival times.

    def __init__(self, Lambda):
        Process.__init__(self)
        self.Lambda = Lambda

    def execute(self):
        while 1:
            yield hold, self, G.Rnd.expovariate(self.Lambda)
            W = WorkLoad()
            activate(W, W.Run())

def main():

    Lambda = float(sys.argv[1])
    MaxSimTime = 10000.00

    G.TotalCalls = 0L
    G.TotalResidence = 0L
    G.TotalWait = 0L
    G.TotalService = 0L

    initialize()

    print >> sys.stderr, "MaxSimTime = ", MaxSimTime

    g = Generator(Lambda)
    activate(g, g.execute())

    simulate(until=MaxSimTime)

    # One CSV row to stdout for downstream analysis
    print Lambda, ",", MaxSimTime, ",", G.TotalCalls, ",", G.TotalCalls/MaxSimTime, ",", G.TotalWait, ",", G.TotalResidence, ",", G.TotalResidence/G.TotalCalls, ",", G.TotalWait/G.TotalCalls, ",", (G.TotalResidence/G.TotalCalls) - (G.TotalWait/G.TotalCalls), ",", ((G.ServiceTime * G.TotalCalls) / MaxSimTime) * 100, "%", ",", G.QMon.mean()

    print >> sys.stderr, "Ideal Throughput     : ", Lambda
    print >> sys.stderr, "Simulated Seconds    : ", MaxSimTime
    print >> sys.stderr, "Number of calls      : ", G.TotalCalls
    print >> sys.stderr, "Throughput           : ", G.TotalCalls/MaxSimTime
    print >> sys.stderr, "Total Wait Time      : ", G.TotalWait
    print >> sys.stderr, "Total Residence Time : ", G.TotalResidence
    print >> sys.stderr, "Mean Residence Time  : ", G.TotalResidence/G.TotalCalls
    print >> sys.stderr, "Mean Wait Time       : ", G.TotalWait/G.TotalCalls
    print >> sys.stderr, "Mean Service Time    : ", (G.TotalResidence/G.TotalCalls) - (G.TotalWait/G.TotalCalls)
    print >> sys.stderr, "Total Utilization    : ", ((G.ServiceTime * G.TotalCalls) / MaxSimTime) * 100, " %"
    print >> sys.stderr, "Mean WaitQ           : ", G.QMon.mean()

if __name__ == '__main__':
    main()


The results are the same as before, but now, instead of creating a gajillion objects to activate, only one generator "thread" is created, which in turn makes the calls to the queue. Much faster than my original code, and laid out better.
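A nice property of this setup is that it can be sanity-checked analytically. The arrivals are Poisson (expovariate interarrival times) and the service time is a constant S = 0.5 seconds, so this particular model is an open M/D/1 queue:

    utilization:    rho = Lambda * S
    mean wait:      Wq  = (rho * S) / (2 * (1 - rho))
    mean residence: R   = S + Wq

For example, at Lambda = 1.0 the utilization should come out near 50% and the mean wait time near 0.25 seconds. (Swapping the fixed hold for G.Rnd.expovariate(1.0/G.ServiceTime) would make it M/M/1, where R = S/(1 - rho) instead.)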