Monday, June 1, 2009

Automating the finding of coefficients for the USL

I got to playing around with R more and I've always found that to learn a language I need to solve problems with the language. I'm sure most everybody else does the same thing. My goal was to write an R function that imported a CSV with performance information to gonkulate against.

In my case, I'm using the basic performance information from "Guerrilla Capacity Planning." I've created a CSV file with the number of procs and the resultant ray trace benchmark from Table 5.1:




    C:\Users\auswipe\Desktop>cat raw_throughput.csv
    p,x
    1,20
    4,78
    8,130
    12,170
    16,190
    20,200
    24,210
    28,230
    32,260
    48,280
    64,310



Then I wrote an R function to crunch the numbers:




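    uslCoefficients <- function(dataFile) {
      # Expect a CSV with columns p (processors) and x (throughput).
      uslData <- read.csv(dataFile, header=TRUE)
      # Relative capacity: throughput divided by the first row's throughput,
      # so the first row must be the p = 1 measurement.
      uslData$c <- uslData$x / uslData$x[1]
      # Fit the USL; the "port" algorithm is what allows the lower bounds.
      usl <- nls(c ~ p/(1+sigma*(p-1)+kappa*p*(p-1)),
                 uslData,
                 algorithm="port",
                 start=c(sigma=0.0, kappa=0.0),
                 lower=c(0,0))
      sigma <- coef(usl)["sigma"]
      kappa <- coef(usl)["kappa"]
      return(list(sigma=sigma, kappa=kappa))
    }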



The function uslCoefficients returns a list where I can reference "sigma" and "kappa" by named index:




    > uslCoef <- uslCoefficients("c:\\Users\\auswipe\\Desktop\\raw_throughput.csv")
    > uslCoef["sigma"]
    $sigma
        sigma
    0.0497973

    > uslCoef["kappa"]
    $kappa
           kappa
    1.143404e-05

    > uslCoef
    $sigma
        sigma
    0.0497973

    $kappa
           kappa
    1.143404e-05
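The run above can be reproduced without a Desktop path. What follows is only a sketch under a few assumptions: `csvPath` and `fitUsl` are made-up names, the CSV is written to a temporary file, and `as.list(coef(usl))` is swapped in for the two separate `coef()` lookups. It also shows why `uslCoef["sigma"]` prints a `$sigma` header: single brackets hand back a one-element list, while `[[` or `$` hands back the number itself.

```r
# Sketch only: csvPath and fitUsl are illustrative names, not from the post.
csvPath <- tempfile(fileext = ".csv")
writeLines(c("p,x", "1,20", "4,78", "8,130", "12,170", "16,190", "20,200",
             "24,210", "28,230", "32,260", "48,280", "64,310"), csvPath)

fitUsl <- function(dataFile) {
  uslData <- read.csv(dataFile, header = TRUE)
  # Relative capacity, normalized to the first (p = 1) row.
  uslData$c <- uslData$x / uslData$x[1]
  usl <- nls(c ~ p / (1 + sigma * (p - 1) + kappa * p * (p - 1)),
             data = uslData, algorithm = "port",
             start = c(sigma = 0, kappa = 0), lower = c(0, 0))
  # A plain list of numbers, one entry per coefficient.
  as.list(coef(usl))
}

uslCoef <- fitUsl(csvPath)
uslCoef["sigma"]    # single brackets: a one-element list
uslCoef[["sigma"]]  # double brackets: the bare number
uslCoef$kappa       # $ also returns the bare number
```

For arithmetic, `uslCoef$sigma` or `uslCoef[["sigma"]]` is usually the form you want, since a list cannot be multiplied directly.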



R is pretty nifty. I doubt I'll ever make use of all the power that is available but it'll be better than writing my own stat routines.

Using R to calculate coefficients of the Universal Scaling Law with Non-Linear Regression

Ooh! Doesn't that sound fancy?

Several weeks ago I purchased the eBook from O'Reilly called "The Art of Capacity Planning." I've always thought that load testing and capacity planning went hand-in-hand. One is not a replacement for the other but one can assist with the other. Load testing helps out capacity planning by applying load to pseudo-production systems, and capacity planning helps load testing by verifying load test results against real world systems.

I finished "The Art of Capacity Planning" and wanted to read more on the subject and picked up a copy of "Guerrilla Capacity Planning" which has a lot more math than "The Art of Capacity Planning." One of the concepts is the Universal Scaling Law based on Amdahl's Law. Dr. Neil J Gunther is a smart cookie. He even has a Ph.D. in Theoretical Physics which makes him closer to Gordon Freeman than I'll ever be! (Side question: Do Ph.D.s in Theoretical Physics get crowbars at graduation?)

Anyhoo, in section 5.6.1 one of the methods in the book is to use Excel to do second-degree polynomial regression for the calculation of two necessary coefficients, sigma and kappa. But when I tried to use Excel I was getting a negative value for sigma, and one of the rules of the Universal Scaling Law is that the coefficients can never, ever, ever be negative. I just figured that I had fat-fingered something and tried it again and, once again, got mismatching results.
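For reference, here is a rough R sketch of that polynomial approach, derived from the USL formula used in this post rather than copied from the book or from Excel. With X = p - 1 and Y = p/C - 1, the USL rearranges to Y = kappa*X^2 + (sigma + kappa)*X, so a quadratic fit through the origin with coefficients a (on X^2) and b (on X) gives kappa = a and sigma = b - a. The names `cp`, `X`, `Y`, `quadFit`, `sigmaQuad` and `kappaQuad` are placeholders of mine, and `lm()` stands in for Excel's trendline.

```r
# Ray-tracing data used in this post (processors and relative capacity).
p  <- c(1, 4, 8, 12, 16, 20, 24, 28, 32, 48, 64)
cp <- c(1.0, 3.9, 6.5, 8.5, 9.5, 10.0, 10.5, 11.5, 13.0, 14.0, 15.5)

# Transform so the USL becomes a quadratic through the origin:
#   p/C - 1 = kappa * (p - 1)^2 + (sigma + kappa) * (p - 1)
X <- p - 1
Y <- p / cp - 1

# Unconstrained least squares; "0 +" drops the intercept.
quadFit <- lm(Y ~ 0 + X + I(X^2))
b <- unname(coef(quadFit)["X"])
a <- unname(coef(quadFit)["I(X^2)"])

# Map the polynomial coefficients back to the USL coefficients.
kappaQuad <- a
sigmaQuad <- b - a
c(sigma = sigmaQuad, kappa = kappaQuad)
```

Nothing in `lm()` keeps `a` or `b - a` from dropping below zero, which is the failure mode described above.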

I scratched my noggin, tried to figure out where I err'd and did some Googling and came across this entry of Dr. Gunther's blog:

Negative Scalability Coefficients in Excel

Because in Excel (and some other packages, like my TI-89) you can't put a constraint on the lower limits of the coefficient, you might from time to time get negative coefficients. But from reading the blog entry I see that other people are using R with success.

This is the first time that I've ever messed around with R for statistical purposes. In the past I've written some stat routines (years ago!) in C# for comparing before/after load testing results.

Here is how I used R from start to finish to gonkulate the coefficients.

Using the data from Section 5.3 I did the following in R:

First I defined my p array, which in the book is the number of processors used for ray tracing:




    p <- c(1, 4, 8, 12, 16, 20, 24, 28, 32, 48, 64)



Then I defined my c array, which is the relative capacity for the number of processors used for ray tracing:




    c <- c(1.0, 3.9, 6.5, 8.5, 9.5, 10.0, 10.5, 11.5, 13.0, 14.0, 15.5)



I combined both arrays into a data frame for later use.




    df <- data.frame(p, c)



And when I check out the contents of df I get:




    > df
        p    c
    1   1  1.0
    2   4  3.9
    3   8  6.5
    4  12  8.5
    5  16  9.5
    6  20 10.0
    7  24 10.5
    8  28 11.5
    9  32 13.0
    10 48 14.0
    11 64 15.5



I can now use a non-linear regression routine with my data frame that I entered above.




    usl <- nls(c ~ p/(1+sigma*(p-1)+kappa*p*(p-1)),
               df,
               algorithm="port",
               start=c(sigma=0.0, kappa=0.0),
               lower=c(0,0))



I can then access the coefficients by named index:




    > sigma <- coef(usl)["sigma"]
    > kappa <- coef(usl)["kappa"]

    > sigma
        sigma
    0.0497973

    > kappa
           kappa
    1.143404e-05



Huzzah!

I can now interpolate the relative capacity based upon the USL and the coefficients that were previously gonkulated and add that to my current data frame, df, that I defined earlier. I do have to note that I was a slackard and did not apply the significant digits rules as outlined in Chapter 3 of "Guerrilla Capacity Planning."




    df$proj_c <- p/(1 + sigma * (p - 1) + kappa * p * (p - 1))



There are the projected relative capacities. Yay!




    > df
        p    c    proj_c
    1   1  1.0  1.000000
    2   4  3.9  3.479686
    3   8  6.5  5.929346
    4  12  8.5  7.745536
    5  16  9.5  9.144406
    6  20 10.0 10.253815
    7  24 10.5 11.154233
    8  28 11.5 11.898837
    9  32 13.0 12.524174
    10 48 14.0 14.259114
    11 64 15.5 15.298811
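The hand-typed formula is not the only way to get these projections. As a sketch, `predict()` on the fitted `nls` object evaluates the same model, including at processor counts that were never measured. The names `procs`, `relCap` and `newP`, and the processor counts 2, 40 and 56, are made up for illustration.

```r
# Same data and model as in the post, rebuilt so the sketch runs on its own.
procs  <- c(1, 4, 8, 12, 16, 20, 24, 28, 32, 48, 64)
relCap <- c(1.0, 3.9, 6.5, 8.5, 9.5, 10.0, 10.5, 11.5, 13.0, 14.0, 15.5)
df <- data.frame(p = procs, c = relCap)

usl <- nls(c ~ p / (1 + sigma * (p - 1) + kappa * p * (p - 1)),
           data = df, algorithm = "port",
           start = c(sigma = 0, kappa = 0), lower = c(0, 0))

# predict() evaluates the fitted formula, so it need not be retyped.
df$proj_c <- predict(usl)

# Processor counts that were never measured (hypothetical values).
newP <- data.frame(p = c(2, 40, 56))
newP$proj_c <- predict(usl, newdata = newP)
newP
```

Going through `predict()` means a change to the model formula only has to be made in one place.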



And here I will make a simple little graph of the actual versus projected relative capacity:




    plot(p, c)
    # proj_c lives in the data frame, not in the workspace
    lines(p, df$proj_c)



And here is the graph that is generated:



Kinda nifty, eh?
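For a smoother curve than eleven measured points can give, one option is to evaluate the fitted model at every processor count from 1 to 64 and label the axes. This is a sketch using base graphics. The names `procs`, `relCap` and `grid`, plus the axis labels, title and colors, are my own choices.

```r
# Rebuild the data and fit so the sketch runs on its own.
procs  <- c(1, 4, 8, 12, 16, 20, 24, 28, 32, 48, 64)
relCap <- c(1.0, 3.9, 6.5, 8.5, 9.5, 10.0, 10.5, 11.5, 13.0, 14.0, 15.5)
df <- data.frame(p = procs, c = relCap)

usl <- nls(c ~ p / (1 + sigma * (p - 1) + kappa * p * (p - 1)),
           data = df, algorithm = "port",
           start = c(sigma = 0, kappa = 0), lower = c(0, 0))

# Evaluate the fitted curve at every processor count from 1 to 64.
grid <- data.frame(p = 1:64)
grid$proj_c <- predict(usl, newdata = grid)

# Measured points as circles, fitted USL curve as a line.
plot(df$p, df$c, xlab = "Processors (p)", ylab = "Relative capacity C(p)",
     main = "USL fit to ray-tracing benchmark")
lines(grid$p, grid$proj_c, col = "blue")
legend("bottomright", legend = c("measured", "USL fit"),
       pch = c(1, NA), lty = c(NA, 1), col = c("black", "blue"))
```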

I can see myself using R more in the future. I'd rather write routines for automagic analysis of data with R than write my own routines from the ground up.

Saturday, May 2, 2009

Push-To-Test

I went to a four hour presentation on Push-To-Test yesterday. It seems pretty nifty, the idea of wrapping an automated testing framework around a bunch of open source projects such as Selenium, soapUI and other goodies.

The presentation was a little hectic but it got the general idea across. I would have preferred to get to the meat of the subject quicker but the presenter did have 20 people to deal with and we had to go with the common denominator. No biggie.

The only thing that I wasn't too big on was the idea of converting the Selenium tests for web tests from XUL into Java or Jython for more programmable control. After the conversion there is no going back to the original SeleniumIDE tool from what I saw. But I guess that isn't much different from what I've done with LoadRunner and VSTS a bajillion times before, when I went from the tree view into code and added a bunch of custom code. They did have an IDE so it's not too bad in hindsight. I didn't get a chance to play around with the IDE to see how it compares to Eclipse or Visual Studio.

But, it is free for download to use the unsupported version, so I say thumbs up.

The idea of using Selenium for the web tests is pretty nifty as you get good control with AJAX controls that are a bit of a pain to control in HTTP Virtual user. I hadn't thought of doing that in the past.

I was told that transactions and nested transactions were supported but didn't get a chance to see them in action.

Thursday, March 26, 2009

Learning QTP

It's been a long time between blog posts. Since this blog is mainly for myself, that isn't a problem. :-)

I'm starting to learn QTP and finding that it looks like a real handy tool for functional and integration testing. I'm still trying to get myself to not look at it as an alternate load testing tool, to use the interface, and to not look at each problem as more code to be written. I had the same problem initially with JMeter, having come from a background where I would normally add a bunch of custom code after the initial script creation.

I'm not a big fan of VBScript but for the purpose at hand I have to say it is probably a better choice than using C like LoadRunner. Easier to create a COM object to be invoked with CreateObject for slicing and dicing of data.

I'm not sure if raw HTML can be pulled in with QTP, yet. I wouldn't be surprised if it can. QTP seems to be a pretty nifty tool. No doubt it will help with future job searches.

Thursday, January 8, 2009

Got a job!

I accepted an informal job offer today at the same pay as my previous job (which wasn't too shabby). I will be a performance engineer on a rather large project involving Red Hat RHEL 4.5, Oracle and Java/WebLogic. Here is a really good opportunity to bone up on Java performance analysis along with WebLogic and Java. I'm not much of a Java dude so I will finally be getting up to speed.

I'm not too worried about the RHEL as I've been using various flavors of Unices (primarily FreeBSD) over the years. I won't get into the whole debate of genetic UNIX versus copied UNIX. Just not worth wasting my breath.

Another opportunity to learn Oracle performance tuning as well. I've done a lot with SQL Server over the years and I hope that some of it applies to the Oracle side of the house.

It looks like I'll be back to using LoadRunner again. Looks like either Winsock/SOA or HTTP vusers. I've heard some bad things about the SOA Vusers with respect to generated script complexity and ease of correlation of data. All I know at this point is that the client is a Swing UI running a Java application that will connect to the middle tier. Hopefully it is all done via SOA/WebServices. I have an interesting idea of utilizing Wireshark and chaosreader.pl and some custom Perl scripts for generating clean vusers for the project. We'll just have to wait and see how it goes.

Another interesting thing will be working out of the house full time. I have two previous coworkers that do this and they love it but I am a very social creature. We'll see how it goes. I will get a laptop and an aircard from my new employer so that means I will be able to work from any location so I should be able to get out of the house often and see friends for lunch, which has always been important to me.

So, my adventure continues, just not on the original vector. :-)

Thursday, December 11, 2008

Taking some time to keep on learning.

Since I am unemployed and not having anything better to do with my time I decided I would check out "Apache JMeter" by Emily H. Halili (Packt, 2008).

It's a light read and I found that Chapter 7 (Advanced Features) was the most helpful to me as I am not a n00b to JMeter. However, I felt that Chapter 7 would have been better if BeanShell processing had been tackled. I've found from my load testing experience in the past that at some point the returned HTML is gonna have to be sliced and diced, and data extracted, in ways that cannot be done with a simple RegEx extraction (like what is covered in Chapter 7 of the book).

Other than that, I think that if you are a total n00b to JMeter it isn't half bad as a simple introduction to using JMeter for performance/load testing. There's a lot more to performance/load testing than the book covers such as metrics collection, number crunching, et cetera but the book doesn't purport itself to be the end all be all of explaining performance/load testing so I can't complain.

Three stars.

Wednesday, December 3, 2008

And that job is toast!

Well. Got laid off today with 20 other folks from the Dallas area. Cannot say that I am surprised. Good thing I've been saving up for this possibility. Didn't get to keep the Uberlaptop of Powah (and I wasn't gonna pay the $3700 to keep it).

I don't expect to get another job in December but I have some possibilities lined up in January. What am I gonna do with that time? Hmmmm. Wasn't GTA IV just released? Too bad I don't have a machine capable of playing the PC version and I'm not going to pay for a console while laid off just to play GTA IV.

I have no doubt that my load testing adventure will continue in January.

One thing for sure. The past seven months have been a waste. Thanks a lot, VT. Same back atcha.