erich – Page 6

Finally, An Eames Chair That Will Fit In My Office

June 25, 2013 erichLeave a comment

Kevin Spencer recreated the classic chair at 1:20 scale with a 3D printer.

Music is Language

June 15, 2013 erichLeave a comment

“[T]he most important thing about language is its capacity for generating imagined communities, building in effect particular solidarities.” Benedict Anderson

Cloud and Assembly Lines – Choose the Right Model

June 12, 2013 erichLeave a comment

I’m at Red Hat Summit this week talking about cloud with customers and partners, and it occurs to me one of the common metaphors isn’t quite right. The problem with the “Assembly Line” metaphor is everyone thinks of 1907 Ford (“any color you want, as long as it’s black”). And that’s actually a lousy example. There was zero flexibility in product output and the only automation beyond individual parts was the well-defined hand-off during assembly. Don’t underestimate the power of those elements, but that’s nothing compared to what we can do today.

The right model is Chevrolet’s model: build knowing the products you need tomorrow are different from the ones you need today. Build knowing you will change your process while it’s still running. It’s no wonder that once implemented, Chevy beat industry-leader Ford to market by a full year while continuing to serve their current customers and took the lion’s share of the entire car market .

If your cloud isn’t open and changeable, your competitors will out innovate you and take your market.

{ photo from excellent slide show on 100 years of assembly lines at Chevrolet and GM: http://www.assemblymag.com/articles/89625-100-years-of-chevrolet-assembly-lines }

[ update: corrected link to Red Hat Summit keynote streaming ]

Generating Reports in R – Suggestions?

June 8, 2013 erichLeave a comment

I would like to programmatically generate a report using R. The contents are mostly graphs and tables. I have a working system, but it’s too many pieces. When I hand this off to someone else, it become immediately fragile.

Isn’t there a better way? Here are my elements:

R script: collection of functions to manipulate the data interactively, and with the report
R script: wrapper to the above functions, and calls knit (from the knitr package) function to generate the report
R/LaTeX: report template
bash: script to tie it all together, and clean up leftovers

That’s four languages. Ugly.

Calculating Conditional Entropy in R

May 1, 2013 erichLeave a comment

conditionalEntropy <- function( graph ) {
   # graph is a 2 or 3 column dataframe
   if (ncol(graph) == 2 ) {
      names(graph) <- c("from","to")
      graph$weight <- 1
   } else if (ncol(graph) == 3)
      names(graph) <- c("from","to","weight")
   max <- length(rle(paste(graph$from, graph$to))$values)
   total <- sum(graph$weight)
   entropy <- data.frame(H = 0, Hmax = 0);
   entropy$H <- sum(graph$weight/total * log(graph$weight/total) / log(2)) * -1
   entropy$Hmax <- log(max * (max-1))/log(2)
   return(entropy)
}

Analyzing Cloud Performance with CloudForms and R

April 15, 2013 erichLeave a comment

CloudForms by Red Hat has extensive reporting and predictive analysis built into the product. But what if you already have a reporting engine? Or want to do analysis not already built into the system? This project was created as an example of using Cloud Forms with external reporting tools (our example uses R). Take special care that you can miss context to the data, as there is a lot of state built into the product, and for guaranteed correctness, use the builtin “integrate” functionality.

Both the data collection and the analyses are fast for what they are, but aren’t particularly quick. Be patient: calculating the CPU confidence intervals of 73,000 values across 120 systems took about 90 seconds (elapsed time) on a 2011 laptop.

Required R libraries
forecast
DBI
RPostgreSQL
Installing RPostgreSQL required postgresql-devel rpm on my Fedora 14 box

See: collect.R for example to get started. Full code is available on github.

Notes on confidence intervals
Confidence intervals are the “strength” of likelihood # a value with fall within a given range. The 80% confidence interval is the set of values expected to fall within the range 80% of the time. It is a smaller range than the 95% interval, and should be considered more likely. E.g. if are going to hit your memory threshold within the 80% interval, look to address those limits before those that only fall within the 95% interval.

Notes on frequencies
Frequencies within the functions included are multiples of collected data. Short term metrics are collected at 20 second intervals. Rollup
metrics are 1 hour intervals. Example: for 1 minute intervals with short term metrics, use frequency of 3.

Notes of fields
These are column names from the CF db. The default field is cpu_usage_rate_average. I also recommend looking at mem_usage_absolute_average.

Notes on graphs
Graphs for the systems are shown for the first X systems (up to “max”) with sufficient data to perform the analysis (# of data points > frequency * 2) and that have a range of data, e.g. min < max. Red point = min, blue point = max.

Example images
*.raw.png are generated from the short term metrics. The others from the rollup data.