Summary of results of Extremal Index analysis of UNC HTTP response size data.

Output files are in the web directory:

    http://www.unc.edu/depts/statistics/postscript/papers/marron/NetworkData/ExtremalIndex/

of the form:

    UNC2001RS?ExtInd1T##sss***.ps

where:

    ?    is either 1 or 2 (depending on the calling program, not relevant here)

    ##   indexes the time block:
             11: Thursday afternoon (peak time)
             19: Sunday morning (off-peak time)

    sss  reflects the variable being studied:
             siz:  response size, in bytes
             tim:  response time (duration), in seconds
             rat:  average rate of response transmission, just siz/tim, in bytes/sec
             irat: inverse rate, tim/siz, in sec/byte. This is useful for studying
                   rates in contexts where times are large.

    ***  shows how the data have been truncated:
             (nothing): all responses with both a nonzero size and a nonzero
                        time (duration) are included in the analysis
             10k:       only responses with size > 10k are included
             100k:      only responses with size > 100k are included

Here is some discussion of the results. Recall that the Extremal Index being studied is essentially "1 / expected number of times extrema occur together". So an extremal index of 0.5 suggests that extrema occur in pairs.

First the peak time of Thursday afternoon is studied in detail:

1. Response sizes:

    UNC2001RS2ExtInd1T11siz.ps
    UNC2001RS2ExtInd1T11siz10k.ps
    UNC2001RS1ExtInd1T11siz100k.ps

The first two are rather similar, with a fat peak whose height is about 0.9, suggesting that peaks in sizes do not tend to cluster. The third has a somewhat lower height.

2. Response times (durations):

    UNC2001RS2ExtInd1T11tim.ps
    UNC2001RS2ExtInd1T11tim10k.ps
    UNC2001RS1ExtInd1T11tim100k.ps

Here the peak heights are smaller, but only slightly so, compared to the above size plots.
However, it is very noticeable that for smaller threshold proportions (see the upper left plots) the estimated extremal index falls off very substantially, suggesting that the largest response times tend to occur together (and much more frequently than the corresponding sizes).

3. Rates (size / time):

    UNC2001RS2ExtInd1T11rat.ps
    UNC2001RS2ExtInd1T11rat10k.ps
    UNC2001RS1ExtInd1T11rat100k.ps

Here the amount of clustering seems to depend strongly on the threshold. Any ideas as to why?

4. Inverse rates (time / size):

    UNC2001RS2ExtInd1T11irat.ps
    UNC2001RS2ExtInd1T11irat10k.ps
    UNC2001RS1ExtInd1T11irat100k.ps

Here, going from no thresholding to 10k gives a similar peak height, but the largest values tend to cluster. At 100k this clustering effect disappears, but the general peak height goes down to only about 2/3.

Note that generally rate follows size (as expected, since large sizes drive rate), and inverse rate follows time (again driven by large time responses).

Sunday mornings: the big picture lessons are similar:

    a. sizes and rates tend to suggest no clustering.
    b. times and inverse rates do feel clustering influences.
    c. the effect in (b) is strongest for the highest thresholds (i.e. the largest values).
    d. differences become smaller when the data are restricted to 100k.

So overall I think the lesson is that sizes and rates tend to see less clustering than times and inverse rates. This effect is strongest at the highest thresholds....
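
For anyone who wants to experiment with the quantities discussed above, here is a minimal Python sketch. It is not the code that produced the plots, and the estimator used for the plots is not stated in these notes. The sketch assumes a simple "runs" estimator (number of clusters of threshold exceedances divided by number of exceedances), which matches the reading of the extremal index as "1 / expected number of times extrema occur together". The function names, the `run_length` parameter, and the `min_size` cutoff are all hypothetical choices made for illustration. In particular, the exact byte count meant by "10k" is left to the caller.

```python
def derive_series(sizes, times, min_size=0):
    """Build the siz, tim, rat and irat series from raw sizes and times.

    Keeps only responses with nonzero size and nonzero time, and with
    size > min_size (min_size is an illustrative stand-in for the
    10k / 100k truncations; its exact value is an assumption).
    """
    kept = [(s, t) for s, t in zip(sizes, times)
            if s > 0 and t > 0 and s > min_size]
    return {
        "siz": [s for s, _ in kept],        # bytes
        "tim": [t for _, t in kept],        # seconds
        "rat": [s / t for s, t in kept],    # bytes/sec
        "irat": [t / s for s, t in kept],   # sec/byte
    }


def runs_extremal_index(values, threshold, run_length=1):
    """Runs estimate of the extremal index (an assumed, simple estimator).

    Exceedances of `threshold` that are at most `run_length` positions
    apart are counted as one cluster. The estimate is
    (number of clusters) / (number of exceedances).
    """
    exceed_idx = [i for i, v in enumerate(values) if v > threshold]
    if not exceed_idx:
        raise ValueError("no observations exceed the threshold")
    clusters = 1
    for prev, cur in zip(exceed_idx, exceed_idx[1:]):
        if cur - prev > run_length:
            clusters += 1  # gap is long enough to start a new cluster
    return clusters / len(exceed_idx)


if __name__ == "__main__":
    # Toy series in which large values always arrive in pairs.
    paired = [0, 9, 9, 0, 0, 0, 9, 9, 0, 0, 0, 9, 9, 0]
    print(runs_extremal_index(paired, threshold=5))    # 0.5

    # Toy series in which large values are isolated.
    isolated = [0, 9, 0, 0, 9, 0, 0, 9, 0]
    print(runs_extremal_index(isolated, threshold=5))  # 1.0
```

The paired toy series gives 0.5 (extrema in pairs) and the isolated one gives 1.0 (no clustering), in line with the interpretation above. Applying `runs_extremal_index` to each series returned by `derive_series`, over a range of thresholds, would give a rough analogue of the comparisons between sizes, times, rates and inverse rates.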