Timesliced reservoir sampling: a new(?) algorithm for profilers
Briefly

"Random sampling from a stream of events can yield almost as good information as storing all data. For example, performance profilers can use random samples of callstacks to identify slow code effectively."
"Slow code will result in the same callstack being repeated. A random sample of callstacks is more likely to contain those that repeat, thus increasing the chances of identifying the slow code."
"Reservoir sampling is a technique that allows for the selection of random samples from an event stream of unknown length, ensuring that each event has an equal chance of being chosen."
Random sampling allows for efficient extraction of information from an event stream of unknown length. For instance, performance profilers can use random samples of callstacks to identify slow code. This works because slow code shows up as the same callstack repeated many times in the stream, so a random sample is more likely to capture it. Reservoir sampling is a common algorithm for selecting random samples from such streams: it ensures that each event has an equal chance of being chosen, however long the stream turns out to be.
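To make the idea concrete, here is a minimal sketch of plain reservoir sampling (the classic "Algorithm R"), not the timesliced variant the article's title refers to. The function name `reservoir_sample`, its parameters, and the example callstack strings are assumptions made up for illustration; they do not come from the original article.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return up to k items chosen uniformly at random from an iterable.

    The length of the stream does not need to be known in advance.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i+1 is kept with probability k / (i + 1);
            # if kept, it replaces a uniformly chosen existing slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical stream of callstacks: the slow path repeats often.
    callstacks = ["main>slow_query"] * 900 + ["main>render"] * 100
    random.shuffle(callstacks)
    print(reservoir_sample(callstacks, 5))
```

Because the slow callstack makes up most of the stream in this made-up example, most sampled entries will tend to be `main>slow_query`, which is the effect the profiler relies on.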
Read at PythonSpeed