While reading my colleague François Paget’s recent blog post about detection numbers, I noticed that something seemed a bit odd about the graph illustrating the growth of the collection maintained by AV-Test.org.

AV-Test.org total collection size by unique samples

The last few months showed a bigger total size than indicated by the forecast line, which is an exponential function. A closer look at the monthly growth statistics shows why:

AV-Test.org collection monthly growth rate by unique samples

Over the last couple of months, the number of new samples added each month has stopped increasing. The growth is no longer exponential but linear, averaging around 600,000 samples added each month. Looking at our own numbers of new samples, I can confirm this new linear growth.
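To make the difference between the two growth patterns concrete, here is a minimal sketch in Python. Only the figure of 600,000 samples per month comes from the numbers above; the starting total of 10,000,000 and the 5 percent monthly rate are hypothetical values chosen for illustration, not AV-Test.org data.

```python
def monthly_additions(totals):
    """Return the number of samples added between consecutive monthly totals."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def linear_totals(start, per_month, months):
    """Collection size when a fixed number of samples is added every month."""
    return [start + per_month * m for m in range(months + 1)]


def exponential_totals(start, rate, months):
    """Collection size when the collection grows by a fixed percentage every month."""
    return [round(start * (1 + rate) ** m) for m in range(months + 1)]


# Hypothetical starting size; 600,000 per month is the average quoted above.
linear = linear_totals(10_000_000, 600_000, 4)
# Hypothetical 5 percent monthly growth rate, for comparison only.
exponential = exponential_totals(10_000_000, 0.05, 4)

print(monthly_additions(linear))       # the same number every month
print(monthly_additions(exponential))  # a larger number every month
```

With linear growth the monthly additions stay flat, which is what the second graph shows for recent months. With exponential growth each month adds more than the one before.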

Why is this a big deal? For years the security industry has been fighting an uphill battle, with the number of new samples increasing every month at an alarming rate. Now, with constant though still massive growth, there is some light at the end of the tunnel. If this trend keeps up, planning for future resources and technologies will become much easier and more manageable.

I’ll add one more remark about counting by “unique samples,” in which unique means that the file’s cryptographic hash differs from that of every other file in the collection. For the time being this is a useful way of counting, but it can’t be mapped to detection numbers (François explained why), and it works today only because most new samples are Trojans. Should we see more file-infecting viruses in the future, and there are some indications that they will make a comeback, this way of counting will quickly become useless.
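A minimal sketch of counting by hash shows why file infectors would break the count. It assumes SHA-256 as the cryptographic hash (the hash actually used for the collection is not stated here), and the byte strings are hypothetical stand-ins for real files.

```python
import hashlib


def count_unique_samples(samples):
    """Count samples whose SHA-256 digest differs from all others."""
    return len({hashlib.sha256(data).hexdigest() for data in samples})


# Hypothetical stand-ins for file contents.
trojan = b"TROJAN-BODY"
virus = b"VIRUS-BODY"
hosts = [b"host-a", b"host-b", b"host-c"]

# A Trojan is a standalone file: every copy is byte-identical.
trojan_copies = [trojan, trojan, trojan]

# A file infector attaches the same code to different host files,
# so every infected file has different contents and a different hash.
infected_files = [host + virus for host in hosts]

print(count_unique_samples(trojan_copies))   # 1
print(count_unique_samples(infected_files))  # 3
```

One Trojan counts as one unique sample no matter how many copies arrive, while a single file-infecting virus counts once for every host file it infects.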