Generate Simpson's and Hunter-Gaston Diversity Indices for a VNTR dataset

A Diversity Index (DI), when applied to VNTR data, measures the variation in the number of repeats at each locus. It ranges from zero (no diversity) to one (extreme diversity): a locus with a similar number of repeats in each sample will have a low DI, whereas a locus where the number of repeats differs in nearly every sample will have a very high DI.

The Hunter-Gaston estimate of diversity incorporates a finite-sample adjustment, which is generally desirable. It is very similar to Simpson's index when the sample size is above 50 or diversity is less than 90%.
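As a rough illustration of how the two indices relate, the sketch below computes both for the repeat counts observed at a single locus. It uses the standard textbook definitions (Simpson's as 1 - sum(p_i^2), Hunter-Gaston as 1 - sum(n_i(n_i - 1)) / (N(N - 1))). The function names and the sample values are hypothetical, and this is not necessarily how the tool itself is implemented.

```python
from collections import Counter


def simpson_index(values):
    """Simpson's diversity: 1 - sum(p_i^2), where p_i is the fraction of
    samples carrying the i-th distinct value."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one observation")
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def hunter_gaston_index(values):
    """Hunter-Gaston diversity: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)).
    The N - 1 terms are the finite-sample adjustment."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(values)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical repeat counts for one locus across four samples.
repeats = [1, 1, 2, 3]
print(simpson_index(repeats))        # Simpson's index
print(hunter_gaston_index(repeats))  # Hunter-Gaston index
```

With only four samples the two values differ noticeably; as the number of samples grows, the adjustment shrinks and the indices converge.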

There is often a degree of uncertainty associated with the DI (as with any statistical estimate), so this tool also generates Confidence Intervals (CI) for each locus examined. A CI indicates the precision of the DI by providing upper and lower bounds.
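The text does not say which method is used to derive the CI. One commonly used large-sample approximation takes the variance of the index as (4/N) * (sum(p_i^3) - (sum(p_i^2))^2) and reports the DI plus or minus two standard deviations. The sketch below assumes that approximation; the function name is hypothetical, the bounds are clipped to the zero-to-one range, and the tool's actual calculation may differ.

```python
import math
from collections import Counter


def hunter_gaston_ci(values):
    """Return (di, lower, upper) for one locus.

    Assumption: variance ~ (4/N) * (sum(p^3) - (sum(p^2))^2) and an
    approximate 95% interval of di +/- 2 * sqrt(variance).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(values)

    # Hunter-Gaston point estimate.
    di = 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

    # Frequencies of each distinct value, used for the variance term.
    freqs = [c / n for c in counts.values()]
    sum_sq = sum(p ** 2 for p in freqs)
    sum_cube = sum(p ** 3 for p in freqs)

    # Guard against a tiny negative value caused by rounding error.
    variance = max(0.0, (4.0 / n) * (sum_cube - sum_sq ** 2))
    half_width = 2.0 * math.sqrt(variance)

    # Clip to the valid range of the index.
    return di, max(0.0, di - half_width), min(1.0, di + half_width)


# Hypothetical repeat counts for one locus across four samples.
di, lower, upper = hunter_gaston_ci([1, 1, 2, 3])
print(di, lower, upper)
```

With so few samples the interval is very wide, which is the situation the CI is meant to flag.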

A high DI with a narrow CI indicates a precisely measured, highly variable locus. Such loci may be sufficiently variable to discriminate between samples, or to serve as a starting point for assay development.

Additional information

Although originally developed for VNTR data, V-DICE can analyse any list of information presented in the CSV format, including MLST and SBT profiles.
Even non-numerical text lists can be analysed; however, graph axis labels are not optimised for this.
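To show why non-numerical values pose no problem for the calculation itself, the sketch below treats every cell as an opaque label and computes a Hunter-Gaston index per column. The CSV layout is an assumption made for illustration (a header row, a sample identifier in the first column, one column per locus or typing scheme), as are the column names and values; the exact input format expected by V-DICE is not described here.

```python
import csv
import io
from collections import Counter


def hunter_gaston_index(values):
    """Hunter-Gaston diversity for a list of labels (numbers or text)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(values)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def diversity_per_column(csv_text):
    """Compute the index for every column except the first.

    Assumed layout: header row, first column = sample identifier.
    Cell values are compared as strings, so '3' and 'ST1' work alike.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames[1:]}
    for row in reader:
        for name in columns:
            columns[name].append(row[name].strip())
    return {name: hunter_gaston_index(vals) for name, vals in columns.items()}


# Hypothetical input: one numeric column and one text column.
example = """sample,locus_A,profile_B
s1,3,ST1
s2,3,ST2
s3,3,ST3
s4,4,ST1
"""

for column, di in diversity_per_column(example).items():
    print(column, round(di, 3))
```

Because values are only tested for equality, the same routine covers VNTR repeat counts, MLST or SBT profiles, and free-text labels.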

A BioNumerics script is available that exports character data in the appropriate format.
To enable the script, simply download it to your BioNumerics script folder.