One question that came up when using this dataset was how much the quality of data collection varies. To investigate this, we focused on sources of missing data and found that the greatest variability in data quality lies in the precision of the taxonomic classifications. The following dashboard shows how taxonomic precision varies from region to region and how it has changed over time within each region. Precision is summarized by assigning each observation a score based on the most precise taxonomic rank recorded (genus gets a score of 1, species a score of 2, and subspecies a score of 2.5).
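The scoring rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the dashboard: the field names (`region`, `genus`, `species`, `subspecies`), the helper names, and the example records are all assumptions, and so is the choice to leave records with no taxonomy unscored.

```python
from collections import defaultdict
from statistics import mean

# Scores taken from the text: genus = 1, species = 2, subspecies = 2.5.
# Ordered from most to least precise so the first match wins.
RANK_SCORES = (("subspecies", 2.5), ("species", 2.0), ("genus", 1.0))


def precision_score(record):
    """Return the score of the most precise rank present, or None if none is."""
    for rank, score in RANK_SCORES:
        value = record.get(rank)
        # A rank counts as present if it is neither None nor a blank string.
        if value is not None and str(value).strip():
            return score
    return None  # assumption: records with no taxonomy are left unscored


def mean_score_by_region(records):
    """Average the precision score per region, skipping unscored records."""
    scores = defaultdict(list)
    for record in records:
        score = precision_score(record)
        if score is not None:
            scores[record["region"]].append(score)
    return {region: mean(values) for region, values in scores.items()}


# Made-up records for illustration only (hypothetical regions "A" and "B").
example = [
    {"region": "A", "genus": "Aedes", "species": None, "subspecies": None},
    {"region": "A", "genus": "Aedes", "species": "vexans", "subspecies": None},
    {"region": "B", "genus": "Culex", "species": "pipiens", "subspecies": "pipiens"},
    {"region": "B", "genus": None, "species": None, "subspecies": None},
]
print(mean_score_by_region(example))  # {'A': 1.5, 'B': 2.5}
```

Checking ranks from most to least precise means each observation is scored once, by its finest available classification, which matches the "most precise taxonomy present" rule described above.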
Altogether, these plots indicate that the precision of mosquito taxonomy does indeed vary by region and over time (though the temporal trends appear to be exclusively seasonal). Alaska appears to have noticeably lower precision than other regions, while the Northeastern Coast appears to have much higher precision.
A note: where data were actually missing (taxonomy missing altogether, or counts missing), all other subject-specific data were missing as well (apart from collection time and collection location). Looking into the data documentation, I couldn’t find an explanation for this (perhaps it is the result of past data entries being corrected, with a shell of the original record remaining).