Hello! Recently I have been looking at the 2016 index card data and the 2017 online data on the Language Maps and Language Clouds blog. While comparing the data from the two posts I realized that there are benefits and drawbacks to each set. The 2016 index card data has a larger focus on the proficiency of the language. It also gives insight into what context the language may have been used in, which can be insightful when mapping the origins of the language. The 2017 online data has a larger focus on the residences and how the participants heard about that language.
However, the 2016 index card data set does not go into depth about the region: where they may have picked up on the language. On the other hand, the 2017 online data set does not have a quantifiable way to measure how proficient a person is in that language. They merely wrote a bit about the language. I personally prefer the 2016 index card data set. It gives insight into region, proficiency, and context in an organized and quantifiable manner. Furthermore, I noticed that most of the languages in the data were English, Spanish, French, German, and Italian. It made me wonder if the abundance of these languages is caused by the fact that all these languages are taught in high schools in the United States or if there is a higher density of people in this area originating from the countries that speak those languages. It is something that could be investigated in the future.