Google’s customized search engine for ‘scientists, data journalists, and data geeks’ is now out of beta, and offers indexed searches for almost 25 million datasets. Dataset Search now has added filters so you can look for specific types of dataset, or only those that are free from the provider.
Dataset Search was first released in a beta version in September 2018, and aims to make it easier to search online open-access data. The problem with finding such data has been that it doesn’t necessarily show up in a standard search, even though many government departments and academic institutions publish their data online.
Google Dataset Search relies on institutions adding metadata tags, based on the open schema.org standard, that Dataset Search then uses to index the datasets. It works in a similar way to Google Scholar, which is used to search academic papers.
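As a rough illustration of what such metadata can look like, here is a minimal Python sketch that builds a schema.org `Dataset` description and prints it as a JSON-LD script tag, the kind of markup a publisher might embed in a dataset's web page. The function name `build_dataset_jsonld`, the example dataset, and the example.org URL are all hypothetical; only the schema.org property names (`name`, `description`, `url`, `isAccessibleForFree`, `distribution`, `encodingFormat`) come from the public vocabulary.

```python
import json


def build_dataset_jsonld(name, description, url, is_free, formats):
    """Return a schema.org Dataset description as a JSON-LD string.

    Hypothetical helper for illustration; not part of any Google library.
    """
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Whether the provider offers the dataset free of charge.
        "isAccessibleForFree": is_free,
        # One DataDownload entry per available file format.
        "distribution": [
            {"@type": "DataDownload", "encodingFormat": fmt} for fmt in formats
        ],
    }
    return json.dumps(record, indent=2)


# Made-up dataset, used only to show the shape of the markup.
markup = build_dataset_jsonld(
    name="City rainfall 2010-2019",
    description="Hypothetical example: monthly rainfall totals for one city.",
    url="https://example.org/datasets/rainfall",
    is_free=True,
    formats=["text/csv"],
)

# Publishers typically place the JSON-LD inside a script tag in the page.
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

A real page would usually carry more fields, such as a license or a creator, but the structure stays the same: one JSON object describing one dataset.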
The metatags are indexed by Dataset Search and combined with input from Google’s Knowledge Graph, the source of the infoboxes that appear next to search results, to make the results more useful. Google collects and links this information, analyzes where different versions of the same dataset might be found, and finds publications that may describe or discuss the dataset.
The main improvement in the updated version is the ability to filter the results based on the type of dataset that you want, such as tables, images, or text, or on whether the dataset is available for free from the provider. If a dataset covers a geographic area, you can see that area on a map.
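Conceptually, the two filters described above amount to narrowing a list of dataset records by type and by price. The Python sketch below shows that idea on a handful of made-up records; the `DatasetRecord` class, its fields, and `filter_datasets` are hypothetical names chosen for illustration and say nothing about how Google implements the feature.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical, simplified view of one indexed dataset."""
    name: str
    data_type: str  # e.g. "table", "image", "text"
    is_free: bool


def filter_datasets(records, data_type=None, free_only=False):
    """Keep records that match the requested type and, optionally, are free."""
    result = []
    for record in records:
        if data_type is not None and record.data_type != data_type:
            continue  # wrong kind of dataset
        if free_only and not record.is_free:
            continue  # provider charges for this one
        result.append(record)
    return result


# Made-up records for demonstration only.
records = [
    DatasetRecord("School enrolment by district", "table", True),
    DatasetRecord("Satellite cloud imagery", "image", True),
    DatasetRecord("Premium crime statistics", "table", False),
]

free_tables = filter_datasets(records, data_type="table", free_only=True)
for record in free_tables:
    print(record.name)
```

Leaving both arguments at their defaults returns every record, which mirrors an unfiltered search.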
The search now also works on mobile devices, and the dataset descriptions have been “significantly improved”. The developers say that the most popular queries so far have included “education,” “weather,” “cancer,” “crime,” “soccer,” and “dogs”, so all you cat lovers out there need to up your searching.