Recommended Data Collections
Whether you are looking for research data or want to practice your data science skills, we recommend the data collections below.
|Source||Description||Access and Notes|
|UC Irvine Machine Learning Repository||The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms.||Managed and hosted by UC Irvine. Free and open access.|
|Kaggle.com||Kaggle is an online community that hosts machine learning competitions. It also maintains a collection of datasets open for public use, although users must create an account. Kaggle was acquired by Google in 2017.||Requires an email address to create an account.|
|Tidy Tuesday||A weekly data project aimed at the R programming language ecosystem. As this project was born out of the ... Datasets are posted each week and available on the R for Data Science GitHub.||Free and open access. UCLA Library Data Science Center periodically hosts Tidy Tuesday, welcoming all to practice working with data together.|
|UCLA Dataverse||Collection of research data and findings from the UCLA research community.||Free and open access.|
|GitHub Awesome Public Datasets||Curated lists of public datasets on GitHub.||Free and open access.|
|The General Index||The General Index consists of 3 tables derived from 107,233,728 journal articles.||Free and open access.|