Recommended Data Collections

Whether you are looking for research data or want to practice your data science skills, we recommend the data collections below.



| Source | Description | Access and Notes |
| --- | --- | --- |
| UC Irvine Machine Learning Repository | The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators used by the machine learning community for the empirical analysis of machine learning algorithms. | Managed and hosted by UC Irvine. Free and open access. |
| Kaggle.com | Kaggle is an online community that hosts machine learning competitions. It also holds a collection of datasets open for public use, although users must create an account. Kaggle was acquired by Google in 2017. | Requires an email address to create an account. |
| Tidy Tuesday | A weekly data project aimed at the R programming language ecosystem. Because the project was born out of the R4DS Online Learning Community and the R for Data Science textbook, it emphasizes understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem. | Datasets are posted each week and are available on the R for Data Science GitHub. Free and open access. The UCLA Library Data Science Center periodically hosts Tidy Tuesday, welcoming all to practice working with data together. |
| UCLA Dataverse | Collection of research data and findings from the UCLA research community. | Free and open access. |
| GitHub Awesome Public Datasets | Curated lists of public datasets on GitHub. | Free and open access. |








Additional Information

Data Science Center 

Submit a Request




This service is available to the following users:

Library Staff
UCLA

If this service is not available to you, and you have questions, please send an email to techhelp@library.ucla.edu for more information.







Please visit the Data Science Center portal to raise a request.



