Free Sample Data Sources to Learn and Practice Power BI
Power BI is a powerful business intelligence tool that helps organizations visualize their data and make data-driven decisions. However, learning Power BI can be daunting, especially if you don't have access to real-world data to practice with.
Fortunately, there are many sources of sample data that you can download and use for free to sharpen your Power BI skills. In this post, I share great sources of sample data that you can use to learn and practice Power BI. Whether you're a beginner or an advanced user, these resources will help you develop your skills, test your scenarios, create sample reports, and become a Power BI pro.
Sample SQL Databases
For practice and learning, you can download and connect some SQL sample databases to Power BI. These links provide SQL sample databases that can be restored to SQL Server or Azure SQL Database using the instructions provided.
Click on the links below to download them:
AdventureWorks sample databases
Wide World Importers sample database
Northwind and pubs sample databases
Microsoft Contoso BI Demo Dataset for Retail Industry
The Contoso Data Generator tool from SQLBI can also be used to generate sample Contoso databases on SQL Server based on a star schema. Link to download the Contoso Data Generator: https://www.sqlbi.com/tools/contoso-data-generator/
You can download over 20 freely available SQL sample databases from the SQLskills, SQL Tutorial, and Database Start websites. These sample databases can be used to learn or to reproduce demos.
Download them here: https://www.sqlskills.com/sql-server-resources/sql-server-demos/
MySQL: Employees Sample Database: https://dev.mysql.com/doc/employee/en/
Sample PBIX Files
The Microsoft website provides six Power BI reports with ready-to-use datasets and sample visuals that are very helpful for training and other purposes. The .pbix files are designed to be used with Power BI Desktop. Here is a link to the files: Download the original sample Power BI files
Each of the above PBIX files is also available as an Excel workbook. You can download the Excel workbooks and use them as data sources in the Power BI service or Power BI Desktop. Download one or all of the sample Excel files from this GitHub repository
Data Stories Gallery - Microsoft Power BI Community
The Data Stories Gallery section of the Microsoft Power BI Community website has many sample reports whose PBIX files you can download, such as:
Goleno Pizza Dashboard
Meat, to eat or not to eat?
Narrative statistics
Estimate Electricity Energy Usage and Cost
Property developer's sales analytics
Kaggle is a popular platform for data science and machine learning competitions. It has a large collection of sample datasets that you can use for Power BI practice. You can find the datasets at the following link: https://www.kaggle.com/datasets
On the Kaggle datasets page, you can filter data sources by file type, such as CSV, SQLite, and JSON. You can also filter by file size.
At the time of writing, the site lists more than 112,700 data sources (95,592 CSV, 16,953 JSON, and 172 SQLite files).
data.world is a social network for data people. It has a collection of open datasets that you can use for Power BI practice. You can find the datasets at the following link: https://data.world/datasets/open-data
The site provides access to more than 128,600 datasets. You can set filters to find a source with the data you are looking for. For instance, you can find good data sources for Power BI maps.
Power BI Desktop has a certified connector for this site. All you need is the Owner's name and the Dataset ID, both of which can be found on the site.
To connect your report directly to a dataset, open the dataset's page on data.world, note these two values, and enter them in the connector.
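As a small aid for the step above, here is a minimal Python sketch that splits a data.world dataset URL into the two values the connector asks for. It assumes dataset pages follow the shape https://data.world/<owner>/<dataset-id>; the function name and the example URL are hypothetical and only for illustration.

```python
from urllib.parse import urlparse


def split_dataworld_url(url: str) -> tuple[str, str]:
    """Return (owner, dataset_id) from a data.world dataset URL.

    Assumption: the URL path looks like /<owner>/<dataset-id>.
    """
    # Drop empty segments caused by leading or trailing slashes.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"Expected /<owner>/<dataset-id> in {url!r}")
    return parts[0], parts[1]


# Hypothetical URL, used only to show the call.
owner, dataset_id = split_dataworld_url(
    "https://data.world/example-owner/example-dataset"
)
print("Owner:", owner)
print("Dataset ID:", dataset_id)
```

If the URL of the dataset you picked has a different shape, copy the two values by hand from the dataset page instead.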
Google Dataset Search
Google Dataset Search is a search engine designed to help researchers, analysts, and data enthusiasts discover datasets that are freely available on the internet. The platform indexes datasets from a wide range of sources including government agencies, research institutions, and non-profit organizations. Users can search for datasets by entering keywords related to their research interests, as well as filter results by file type, data format, or topic.
To search for files on Google Dataset Search, you can use the following steps:
Go to the Google Dataset Search website: https://datasetsearch.research.google.com/
In the search bar, type in the keywords related to the dataset you are looking for (e.g. "crime rates", "sales data", etc.).
To filter your search results to only show Excel or tabular format files, click on the "Download Format" dropdown menu and select "Tabular" from the list.
You can further refine your search by selecting relevant filters, such as data source, publication date, or geographic location.
Once you have found a dataset you are interested in, click on the title to view more details about the dataset and download the Excel file if it is available.
Data.gov is the home of the US government's open data. You can find data sets related to various topics such as agriculture, climate, energy, health, and more. You can access the data sets at the following link: https://catalog.data.gov/dataset/
Set some filters on the right-side panel to find CSV (21,272 files), Excel (7,079 files), JSON (14,462 files), or Text format. You can also set some filters on Topics, Categories, Organizations, Publishers, and Tags.
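If you prefer to script the same kind of filtered search, the sketch below builds a search URL in Python. It rests on an assumption that is not stated in this post: that catalog.data.gov exposes the standard CKAN `package_search` action and accepts a `res_format` filter. Check the site's API documentation before relying on it. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard CKAN search action on catalog.data.gov.
BASE_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(keywords, file_format=None, rows=10):
    """Build a dataset search URL, optionally filtered by resource format."""
    params = {"q": keywords, "rows": rows}
    if file_format:
        # Assumed CKAN filter-query syntax for the resource format.
        params["fq"] = f"res_format:{file_format}"
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("energy", file_format="CSV"))
```

The block only builds the URL and does not call the network, so you can paste the result into a browser or into Power BI's Web connector to see what comes back.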
Sample Data with Latitude and Longitude for Maps
World Cities Database: This website offers a variety of datasets with geographical data such as maps, city, county, and zip code data, as well as location-based datasets like store locations and population statistics. They offer both free and paid datasets in various formats, including CSV, Excel, JSON, and SQL. The website also provides tools for creating custom maps and visualizations based on their datasets. Many of their datasets include latitude and longitude information, which can be used to create maps in Power BI or other data visualization tools. The dataset is available in CSV format from the following link: https://simplemaps.com/data
OpenFlights Airports Database: This dataset contains information on airports around the world, including latitude and longitude information for each airport. The dataset is available in CSV format from the following link: https://openflights.org/data.html
USGS Earthquake Data: This dataset contains information on earthquakes around the world, including latitude and longitude information for each event. The dataset is available in CSV format from the following link: https://earthquake.usgs.gov/earthquakes/feed/v1.0/csv.php
Global Power Plant Database: This dataset contains information on power plants around the world, including latitude and longitude information for each plant. The dataset is available in CSV format from the following link: https://datasets.wri.org/dataset/globalpowerplantdatabase
World Heritage Sites: This dataset contains information on UNESCO World Heritage Sites around the world, including latitude and longitude information for each site. The dataset is available for download from the following link: https://whc.unesco.org/en/list/xls/
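Map visuals only plot rows that have usable coordinates, so it can help to check a downloaded file before loading it. The following is a minimal Python sketch of such a check. The column names `latitude` and `longitude` are assumptions, so adjust them to match the file you downloaded. The sample rows are made up for illustration and are not real records.

```python
import csv
import io


def rows_with_valid_coordinates(csv_text, lat_col="latitude", lon_col="longitude"):
    """Keep only rows whose latitude and longitude parse and fall in range."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, empty cell, or non-numeric text
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            # Store the coordinates as numbers rather than text.
            kept.append({**row, lat_col: lat, lon_col: lon})
    return kept


# Made-up sample rows: one valid, one with a blank latitude, one out of range.
sample = """place,latitude,longitude,mag
Somewhere A,35.5,-120.25,2.1
Somewhere B,,10.0,3.0
Somewhere C,95.0,20.0,1.5
"""
clean = rows_with_valid_coordinates(sample)
print(len(clean), "row(s) kept")
```

The sketch uses only the standard library. Power BI Desktop can also apply similar filtering in Power Query, so treat this as one option rather than a required step.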
Open Data Network is a collection of open data sets from various sources. You can find data sets related to various topics such as demographics, transportation, environment, health, finance, and more. You can access the data sets at the following link: https://www.opendatanetwork.com/search?
There are more than 10,000 data sources available on this site.
World Bank Open Data provides free and open access to global development data. You can find data sets related to various topics such as poverty, education, health, climate change, and more. You can access the data sets at the following link: https://databank.worldbank.org/home.aspx
Gapminder is a non-profit organization that promotes sustainable global development. They have a collection of data sets related to various topics such as income, health, education, and more. You can access the data sets at the following link: https://www.gapminder.org/data/
The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that can be used for machine learning research. Some of the datasets can be used for Power BI practice as well. You can find the repository at the following link: http://archive.ics.uci.edu/ml/index.php
FiveThirtyEight is a website that specializes in opinion poll analysis, politics, economics, and sports blogging. It was launched in 2008, was acquired by ESPN in 2013, and later moved to ABC News. In addition to opinion polls, FiveThirtyEight also publishes articles and visualizations on data-driven journalism and statistical analysis. The website is known for its election forecasts. FiveThirtyEight provides access to a wide range of datasets and data-driven articles that can be used for data analysis and visualization practice in Power BI. Link: https://data.fivethirtyeight.com/
The Chicago Data Portal is a website that provides access to datasets published by the City of Chicago. The portal includes a wide range of datasets on topics such as crime, public health, transportation, the environment, and more. The datasets can be downloaded in various formats, including Excel, CSV, and JSON, and can be used for data analysis and visualization in tools like Power BI. Link: https://data.cityofchicago.org/
Data.gov.uk is a website that offers access to datasets published by the UK government and its agencies, covering topics such as education, environment, health, and crime. The portal provides a variety of data formats for download, such as CSV, XLS, and JSON. Link: https://www.data.gov.uk/search?q=
The Australian Government Open Data website is a platform that provides access to datasets published by the Australian government, spanning various fields such as environment, economy, and social data. The website offers several data formats for download, allowing users to analyze and visualize data using tools like Power BI. Link: https://data.gov.au/search?format=csv&format=xlsx
Awesome Public Datasets is a curated list of high-quality datasets that are freely available on the internet. The project was created to help researchers, analysts, and developers find relevant datasets for their work, and includes a wide range of data types such as text, images, audio, and video. The datasets listed on Awesome Public Datasets are sourced from various domains including government agencies, academic institutions, non-profit organizations, and private companies. The list is maintained by the open-source community on GitHub, and users are encouraged to contribute new datasets or suggest improvements to the existing list. Link: https://github.com/awesomedata/awesome-public-datasets
Microsoft Research Open Data is a data repository of datasets that researchers at Microsoft have created and published in conjunction with their research. You can browse the available datasets and download them. Link: https://msropendata.com/datasets?domain=COMPUTER%20SCIENCE&filetypes=csv
If you have additional datasets that have not been included in this blog post, please share them. Should you have any information that would be of benefit to the community, please do not hesitate to provide your insights and recommendations in the comment section below. Your contributions are greatly appreciated 🙏