How to Download Kaggle Datasets on Ubuntu
Kaggle is one of the most popular places to find datasets for data science and machine learning. On Kaggle, you can publish datasets, build models, and collaborate with other scientists and engineers in competitions to win prizes.
In this guide, we discuss how to download datasets from Kaggle to your Ubuntu machine.
Prerequisites
On your Ubuntu machine, ensure you have Python 3 and the package manager pip installed.
In Kaggle, find the dataset you want to download, and note the name of the dataset and the name of the user that uploaded it. You can find both in the URL of the dataset. For example, the dataset used later in this guide has the user name Cornell-University and the dataset name arxiv.
You should also have a Kaggle account. If you don’t, create one on the Kaggle website.
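The user and dataset names can also be pulled out of a URL with shell parameter expansion. The sketch below is only an illustration: dataset_slug is a hypothetical helper, and the example URL is an assumed layout built from the names used later in this guide, so compare it with the URL shown in your browser. The helper keeps the last two path segments, whatever comes before them.

```shell
# Hypothetical helper: print "<USER_NAME>/<DATASET_NAME>" from a dataset URL.
dataset_slug() {
    url=${1%%\?*}      # drop any query string
    url=${url%/}       # drop a trailing slash
    name=${url##*/}    # last path segment: the dataset name
    rest=${url%/*}     # everything before the last segment
    user=${rest##*/}   # second-to-last segment: the user name
    printf '%s/%s\n' "$user" "$name"
}

# Assumed example URL, not taken from the Kaggle site.
dataset_slug "https://www.kaggle.com/Cornell-University/arxiv"
# prints Cornell-University/arxiv
```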
Step 1 - Download Kaggle API
Kaggle has a command-line API that can be installed using pip:
pip install --user kaggle
pip will install the Kaggle API and any required dependencies on your machine.
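With --user, pip typically places the kaggle executable under ~/.local/bin, which is not always on PATH in a fresh shell. A minimal check, assuming that default location (check_on_path is a hypothetical helper, not part of the Kaggle API):

```shell
# Hypothetical helper: report whether a command can be found through PATH.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        # Assumption: pip --user placed the executable in ~/.local/bin.
        echo "$1 not found; try: export PATH=\"\$HOME/.local/bin:\$PATH\""
        return 1
    fi
}

check_on_path kaggle || true   # only reports, never aborts the shell
```

If the command is reported as not found, adding the suggested export line to ~/.bashrc and opening a new terminal is one common way to make it permanent.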
Step 2 - Setup API Credentials
Navigate to the Account page of Kaggle at https://www.kaggle.com/<USER_NAME>/account. Go to the “API” section and select “Create New API Token”. This will trigger the download of kaggle.json, a file that contains your API credentials. The JSON is a single line that holds your username and API key.
Create a directory named .kaggle in your home directory (~), and place kaggle.json in that directory.
mkdir ~/.kaggle
mv kaggle.json ~/.kaggle
You can verify that the JSON was saved correctly by printing it using the cat command:
cat ~/.kaggle/kaggle.json
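Printing the file also prints your secret key to the terminal. If you would rather not display it, a rough alternative is to check only that the expected fields are present. This sketch assumes the token contains fields named username and key; has_kaggle_fields is a hypothetical helper, and it does not validate the JSON itself.

```shell
# Hypothetical helper: succeed if the file mentions both expected field names.
# Assumption: the token is JSON with "username" and "key" fields.
has_kaggle_fields() {
    grep -q '"username"' "$1" && grep -q '"key"' "$1"
}

cred="$HOME/.kaggle/kaggle.json"
if [ -f "$cred" ] && has_kaggle_fields "$cred"; then
    echo "credentials file looks complete"
else
    echo "credentials file is missing or incomplete: $cred"
fi
```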
For safety, edit the file permissions to ensure that other users cannot read this file. You can use the chmod command to change the permissions:
chmod 600 ~/.kaggle/kaggle.json
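To confirm the change took effect, the mode can be read back. A small sketch, assuming GNU stat as shipped with Ubuntu (the -c '%a' format is not portable to every Unix); is_private is a hypothetical helper:

```shell
# Hypothetical helper: succeed only if the file mode is exactly 600.
is_private() {
    mode=$(stat -c '%a' "$1" 2>/dev/null) || return 1
    [ "$mode" = "600" ]
}

cred="$HOME/.kaggle/kaggle.json"
if is_private "$cred"; then
    echo "only the owner can read or write $cred"
else
    echo "missing file or loose permissions; run: chmod 600 $cred"
fi
```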
Step 3 - Download Dataset
Now, you can download the dataset using Kaggle’s kaggle datasets download command. Navigate to the directory that you want to download the dataset to. Then, take the USER_NAME and the DATASET_NAME that you noted in the Prerequisites section of this tutorial, and paste them into the template below:
kaggle datasets download <USER_NAME>/<DATASET_NAME>
For example, if your dataset was uploaded by the user Cornell-University under the name arxiv, you should execute the following line:
kaggle datasets download Cornell-University/arxiv
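If you run this command often, the template can be wrapped in a small function that checks the argument has the <USER_NAME>/<DATASET_NAME> shape before calling the CLI. This is a sketch, not part of the Kaggle tooling: is_dataset_slug and download_dataset are hypothetical names, and the -p option (target directory) is used on the assumption that your installed version of the CLI supports it.

```shell
# Hypothetical helper: accept only "<USER_NAME>/<DATASET_NAME>".
is_dataset_slug() {
    case "$1" in
        */*/*|/*|*/) return 1 ;;   # too many parts, or an empty part
        */*)         return 0 ;;   # exactly one slash, text on both sides
        *)           return 1 ;;   # no slash at all
    esac
}

# Hypothetical wrapper: validate the argument, then call the Kaggle CLI.
download_dataset() {
    if ! is_dataset_slug "$1"; then
        echo "expected <USER_NAME>/<DATASET_NAME>, got '$1'" >&2
        return 2
    fi
    # Assumption: -p selects the target directory (default: current one).
    kaggle datasets download "$1" -p "${2:-.}"
}

# Usage (needs the credentials from Step 2 and a network connection):
# download_dataset Cornell-University/arxiv ~/data
```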
The Kaggle API will display a progress bar and start downloading the dataset. Depending on the dataset size and your internet connection, the download can take anywhere from a few seconds to a few hours.
Downloading arxiv.zip to ~
100%|██████████████████████████████████████████████████████████████| 877M/877M [00:28<00:00, 32.4MB/s]
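The download is a zip archive (arxiv.zip in the output above), so it has to be extracted before use. One way, relying only on the Python 3 from the prerequisites, is the command-line interface of the standard library's zipfile module. In this sketch extract_archive is a hypothetical helper and the target directory name arxiv is an arbitrary choice.

```shell
# Hypothetical helper: extract a zip archive into a target directory
# using Python's built-in zipfile module (-e <archive> <output_dir>).
extract_archive() {
    python3 -m zipfile -e "$1" "$2"
}

if [ -f arxiv.zip ]; then
    extract_archive arxiv.zip arxiv
    ls arxiv                      # list the extracted files
else
    echo "arxiv.zip not found in $(pwd)"
fi
```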
Now that you have the dataset downloaded, you have many options to explore the data. Try using Jupyter Notebook with Pandas for exploratory data analysis (EDA).