- Version: 1.0.0
- Released: 2025/12/05
- Author(s): Leah Everitt (Health Sciences Library and Informatics Center, University of New Mexico)
- Contributor(s): Bryan Gee (UT Libraries, University of Texas at Austin)
- License: MIT
- README last updated: 2026/03/12
This repository contains helper scripts and templates for working with Python-DVUploader (a Python client for the Dataverse API). It was developed by Leah Everitt (University of New Mexico) as part of the Data Services Continuing Professional Education program with support from Bryan Gee and Michael Shensky (UT Austin). This repository's functions include:
- Creating test files for upload validation.
- Printing directory structure before uploading.
- Generating a JSON config file consumed by uploader scripts.
- Uploading files or directories to a Dataverse dataset.
| File | Purpose |
|---|---|
| `create-fake-directory-files.py` | Generates 1,000 fake CSV files in a target directory for testing upload workflows. |
| `Print-Directories-oswalk.py` | Walks a directory and prints its subdirectory and file structure. Useful for previewing what will be uploaded. |
| `template-config-file-creator-Python-DVUploader.py` | Builds a `config.json` file containing a Dataverse target and a list of files to upload. |
| `template-DirectoryUpload-Python-DVUploader.py` | Uploads all files from a given directory to a Dataverse dataset using the `dvuploader` package. |
| `template-FileUpload-Python-DVUploader.py` | Uploads one file (with optional metadata) to a Dataverse dataset using the `dvuploader` package. |
| `template-Python-DVUploader-config.json` | Example `config.json` structure for use with the config-driven upload scripts. |
| `template-Python-DVUploader-oswalk.py` | Walks a directory, prints its structure, and uploads all found files to a Dataverse dataset. |
- Python 3.8+ is recommended.
Install the required package via pip:

```
python -m pip install dvuploader
```
⚠️ The scripts that upload to Dataverse require a valid Dataverse instance URL (e.g., `https://dataverse.tdl.org/`), an API token with upload permissions, and an existing dataset DOI (Digital Object Identifier).
All TDR users can obtain an API token through the web interface; details are available in the Texas Data Repository documentation. Tokens are valid for one year and should not be shared.
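Because tokens should not be shared (or committed to version control), one option is to supply the token through an environment variable instead of pasting it into a script. A minimal sketch; the variable name `DATAVERSE_API_TOKEN` is a convention assumed here, not something the scripts in this repository require:

```python
import os

# Look up the Dataverse API token from the environment rather than
# hardcoding it; the variable name DATAVERSE_API_TOKEN is illustrative.
api_token = os.environ.get("DATAVERSE_API_TOKEN", "")

if not api_token:
    print("DATAVERSE_API_TOKEN is not set; the upload scripts will need a token.")
else:
    # Print only a short prefix so the full token never appears in logs.
    print(f"Token loaded ({api_token[:4]}...)")
```

The same pattern works for `DV_URL` and `PID` if you prefer to keep all deployment-specific values out of the template scripts.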
Edit `create-fake-directory-files.py` and set:

- `output_directory` to the directory where you'd like the fake files created.

Then run:

```
python create-fake-directory-files.py
```

Edit `Print-Directories-oswalk.py` and set:

- `start_directory` to the directory you want to inspect.

Then run:

```
python Print-Directories-oswalk.py
```

Edit `template-config-file-creator-Python-DVUploader.py` and set:

- `start_directory` to the directory containing the files you want to upload.
- `persistent_id` to the dataset DOI (e.g., `doi:10.18738/T8/XXXXX`).
- `dataverse_url` to your Dataverse base URL.
- `api_token` to your Dataverse API token.

Then run:

```
python template-config-file-creator-Python-DVUploader.py
```

This will create a `config.json` file (in the current working directory) listing the files to upload.
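The config-creation step above can be sketched with the standard library alone. This hedged example (the function and variable names are assumptions for illustration, not the actual contents of `template-config-file-creator-Python-DVUploader.py`) walks a start directory and builds a dictionary in the `config.json` shape described at the end of this README:

```python
import json
import os
import tempfile

def build_config(start_directory, persistent_id, dataverse_url, api_token):
    """Collect every file under start_directory into a config dictionary."""
    files = []
    for root, _dirs, filenames in os.walk(start_directory):
        for name in sorted(filenames):
            files.append({"filepath": os.path.join(root, name)})
    return {
        "persistent_id": persistent_id,
        "dataverse_url": dataverse_url,
        "api_token": api_token,
        "files": files,
    }

# Demonstrate on a throwaway directory containing two small CSV files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("col1,col2\n1,2\n")
    config = build_config(tmp, "doi:10.18738/T8/XXXXX",
                          "https://dataverse.tdl.org/", "YOUR-API-TOKEN")
    print(json.dumps(config, indent=2))
```

Passing the resulting dictionary to `json.dump` with a real directory, DOI, URL, and token reproduces a `config.json` the upload scripts can consume.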
Edit `template-DirectoryUpload-Python-DVUploader.py` and set:

- The directory path inside `dv.add_directory(...)`.
- `DV_URL`, `API_TOKEN`, and `PID`.

Then run:

```
python template-DirectoryUpload-Python-DVUploader.py
```

Edit `template-FileUpload-Python-DVUploader.py` and set:

- `filepath` to the file you want to upload.
- Optional fields such as `tab_ingest`, `directory_label`, `description`, `mimetype`, `categories`, and `restrict`.
- `DV_URL`, `API_TOKEN`, and `PID`.

Then run:

```
python template-FileUpload-Python-DVUploader.py
```

Edit `template-Python-DVUploader-oswalk.py` and set:

- `start_directory` to the folder you want to traverse.
- `DV_URL`, `API_TOKEN`, and `PID`.

Then run:

```
python template-Python-DVUploader-oswalk.py
```

The config file used by these scripts supports the following fields:
- `persistent_id`: Dataverse dataset DOI (e.g., `doi:10.18738/T8/XXXXX`).
- `dataverse_url`: Base Dataverse URL (e.g., `https://dataverse.tdl.org/`).
- `api_token`: Your Dataverse API token.
- `files`: List of file objects to upload. Each file object supports:
  - `filepath` (required): Absolute or relative path to the file.
  - `description` (optional): A description for the uploaded file.
  - `mimetype` (optional): MIME type of the file (e.g., `text/csv`).
  - `categories` (optional): List of categories (e.g., `["Data"]`).
  - `restrict` (optional): Boolean (`true`/`false`) to restrict access to the file.
  - `tabIngest` (optional): Boolean indicating whether to enable tabular ingestion.
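As a concrete illustration of those fields, the sketch below builds one such config in Python and serializes it to JSON. The DOI, URL, token, and file path are placeholders; `template-Python-DVUploader-config.json` in this repository remains the authoritative example:

```python
import json

# An illustrative config matching the fields described above; the DOI,
# URL, token, and file path are placeholders, not working values.
config = {
    "persistent_id": "doi:10.18738/T8/XXXXX",
    "dataverse_url": "https://dataverse.tdl.org/",
    "api_token": "YOUR-API-TOKEN",
    "files": [
        {
            "filepath": "data/example.csv",
            "description": "Example tabular file",
            "mimetype": "text/csv",
            "categories": ["Data"],
            "restrict": False,
            "tabIngest": True,
        }
    ],
}

# json.dumps renders the Python booleans as JSON true/false.
print(json.dumps(config, indent=2))
```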
This repository is licensed under the MIT license.