TexasDigitalLibrary/python-dvuploader-templates-associated-scripts

 
 

Metadata

  • Version: 1.0.0
  • Released: 2025/12/05
  • Author(s): Leah Everitt (Health Sciences Library and Informatics Center, University of New Mexico)
  • Contributor(s): Bryan Gee (UT Libraries, University of Texas at Austin)
  • License: MIT
  • README last updated: 2026/03/12

Purpose

This repository contains helper scripts and templates for working with Python-DVUploader (a Python client for the Dataverse API). It was developed by Leah Everitt (University of New Mexico) as part of the Data Services Continuing Professional Education program with support from Bryan Gee and Michael Shensky (UT Austin). This repository's functions include:

  • Creating test files for upload validation.
  • Printing directory structure before uploading.
  • Generating a JSON config file consumed by uploader scripts.
  • Uploading files or directories to a Dataverse dataset.

Contents

| File | Purpose |
| --- | --- |
| create-fake-directory-files.py | Generates 1,000 fake CSV files in a target directory for testing upload workflows. |
| Print-Directories-oswalk.py | Walks a directory and prints its subdirectory and file structure. Useful for previewing what will be uploaded. |
| template-config-file-creator-Python-DVUploader.py | Builds a config.json file containing a Dataverse target and list of files to upload. |
| template-DirectoryUpload-Python-DVUploader.py | Uploads all files from a given directory to a Dataverse dataset using the dvuploader package. |
| template-FileUpload-Python-DVUploader.py | Uploads one file (with optional metadata) to a Dataverse dataset using the dvuploader package. |
| template-Python-DVUploader-config.json | Example config.json structure for use with dvuploader config-driven upload scripts. |
| template-Python-DVUploader-oswalk.py | Walks a directory, prints its structure, and uploads all found files to a Dataverse dataset. |

Requirements

Python

  • Python 3.8+ is recommended.

Dependencies

Install the required package via pip:

python -m pip install dvuploader
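
To confirm the setup before running any of the scripts, a quick check along the following lines may help. This is a minimal sketch and not part of the repository; the function name `environment_ready` is hypothetical.

```python
import importlib.util
import sys


def environment_ready(min_version=(3, 8), package="dvuploader"):
    """Return True if Python is new enough and the package can be imported."""
    version_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package is not installed.
    package_ok = importlib.util.find_spec(package) is not None
    return version_ok and package_ok


if __name__ == "__main__":
    print("Ready to upload:", environment_ready())
```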

⚠️ The scripts that upload to Dataverse require a valid Dataverse instance URL (for TDR, https://dataverse.tdl.org/), an API token with upload permissions, and the DOI (Digital Object Identifier) of an existing dataset.

API token

All Texas Data Repository (TDR) users can obtain an API token through the web interface. Details can be found here. Tokens are valid for 1 year and should not be shared.
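
Because tokens should not be shared, one option is to keep the token out of the script files and read it from an environment variable instead. The sketch below assumes a variable named `DATAVERSE_API_TOKEN`; that name and the `load_api_token` helper are hypothetical and are not used by the templates in this repository.

```python
import os


def load_api_token(env_var="DATAVERSE_API_TOKEN"):
    """Read the API token from an environment variable (hypothetical name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token.")
    return token
```

The returned value could then be assigned to api_token or API_TOKEN in the scripts described below.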

How to Use

1) (Optional) Generate sample files for testing

Edit create-fake-directory-files.py and set:

  • output_directory to the directory where you'd like the fake files created.

Then run:

python create-fake-directory-files.py
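
For orientation, generating test files of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the script's actual code: the file naming pattern, the column names, and the `create_fake_csv_files` function are hypothetical.

```python
import csv
from pathlib import Path


def create_fake_csv_files(output_directory, count=1000, rows=5):
    """Write `count` small CSV files into output_directory and return their paths."""
    out = Path(output_directory)
    out.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(1, count + 1):
        path = out / f"fake_file_{i:04d}.csv"  # hypothetical naming pattern
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", "value"])  # header row
            for r in range(rows):
                writer.writerow([r, i * r])  # arbitrary filler values
        created.append(path)
    return created
```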

2) Preview a directory before uploading

Edit Print-Directories-oswalk.py and set:

  • start_directory to the directory you want to inspect.

Then run:

python Print-Directories-oswalk.py
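
The general idea of previewing a directory with `os.walk` can be sketched as follows. This is a simplified illustration; the indentation style and the `format_directory_tree` function are assumptions and may differ from what the script prints.

```python
import os


def format_directory_tree(start_directory):
    """Return indented lines listing each directory followed by its files."""
    lines = []
    for current_dir, subdirs, files in os.walk(start_directory):
        subdirs.sort()  # sorting in place makes the walk order deterministic
        rel = os.path.relpath(current_dir, start_directory)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        indent = "    " * depth
        lines.append(f"{indent}{os.path.basename(os.path.abspath(current_dir))}/")
        for name in sorted(files):
            lines.append(f"{indent}    {name}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_directory_tree(".")))
```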

3) Create a config.json file for bulk uploads

Edit template-config-file-creator-Python-DVUploader.py and set:

  • start_directory to the directory containing the files you want to upload.
  • persistent_id to the dataset DOI (e.g., doi:10.18738/T8/XXXXX).
  • dataverse_url to your Dataverse base URL.
  • api_token to your Dataverse API token.

Then run:

python template-config-file-creator-Python-DVUploader.py

This will create a config.json file (in the current working directory) with the list of files to upload.
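
As a rough picture of what this step does, the sketch below walks a directory and writes a config with the four top-level fields listed under "Config file fields". The function names `build_config` and `write_config` are hypothetical, and the real script may record more per-file fields than `filepath`.

```python
import json
import os


def build_config(start_directory, persistent_id, dataverse_url, api_token):
    """Collect every file under start_directory into a config dictionary."""
    files = []
    for current_dir, subdirs, filenames in os.walk(start_directory):
        subdirs.sort()
        for name in sorted(filenames):
            files.append({"filepath": os.path.join(current_dir, name)})
    return {
        "persistent_id": persistent_id,
        "dataverse_url": dataverse_url,
        "api_token": api_token,
        "files": files,
    }


def write_config(config, path="config.json"):
    """Write the config dictionary to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return path
```

Note that a config.json written this way contains the API token in plain text, so it should be treated like the token itself.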

4) Upload a directory of files

Edit template-DirectoryUpload-Python-DVUploader.py and set:

  • The directory path inside dv.add_directory(...).
  • DV_URL, API_TOKEN, and PID.

Then run:

python template-DirectoryUpload-Python-DVUploader.py
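
A directory upload with dvuploader generally follows the pattern sketched below. The calls (`dv.add_directory`, `dv.DVUploader`, `upload`) follow the package's documented usage as far as we are aware, but treat the exact signatures as assumptions and check them against your installed version. The helper `find_unset_settings` and the wrapper `upload_directory` are hypothetical and not part of the template.

```python
def find_unset_settings(settings):
    """Return names of settings that are empty or still hold the XXXXX placeholder."""
    return [name for name, value in settings.items() if not value or "XXXXX" in value]


def upload_directory(directory, dv_url, api_token, pid, n_parallel_uploads=2):
    """Upload every file found in `directory` to the dataset identified by `pid`."""
    unset = find_unset_settings({"DV_URL": dv_url, "API_TOKEN": api_token, "PID": pid})
    if unset:
        raise ValueError(f"Please set: {', '.join(unset)}")

    # Imported here so the check above works even without dvuploader installed.
    import dvuploader as dv

    files = dv.add_directory(directory)  # assumed to return a list of dv.File objects
    uploader = dv.DVUploader(files=files)
    uploader.upload(
        api_token=api_token,
        dataverse_url=dv_url,
        persistent_id=pid,
        n_parallel_uploads=n_parallel_uploads,
    )
    return len(files)
```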

5) Upload a single file with custom metadata

Edit template-FileUpload-Python-DVUploader.py and set:

  • filepath to the file you want to upload.
  • Optional fields such as tab_ingest, directory_label, description, mimetype, categories, and restrict.
  • DV_URL, API_TOKEN, and PID.

Then run:

python template-FileUpload-Python-DVUploader.py
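
The single-file case can be sketched in a similar way. The keyword names passed to `dv.File` below are the ones listed above; whether your installed dvuploader version accepts each of them under exactly that name is an assumption worth verifying. The helpers `build_file_options` and `upload_single_file` are hypothetical.

```python
def build_file_options(filepath, **optional_fields):
    """Combine the required filepath with any optional fields that were set."""
    options = {"filepath": filepath}
    # Fields left as None are dropped so the package defaults apply.
    options.update({k: v for k, v in optional_fields.items() if v is not None})
    return options


def upload_single_file(options, dv_url, api_token, pid):
    """Upload one file described by `options` to the dataset identified by `pid`."""
    import dvuploader as dv  # imported lazily; requires `pip install dvuploader`

    uploader = dv.DVUploader(files=[dv.File(**options)])
    uploader.upload(api_token=api_token, dataverse_url=dv_url, persistent_id=pid)
```

For example, `build_file_options("data/results.csv", description="Results", restrict=False, directory_label=None)` yields a dictionary with only filepath, description, and restrict.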

6) Walk a directory and upload all found files

Edit template-Python-DVUploader-oswalk.py and set:

  • start_directory to the folder you want to traverse.
  • DV_URL, API_TOKEN, and PID.

Then run:

python template-Python-DVUploader-oswalk.py
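
When walking nested folders, it can be useful to keep each file's folder path relative to the start directory, for instance to use as a directory_label. The sketch below shows only that collection step using the standard library. The function `collect_files_with_labels` is hypothetical, and whether the template preserves folder structure this way is an assumption.

```python
import os


def collect_files_with_labels(start_directory):
    """Return (filepath, directory_label) pairs for every file under start_directory."""
    collected = []
    for current_dir, subdirs, filenames in os.walk(start_directory):
        subdirs.sort()
        rel = os.path.relpath(current_dir, start_directory)
        # Files in the top-level folder get an empty label; labels use forward slashes.
        label = "" if rel == "." else rel.replace(os.sep, "/")
        for name in sorted(filenames):
            collected.append((os.path.join(current_dir, name), label))
    return collected
```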

Config file fields

The config file used by these scripts supports the following fields:

  • persistent_id: Dataverse dataset DOI (e.g., doi:10.18738/T8/XXXXX).
  • dataverse_url: Base Dataverse URL (https://dataverse.tdl.org/).
  • api_token: Your Dataverse API token.
  • files: List of file objects to upload.
    • filepath (required): Absolute or relative path to the file.
    • description (optional): A description for the uploaded file.
    • mimetype (optional): MIME type of the file (e.g., text/csv).
    • categories (optional): List of categories (e.g., ["Data"]).
    • restrict (optional): Boolean (true/false) to restrict access to the file.
    • tabIngest (optional): Boolean indicating whether to enable tabular ingestion.
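
Before uploading, a config can be checked against the fields listed above. The following is a minimal sketch of such a check; `validate_config` is a hypothetical helper and is not part of dvuploader or of this repository.

```python
REQUIRED_TOP_LEVEL = ("persistent_id", "dataverse_url", "api_token", "files")
OPTIONAL_FILE_FIELDS = {
    "description": str,
    "mimetype": str,
    "categories": list,
    "restrict": bool,
    "tabIngest": bool,
}


def validate_config(config):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in config:
            problems.append(f"missing top-level field: {key}")
    for index, entry in enumerate(config.get("files", [])):
        if not entry.get("filepath"):
            problems.append(f"files[{index}]: missing filepath")
        for field, expected in OPTIONAL_FILE_FIELDS.items():
            if field in entry and not isinstance(entry[field], expected):
                problems.append(f"files[{index}]: {field} should be {expected.__name__}")
    return problems
```

A config loaded with `json.load` can be passed in directly, since JSON true/false values become Python booleans.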

License

This repository is licensed under the MIT license.

About

This repository contains Python script templates and associated scripts to accompany the Python-DVUploader package, created for the Texas Data Repository.
