Uploading Files Using the Acquia DAM API

Goal

Upload files via the Acquia DAM API.

Overview

Getting assets into Acquia DAM programmatically can save time, effort, and cost when creating assets from files in external systems. While DAM already connects to many systems via Native Integrations, you may be using a home-grown system or one that is not yet supported.

In this article’s example, we will discuss code that runs in response to a trigger (e.g., a user interaction in a UI or an automated flow based on criteria in an external system). Remember, the API is just one way to get assets into DAM; it’s important to consider all of your options.

Notes about Sample Code

The sample code is written in Python using the Requests library. However, these examples can be easily translated into the language and library you are using. When debugging, it’s helpful to use tools like cURL, Postman, or Insomnia to construct your HTTP calls and inspect responses.
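
For example, when a call fails, printing the status code and raw response body before raising often reveals the API’s error message. A purely illustrative pattern, reusing the session-start endpoint from Step 3:

import requests

response = requests.post(
    'https://api.widencollective.com/v2/uploads/chunks/start',
    headers={'Authorization': 'Bearer YOUR_TOKEN_HERE'}
)
if not response.ok:
    # Surface the API's error message before failing
    print(response.status_code, response.text)
response.raise_for_status()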

  1. Setting up our Authorization Header

    Using the Acquia DAM API Documentation, we will set up our authorization header using a bearer token:

    # Step 1: Setting up our Authorization Header
    AUTH_TOKEN = 'YOUR_TOKEN_HERE'
    auth = {
        'Authorization': f"Bearer {AUTH_TOKEN}"
    }
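
    In practice, avoid hardcoding the token in source. Here is a minimal sketch reading it from an environment variable instead (the variable name DAM_API_TOKEN is our own choice):

    import os

    # DAM_API_TOKEN is an assumed name; set it via your secret-management tooling
    AUTH_TOKEN = os.environ['DAM_API_TOKEN']
    auth = {
        'Authorization': f"Bearer {AUTH_TOKEN}"
    }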
  2. Determining the Upload Method

    In many cases, the standard Creating New Assets endpoint will suffice. For large files, we may need the Upload File in Chunks workflow instead. There are a few things to consider when deciding whether to upload in chunks:

    • Smaller files will upload faster using the regular endpoint
    • Larger files will upload faster with the chunked endpoints, especially if chunks are uploaded in parallel
    • If connection speed or reliability is a concern, uploading in chunks is beneficial, since those uploads can be paused and resumed per chunk; if the connection drops while using the regular endpoint, the entire file must be uploaded again
    • Implementing the chunked workflow introduces additional complexity into your program

    To handle the chunked workflow, we extract its logic into a separate function, defined in Step 3, and call that function from the main workflow shown here. In this snippet, files larger than 200MB are routed to the “Upload File in Chunks” endpoints, while smaller files use the standard “Creating New Assets” endpoint.

    # Step 2: Determining the Upload Method
    # Note: Using the os library and a local file for example purposes
    FILE_SIZE_CUTOFF = 200_000_000 # 200MB
    
    file_path = 'MY_FILE_PATH'
    file_size = os.path.getsize(file_path)
    form_body = {}
    
    with open(file_path, 'rb') as stream:
        if file_size > FILE_SIZE_CUTOFF:
            form_body['file_id'] = handle_large_file(stream, auth)
        else:
            form_body['file'] = stream
  3. Large File Workflow

    Think of the Large File workflow as work that must be done in addition to the normal workflow, not instead of it. In other words, we must upload the large file first and then create the asset, whereas for smaller files we can upload the file and create the asset in a single call. Each chunk must be at least 5MB (except for the final chunk) and no larger than 100MB. For this example, we will use 50MB chunks.
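
    As a quick sanity check, here is a short sketch that verifies our chunk size against those bounds and computes how many chunks a given file needs (the 1.28GB file size is an arbitrary illustration):

    import math

    MIN_CHUNK_SIZE = 5_000_000    # 5MB minimum (all chunks except the final one)
    MAX_CHUNK_SIZE = 100_000_000  # 100MB maximum
    CHUNK_SIZE = 50_000_000       # 50MB, our choice for this example

    assert MIN_CHUNK_SIZE <= CHUNK_SIZE <= MAX_CHUNK_SIZE

    # e.g., a 1.28GB file becomes 25 full 50MB chunks plus one 30MB final chunk
    file_size = 1_280_000_000
    num_chunks = math.ceil(file_size / CHUNK_SIZE)
    print(num_chunks)  # 26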

    Relevant API Endpoints:

    • POST /v2/uploads/chunks/start
    • POST /v2/uploads/chunks/upload
    • POST /v2/uploads/chunks/complete

    Note that this feature is not enabled by default for all customers. If you receive a 'Chunked uploads are not currently available for this account' error response, please contact support via the Acquia DAM Community to discuss your use case.
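
    A minimal sketch of surfacing that error, assuming you simply want to log the API’s message before failing:

    try:
        response = requests.post(
            'https://api.widencollective.com/v2/uploads/chunks/start',
            headers=auth
        )
        response.raise_for_status()
    except requests.HTTPError:
        # The response body explains why the call was rejected, e.g.,
        # chunked uploads not being enabled for the account
        print(response.status_code, response.text)
        raise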

    # Step 3: Large File Workflow
    CHUNK_SIZE = 50_000_000 # 50MB
    
    # Input a stream and an authorization header
    # Returns the File ID
    def handle_large_file(file_stream, auth):
        
        # Start the Chunked Upload
        response = requests.post(
            'https://api.widencollective.com/v2/uploads/chunks/start',
            headers=auth
        )
        response.raise_for_status()
    
        session_id = response.json()['session_id']
        chunks = []
        chunk_num = 1
        
        # Upload chunks
        while (chunk := file_stream.read(CHUNK_SIZE)):
            form_body = {
                'session_id': session_id,
                'chunk_number': chunk_num,
                'file': chunk
            }
    
            response = requests.post(
                'https://api.widencollective.com/v2/uploads/chunks/upload',
                headers=auth,
                files=form_body # Sending as multipart/form-data, not JSON!
            )
            response.raise_for_status()
            chunks.append(response.json()['tag'])
            chunk_num += 1
    
        # Conclude the Chunked Upload and return the File ID
        response = requests.post(
            'https://api.widencollective.com/v2/uploads/chunks/complete',
            headers=auth,
            json={
                'session_id': session_id,
                'tags': chunks
            }
        )
        response.raise_for_status()
    
        return response.json()['file_id']
  4. Creating the Asset

    Continuing in our main function, we set up the call that actually creates the asset and map any metadata we want to pass. One important detail: this request must be sent as multipart/form-data, not as JSON.

        # Step 4: Creating the Asset
        form_body['profile'] = 'My Upload Profile'
        form_body['filename'] = 'My_Filename.ext'

        response = requests.post(
            'https://api.widencollective.com/v2/uploads',
            headers=auth,
            files=form_body # Sending as multipart/form-data, not JSON!
        )
        response.raise_for_status()
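
    A successful response describes the newly created asset. As a minimal follow-up (the exact fields depend on the response schema, so we simply print the parsed body):

        # Inspect what the API returned for the newly created asset
        print(response.json())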

Implementation Notes

The real power of the “Upload File in Chunks” endpoints is unlocked when your program uploads chunks in parallel, so files finish uploading faster and become available to DAM users sooner. For simplicity, the code example only shows how to upload chunks sequentially. To parallelize, replace the loop below the “# Upload chunks” comment with code that issues concurrent requests using your preferred library, as in the sketch below.
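
Here is a minimal sketch of that approach using the standard-library concurrent.futures module. To be clear about assumptions: the helper names are our own; auth, CHUNK_SIZE, and requests come from the earlier steps; it presumes the API accepts chunks in any order as long as each carries the correct chunk_number and the tags list passed to the complete call stays in chunk order; and it reads all chunks into memory up front, trading memory for parallelism.

from concurrent.futures import ThreadPoolExecutor

# Upload a single chunk and return its tag (helper name is our own)
def upload_one_chunk(session_id, chunk_num, chunk):
    response = requests.post(
        'https://api.widencollective.com/v2/uploads/chunks/upload',
        headers=auth,
        files={
            'session_id': session_id,
            'chunk_number': str(chunk_num),  # sent as a text form field
            'file': chunk
        }
    )
    response.raise_for_status()
    return response.json()['tag']

# Drop-in replacement for the sequential "# Upload chunks" loop
def upload_chunks_parallel(file_stream, session_id, max_workers=4):
    # Read and number every chunk up front (memory-heavy for very large files)
    numbered = []
    chunk_num = 1
    while (chunk := file_stream.read(CHUNK_SIZE)):
        numbered.append((chunk_num, chunk))
        chunk_num += 1

    # executor.map preserves input order, so tags come back in chunk order
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(
            lambda pair: upload_one_chunk(session_id, *pair),
            numbered
        ))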

Handling large file uploads can take time, so you may want to consider implementing the file uploads as a background task, freeing the user to complete other work while the file uploads. However, the exact way you would implement this depends on the structure of your application. If you do decide to go this route, you should validate your user’s metadata and upload profile input prior to sending the task to the background.
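
How you hand the work off depends on your framework, but as one illustration, here is a sketch using a long-lived thread pool. The names validate_inputs and upload_file are hypothetical stand-ins for your own validation logic and the upload workflow above:

from concurrent.futures import ThreadPoolExecutor

background = ThreadPoolExecutor(max_workers=2)

def start_background_upload(file_path, profile, filename):
    # Hypothetical helper: check the upload profile and metadata while the
    # user is still present to correct mistakes
    validate_inputs(profile, filename)

    # Hypothetical helper wrapping Steps 2-4; runs off the request thread
    future = background.submit(upload_file, file_path, profile, filename)
    return future  # poll or add a done-callback to report completion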

Full Worked Example

Below is the full worked example.

import requests
import os # Using local file for example purposes only

# Constants and Environment Variables
AUTH_TOKEN = 'YOUR_TOKEN_HERE'
FILE_SIZE_CUTOFF = 200_000_000 # 200MB
CHUNK_SIZE = 50_000_000 # 50MB

# Step 3: Large File Workflow
# Input a stream and an authorization header
# Returns the File ID
def handle_large_file(file_stream, auth):
    
    # Start the Chunked Upload
    response = requests.post(
        'https://api.widencollective.com/v2/uploads/chunks/start',
        headers=auth
    )
    response.raise_for_status()

    session_id = response.json()['session_id']
    chunks = []
    chunk_num = 1
    
    # Upload chunks
    while (chunk := file_stream.read(CHUNK_SIZE)):
        form_body = {
            'session_id': session_id,
            'chunk_number': chunk_num,
            'file': chunk
        }

        response = requests.post(
            'https://api.widencollective.com/v2/uploads/chunks/upload',
            headers=auth,
            files=form_body # Sending as multipart/form-data, not JSON!
        )
        response.raise_for_status()
        chunks.append(response.json()['tag'])
        chunk_num += 1

    # Conclude the Chunked Upload and return the File ID
    response = requests.post(
        'https://api.widencollective.com/v2/uploads/chunks/complete',
        headers=auth,
        json={
            'session_id': session_id,
            'tags': chunks
        }
    )
    response.raise_for_status()

    return response.json()['file_id']

# Step 1: Setting up our Authorization Header
auth = {
    'Authorization': f"Bearer {AUTH_TOKEN}"
}

# Step 2: Determining the Upload Method
# Using the os library and a local file for example purposes only
file_path = 'MY_FILE_PATH'
file_size = os.path.getsize(file_path)
form_body = {}

with open(file_path, 'rb') as stream:
    if file_size > FILE_SIZE_CUTOFF:
        form_body['file_id'] = handle_large_file(stream, auth)
    else:
        form_body['file'] = stream

    # Step 4: Creating the Asset
    form_body['profile'] = 'My Upload Profile'
    form_body['filename'] = 'My_Filename.ext'

    response = requests.post(
        'https://api.widencollective.com/v2/uploads',
        headers=auth,
        files=form_body # Sending as multipart/form-data, not JSON!
    )
    response.raise_for_status()