Using an OAuth2 token with the GCP Storage API in Node.js

You can use short-lived OAuth2 tokens in many Google Cloud API calls. However, you may run into some interesting limitations and quirks, specifically in Google’s Node.js Storage API, for instance when you push files to or pull files from Cloud Storage buckets in your Node.js application.

Let’s start with a code snippet that illustrates how you can inject the OAuth2 token into your Storage API requests.

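import { Storage } from '@google-cloud/storage';

// Uploads localFile to remoteFile in the given bucket. When accessToken is
// provided, it is injected as a Bearer token through a request interceptor;
// otherwise the client falls back to its default credentials.
const pushFileToGCP = async (bucketName, localFile, remoteFile, accessToken) => {
    if (accessToken) {
        // Interceptor that adds the Authorization header to outgoing requests.
        const interceptors = [];
        interceptors.push({
            request: (requestConfig) => {
                requestConfig.headers = requestConfig.headers || {};
                Object.assign(requestConfig.headers, {Authorization: 'Bearer ' + accessToken});
                return requestConfig;
            }
        });
        const storage = new Storage({ interceptors_: interceptors });
        // resumable: false sends the file in one single HTTP request.
        await storage.bucket(bucketName).upload(localFile, {
            destination: remoteFile,
            resumable: false
        });
    } else {
        // No token supplied: rely on the default identity of the environment.
        const storage = new Storage();
        await storage.bucket(bucketName).upload(localFile, {
            destination: remoteFile
        });
    }
};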

There are a few things going on in the above snippet, and there are a couple of GCP quirks that you need to be aware of.
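To look at the interceptor on its own, it can be pulled out into a small helper and run against a plain object, without the Storage client. This is only a sketch: the name makeBearerInterceptor and the sample request object are hypothetical and are not part of Google's package. It only mirrors the shape of the interceptor used in the snippet above.

```javascript
// Hypothetical helper (not part of @google-cloud/storage): builds an
// interceptor object with the same shape as the one in the snippet above.
const makeBearerInterceptor = (accessToken) => ({
    request: (requestConfig) => {
        // Keep any headers that are already set on the outgoing request.
        requestConfig.headers = requestConfig.headers || {};
        requestConfig.headers.Authorization = 'Bearer ' + accessToken;
        return requestConfig;
    }
});

// A plain object stands in for the request configuration here.
const sampleRequest = makeBearerInterceptor('example-token').request({
    uri: 'https://storage.googleapis.com/'
});
console.log(sampleRequest.headers.Authorization); // prints "Bearer example-token"
```

The resulting object could then be passed in the interceptors array exactly as in the original snippet.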

You may notice the “resumable” flag in the code. By default, Google’s Node.js Storage package uploads larger files in smaller chunks by making several “push” requests. The problem is that when the package makes the first HTTP request to upload the first file chunk, it correctly includes the Authorization token. However, for some reason it does not include that token in the subsequent HTTP requests for the additional file chunks. Whether this is deliberate or a bug, I am not sure. Pushing the file in one single non-resumable HTTP request seems to work.

POST https://storage.googleapis.com/mybucket/
Request Headers:
  Authorization: Bearer ya29...6kUBpA

One side effect of this single push is that there seems to be a 2GB limit on a single HTTP request on Google’s end, which means files larger than that are simply cut off at this limit. Instead of seeing your 5GB file in the bucket, you will see only the first 2GB.
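One way to avoid a silently truncated file is to check the file size before attempting the single-request upload. The sketch below uses only Node's built-in fs module. The helper names and the exact limit value are assumptions: the 2GB figure comes from the behaviour observed above, not from a documented constant.

```javascript
const fs = require('node:fs');

// Assumption based on the observation above: a single non-resumable request
// appears to be capped at roughly 2GB. Treat this value as a guess.
const SINGLE_REQUEST_LIMIT_BYTES = 2 * 1024 * 1024 * 1024;

// Hypothetical helper: true when the file size is at or below the limit.
const fitsInSingleRequest = (localFile, limitBytes = SINGLE_REQUEST_LIMIT_BYTES) => {
    const { size } = fs.statSync(localFile);
    return size <= limitBytes;
};

// Hypothetical guard: throws instead of letting an oversized file be cut off.
const assertFitsInSingleRequest = (localFile, limitBytes = SINGLE_REQUEST_LIMIT_BYTES) => {
    if (!fitsInSingleRequest(localFile, limitBytes)) {
        throw new Error(
            `${localFile} is larger than ${limitBytes} bytes; a non-resumable upload may be truncated`
        );
    }
};
```

Calling assertFitsInSingleRequest(localFile) right before the upload with resumable set to false would turn the truncation into an explicit error.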

Another interesting quirk is that Identity seems to behave differently when you run the code inside a GCP Compute Engine instance than when you run it anywhere outside of GCP, for instance on-prem or on a different cloud vendor.

If you run the code outside of GCP, the OAuth2 token works as expected. When you run the code on a GCP Compute Engine instance, the OAuth2 token is simply ignored and the default Identity logic applies. That identity is determined either from the GOOGLE_APPLICATION_CREDENTIALS environment variable, if specified, or from the Service Account that is associated with the Compute Engine instance itself. In the latter case, that Service Account must have access to the bucket that you are pushing the file into. If the Compute Engine instance’s Service Account does not have the proper permissions on the bucket, you may see a message similar to this:

A Not Found error was returned while attempting to retrieve an accesstoken for the Compute Engine built-in service account. This may be because the Compute Engine instance does not have any permission scopes specified: Could not refresh access token: Unsuccessful response status code. Request failed with status code 404
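The identity rules described above can be written down as a small decision function, which may help when debugging which identity a request ends up using. This is a sketch of the observed behaviour only: the function name, its parameters and the returned labels are all hypothetical, and it does not call any Google library.

```javascript
// Hypothetical helper that mirrors the behaviour observed above. It only
// inspects its arguments and makes no network or library calls.
const describeExpectedIdentity = ({ onComputeEngine, accessToken, env = process.env }) => {
    if (!onComputeEngine) {
        // Outside of GCP the injected OAuth2 token was honoured.
        return accessToken ? 'oauth2-token' : 'default-credentials';
    }
    // On Compute Engine the injected token appeared to be ignored.
    return env.GOOGLE_APPLICATION_CREDENTIALS
        ? 'key-file-from-env'
        : 'attached-service-account';
};

console.log(describeExpectedIdentity({ onComputeEngine: true, accessToken: 'example-token', env: {} }));
// prints "attached-service-account"
```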
