I have a web server with one endpoint deployed on Heroku. The server's purpose is to interact with the Vertex AI gemini-pro LLM. Locally everything works perfectly because I am using gcloud CLI auth, but I want to implement service account auth.
Here is the code sample of model init:
import { VertexAI } from "@google-cloud/vertexai";
import { GoogleAuth } from "google-auth-library";
import dotenv from "dotenv";
dotenv.config();
const gAuth = new GoogleAuth({
  credentials: {
    client_email: process.env.CLIENT_EMAIL,
    private_key: process.env.PRIVATE_KEY,
  },
});
const authClient = await gAuth.getClient();
const vertex_ai = new VertexAI({
  project: process.env.PROJECT_ID,
  location: process.env.LOCATION,
  googleAuth: authClient,
});
const model = "gemini-pro";
const generativeModel = vertex_ai.preview.getGenerativeModel({
  model: model,
  generation_config: {
    max_output_tokens: 8192,
    temperature: 0.8,
    top_p: 0.8,
    top_k: 5,
  },
});
export default generativeModel;
I checked the service account's role - it's "Editor" - so the problem must be in the model init. Every time I try to reach an endpoint I get the error:
2024-02-05T21:13:23.977555+00:00 app[web.1]: GoogleAuthError:
2024-02-05T21:13:23.977557+00:00 app[web.1]: Unable to authenticate your request
2024-02-05T21:13:23.977576+00:00 app[web.1]: Depending on your run time environment, you can get authentication by
2024-02-05T21:13:23.977576+00:00 app[web.1]: - if in local instance or cloud shell: `!gcloud auth login`
2024-02-05T21:13:23.977576+00:00 app[web.1]: - if in Colab:
2024-02-05T21:13:23.977577+00:00 app[web.1]: -`from google.colab import auth`
2024-02-05T21:13:23.977577+00:00 app[web.1]: -`auth.authenticate_user()`
2024-02-05T21:13:23.977578+00:00 app[web.1]: - if in service account or other: please follow guidance in https://cloud.google.com/docs/authentication
2024-02-05T21:13:23.977579+00:00 app[web.1]: Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
UPD: I found a solution by migrating to API key auth with the @google/generative-ai SDK:
import { GoogleGenerativeAI } from "@google/generative-ai";
import dotenv from "dotenv";
dotenv.config();
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = "gemini-pro";
const generativeModel = genAI.getGenerativeModel({
  model: model,
  // @google/generative-ai expects camelCase keys, not the snake_case
  // ones used by the Vertex AI SDK
  generationConfig: {
    maxOutputTokens: 8192,
    temperature: 0.8,
    topP: 0.8,
    topK: 5,
  },
});
export default generativeModel;
This is an example of how you can use Gemini through Vertex AI with a service account. The credentials can be downloaded from your Google Cloud account as a .json key file; take the client_email and private_key values from that file and put them in your .env file so they stay out of source control. Below is a quick example of how to use it.
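One Heroku-specific gotcha worth checking alongside this: when the private_key from the .json file is stored in a config var, the literal \n sequences often survive as two characters instead of real newlines, and auth can then fail even with correct credentials. A minimal sketch of cleaning the key before passing it in (normalizePrivateKey is a hypothetical helper name; verify the behaviour against your own config vars):

```javascript
// Service-account keys copied from the .json file into Heroku config vars
// often end up with literal "\n" two-character sequences instead of real
// newlines, which makes the auth library reject the private key.
// normalizePrivateKey (a hypothetical helper) restores the newlines.
function normalizePrivateKey(rawKey) {
  return rawKey.replace(/\\n/g, "\n");
}

// Pass the cleaned credentials to the SDK via googleAuthOptions
// (env variable names are the ones from the question).
const authOptions = {
  credentials: {
    client_email: process.env.CLIENT_EMAIL,
    private_key: normalizePrivateKey(process.env.PRIVATE_KEY || ""),
  },
};
```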
https://www.npmjs.com/package/@google-cloud/vertexai
The package supports:
- Gemini Pro: content generation (non-streaming), streaming content generation, streaming chat
- Gemini Pro Vision: a Google Cloud Storage image URI, a base64 image string, multi-part content with text and video
Example:
const {
  VertexAI,
  HarmCategory,
  HarmBlockThreshold,
} = require('@google-cloud/vertexai');

const PROJECT_ID = '....';
const LOCATION = 'us-central1';
const GEMINI_PRO_MODEL_NAME = 'gemini-pro';
const GEMINI_PRO_VISION_MODEL_NAME = 'gemini-pro-vision';

const authOptions = {
  credentials: {
    client_email: '....',
    private_key: '.....',
  },
};

const vertexAiOptions = {
  project: PROJECT_ID,
  location: LOCATION,
  googleAuthOptions: authOptions,
};
const vertex_ai = new VertexAI(vertexAiOptions);

const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
  },
];

// generativeModel: Gemini Pro
const generativeModel = vertex_ai.preview.getGenerativeModel({
  model: GEMINI_PRO_MODEL_NAME,
  safety_settings: safetySettings,
  generation_config: {max_output_tokens: 256},
});

// generativeVisionModel: Gemini Pro Vision
const generativeVisionModel = vertex_ai.preview.getGenerativeModel({
  model: GEMINI_PRO_VISION_MODEL_NAME,
});

// Content generation (non-streaming): Gemini Pro
async function nonStreaming() {
  const request = {
    contents: [{role: 'user', parts: [{text: 'who is james bond?'}]}],
  };
  try {
    const resp = await generativeModel.generateContent(request);
    const data = await resp.response;
    const textValue = data.candidates[0].content.parts[0].text;
    console.log('nonStreaming', textValue);
  } catch (error) {
    console.error('An error occurred during content generation:', error);
  }
}

nonStreaming();
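The feature list above also mentions streaming. A minimal streaming sketch against the same generativeModel (generateContentStream, the stream iterator, and the aggregated response follow the package README for this preview version, so verify against the version you have installed; buildTextRequest is just a hypothetical helper mirroring the request shape used in nonStreaming):

```javascript
// Builds the same request shape used by nonStreaming() above.
function buildTextRequest(text) {
  return {contents: [{role: 'user', parts: [{text}]}]};
}

// Streaming content generation: Gemini Pro
async function streaming() {
  const request = buildTextRequest('who is james bond?');
  try {
    const streamingResp = await generativeModel.generateContentStream(request);
    // Print each chunk as it arrives...
    for await (const item of streamingResp.stream) {
      process.stdout.write(item.candidates[0].content.parts[0].text);
    }
    // ...then the aggregated response is available once the stream ends.
    const aggregated = await streamingResp.response;
    console.log('\nstreaming done, candidates:', aggregated.candidates.length);
  } catch (error) {
    console.error('An error occurred during streaming generation:', error);
  }
}

streaming();
```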