Send Requests to the Deployed LLM
What you'll learn
- Send a request to a deployed model via the API
Step 1: Navigate to the Deployment Page of the Model
Click on Deployments and, under Services, find your deployed model. Opening it brings you to the deployment dashboard.

Step 2: Send API Request
To send an API request to the deployed LLM, go to the OpenAPI section, click on Text Generation Inference, and then on Generate Tokens.

A sample request body is already populated. Click on Send API Request to send a POST request to the deployed model.
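Assuming the deployment exposes the standard Text Generation Inference /generate endpoint, the response shown in the dashboard is a JSON object with a generated_text field, along these lines (the completion text here is purely illustrative):

{
  "generated_text": " am happy to meet you."
}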

Copy the corresponding Python code for the request.

Here is the generated code snippet:
import requests

# Endpoint of the deployed model; replace {your-org-domain} with your organization's domain
url = "https://llama-2-7b-llm-demo.{your-org-domain}/generate"

payload = {
    "inputs": "My name is Olivier and I",
    "parameters": {
        "max_new_tokens": 50,        # maximum number of tokens to generate
        "repetition_penalty": 1.03,  # penalize repeated tokens
        "return_full_text": False,   # return only the completion, not the prompt
        "temperature": 0.5,          # sampling temperature; lower is more deterministic
        "top_k": 10,                 # sample only from the 10 most likely tokens
        "top_p": 0.95,               # nucleus sampling threshold
    },
}

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}

# Send the POST request to the /generate endpoint and print the JSON response
response = requests.post(url, json=payload, headers=headers)
print(response.json())
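In practice, you may also want to fail loudly on HTTP errors and extract just the completion. Here is a minimal sketch building on the snippet above; it assumes the standard Text Generation Inference response shape (a JSON object with a generated_text field) and uses the same placeholder URL, which you must replace with your deployment's domain:

import requests

url = "https://llama-2-7b-llm-demo.{your-org-domain}/generate"
payload = {
    "inputs": "My name is Olivier and I",
    "parameters": {"max_new_tokens": 50, "temperature": 0.5},
}
headers = {"Content-Type": "application/json", "Accept": "application/json"}

response = requests.post(url, json=payload, headers=headers, timeout=60)
response.raise_for_status()  # raise an exception on 4xx/5xx responses

result = response.json()
print(result["generated_text"])  # just the completion, since return_full_text defaults to the prompt being excluded here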