Azure OpenAI includes several types of model:
- GPT-4 models are the latest generation of generative pretrained transformer (GPT) models that can generate natural language and code completions based on natural language prompts.
- GPT-3.5 models can generate natural language and code completions based on natural language prompts. In particular, GPT-35-turbo models are optimized for chat-based interactions and work well in most generative AI scenarios.
- Embeddings models convert text into numeric vectors, and are useful in language analytics scenarios such as comparing text sources for similarities.
- DALL-E models are used to generate images based on natural language prompts. Currently, DALL-E models are in preview. DALL-E models aren't listed in the Azure OpenAI Studio interface and don't need to be explicitly deployed.
Create an Azure OpenAI Service resource with the Azure CLI:
az cognitiveservices account create \
    -n MyOpenAIResource \
    -g OAIResourceGroup \
    -l eastus \
    --kind OpenAI \
    --sku s0 \
    --subscription subscriptionID
Deploy a model using the Azure CLI:
az cognitiveservices account deployment create \
    -g OAIResourceGroup \
    -n MyOpenAIResource \
    --deployment-name MyModel \
    --model-name gpt-35-turbo \
    --model-version "0301" \
    --model-format OpenAI \
    --sku-name "Standard" \
    --sku-capacity 1
Prompt types:
Prompts can be grouped into types of requests based on task.
| Task type | Prompt example | Completion example |
| --- | --- | --- |
| Classifying content | Tweet: I enjoyed the trip. Sentiment: | Positive |
| Generating new content | List ways of traveling | 1. Bike 2. Car ... |
| Holding a conversation | A friendly AI assistant | See examples |
| Transformation (translation and symbol conversion) | English: Hello French: | bonjour |
| Summarizing content | Provide a summary of the content {text} | The content shares methods of machine learning. |
| Picking up where you left off | One way to grow tomatoes | is to plant seeds. |
| Giving factual responses | How many moons does Earth have? | One |
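The classification row in the table above follows a common few-shot pattern: labeled examples, then the new input ending where the model should continue. A minimal sketch of assembling such a prompt (the example tweets and labels here are hypothetical, chosen only for illustration):

```python
# Illustrative sketch: assembling a few-shot classification prompt
# like the "Classifying content" row above. Examples are made up.
examples = [
    ("I enjoyed the trip.", "Positive"),
    ("The hotel was dirty.", "Negative"),
]

def build_classification_prompt(tweet, examples):
    """Build a few-shot prompt: labeled examples, then the new input."""
    lines = []
    for text, label in examples:
        lines.append(f"Tweet: {text}\nSentiment: {label}")
    # End with the unlabeled input so the model completes the label
    lines.append(f"Tweet: {tweet}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_classification_prompt("The food was amazing!", examples)
print(prompt)
```

Ending the prompt at `Sentiment:` nudges the model to complete with just the label.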
Completions Playground parameters:
There are many parameters that you can adjust to change the performance of your model:
- Temperature: Controls randomness. Lowering the temperature means that the model produces more repetitive and deterministic responses. Increasing the temperature results in more unexpected or creative responses. Try adjusting temperature or Top P but not both.
- Max length (tokens): Set a limit on the number of tokens per model response. The API supports a maximum of 4000 tokens shared between the prompt (including system message, examples, message history, and user query) and the model's response. One token is roughly four characters for typical English text.
- Stop sequences: Make responses stop at a desired point, such as the end of a sentence or list. Specify up to four sequences where the model will stop generating further tokens in a response. The returned text won't contain the stop sequence.
- Top probabilities (Top P): Similar to temperature, this controls randomness but uses a different method. Lowering Top P narrows the model’s token selection to likelier tokens. Increasing Top P lets the model choose from tokens with both high and low likelihood. Try adjusting temperature or Top P but not both.
- Frequency penalty: Reduce the chance of repeating a token proportionally based on how often it has appeared in the text so far. This decreases the likelihood of repeating the exact same text in a response.
- Presence penalty: Reduce the chance of repeating any token that has appeared in the text at all so far. This increases the likelihood of introducing new topics in a response.
- Pre-response text: Insert text after the user’s input and before the model’s response. This can help prepare the model for a response.
- Post-response text: Insert text after the model’s generated response to encourage further user input, as when modeling a conversation.
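The "one token is roughly four characters" heuristic above can be used for a rough budget check before sending a prompt. A sketch, assuming the 4000-token shared limit stated above (a real count requires the model's tokenizer, e.g. the tiktoken library):

```python
# Rough sketch of the "one token ~ four characters" heuristic from the
# notes above. Real token counts require the model's actual tokenizer.
MAX_TOKENS = 4000  # shared between the prompt and the model's response

def estimate_tokens(text):
    """Very rough token estimate for typical English text."""
    return max(1, len(text) // 4)

prompt = "You are a helpful AI bot. What is Azure OpenAI?"
prompt_tokens = estimate_tokens(prompt)
# Tokens left for the response under the shared limit
response_budget = MAX_TOKENS - prompt_tokens
print(prompt_tokens, response_budget)
```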
Available endpoints:
- Completion - the model takes an input prompt and generates one or more predicted completions. You'll see this playground in the studio, but it isn't covered in depth in this module.
- ChatCompletion - the model takes input in the form of a chat conversation (where roles are specified with the message they send), and the next chat completion is generated.
- Embeddings - the model takes input and returns a vector representation of that input.
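As noted earlier, embedding vectors are useful for comparing text sources for similarity, typically via cosine similarity. A sketch with toy 3-dimensional vectors (real embeddings models return vectors with hundreds or thousands of dimensions; the values below are made up for illustration):

```python
import math

# Sketch: comparing embedding vectors with cosine similarity.
# The 3-d vectors here are hypothetical toy values.
def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.2, 0.1]       # hypothetical embedding of "cat"
kitten = [0.85, 0.25, 0.1]  # hypothetical embedding of "kitten"
car = [0.1, 0.9, 0.3]       # hypothetical embedding of "car"

print(cosine_similarity(cat, kitten))  # high: similar meaning
print(cosine_similarity(cat, car))     # lower: different meaning
```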
Use Azure OpenAI REST API:
curl "https://eastus.api.cognitive.microsoft.com/openai/deployments/MyModel/chat/completions?api-version=2024-02-15-preview" \
  -H "Content-Type: application/json" \
  -H "api-key: <YOUR_API_KEY>" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an AI assistant that helps people find information."},
      {"role": "user", "content": "hi what is today"}
    ],
    "max_tokens": 800,
    "temperature": 0.7,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "top_p": 0.95,
    "stop": null
  }'
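The same request body can be built programmatically. A sketch that only constructs and prints the JSON payload from the curl example (it doesn't call the API, so no endpoint or key is needed):

```python
import json

# Sketch: the ChatCompletion request body from the curl example above,
# built as a Python dict. This only prints the JSON; it makes no API call.
body = {
    "messages": [
        {"role": "system", "content": "You are an AI assistant that helps people find information."},
        {"role": "user", "content": "hi what is today"},
    ],
    "max_tokens": 800,
    "temperature": 0.7,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "top_p": 0.95,
    "stop": None,  # serialized as JSON null
}
print(json.dumps(body, indent=2))
```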
Use the Azure OpenAI SDK (C#):
//install-package Azure.AI.OpenAI -prerelease
//dotnet add package Azure.AI.OpenAI --prerelease
using System;
using Azure;
using Azure.AI.OpenAI;

namespace ConsoleApp2
{
    internal class Program
    {
        static void Main(string[] args)
        {
            string endpoint = "<YOUR_ENDPOINT_NAME>";       // for example, https://eastus.api.cognitive.microsoft.com/
            string key = "<YOUR_API_KEY>";
            string deploymentName = "<YOUR_DEPLOYMENT_NAME>"; // for example, MyModel

            OpenAIClient client = new OpenAIClient(new Uri(endpoint),
                new AzureKeyCredential(key));

            // Build the chat completion options object
            ChatCompletionsOptions chatCompletionsOptions =
                new ChatCompletionsOptions()
                {
                    Messages =
                    {
                        new ChatRequestSystemMessage("You are a helpful AI bot."),
                        new ChatRequestUserMessage("What is Azure OpenAI?"),
                    },
                    DeploymentName = deploymentName
                };

            // Send the request to the Azure OpenAI model
            ChatCompletions response =
                client.GetChatCompletions(chatCompletionsOptions);

            // Print the response
            string completion = response.Choices[0].Message.Content;
            Console.WriteLine("Response: " + completion + "\n");
            Console.ReadKey();
        }
    }
}
Semantic Kernel SDK in C#:
Semantic Kernel integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Developers can create "plugins" to interface with the LLMs and perform all sorts of tasks. The Semantic Kernel SDK also provides built-in plugins that quickly enhance an application. Developers can easily apply AI models in their own applications without having to learn the intricacies of each model's API.
//dotnet add package Azure.AI.OpenAI --prerelease
//dotnet add package Microsoft.SemanticKernel --version 1.2.0
//install-package Azure.AI.OpenAI -prerelease
//install-package Microsoft.SemanticKernel -version 1.2.0
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

namespace ConsoleApp1
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            var builder = Kernel.CreateBuilder();
            builder.Services.AddAzureOpenAIChatCompletion(
                "<YOUR_DEPLOYMENT_NAME>",   // for example, gpt35turbo16kdemo
                "<YOUR_ENDPOINT>",          // for example, https://rg1openailab.openai.azure.com/
                "<YOUR_API_KEY>",
                "gpt-35-turbo-16k");
            var kernel = builder.Build();

            // Dev/lab only: disables TLS certificate validation; don't use in production
            System.Net.ServicePointManager.ServerCertificateValidationCallback =
                (senderX, certificate, chain, sslPolicyErrors) => { return true; };

            var result = await kernel.InvokePromptAsync(
                "Give me a list of breakfast foods");
            Console.WriteLine(result);
            Console.ReadLine();
        }
    }
}
Semantic Kernel SDK in C# - Built-in plugins:
The Semantic Kernel SDK offers an extra package with predefined plugins for common tasks. These are available in the Plugins.Core package:
- ConversationSummaryPlugin - Summarizes conversation
- FileIOPlugin - Reads and writes to the filesystem
- HttpPlugin - Makes requests to HTTP endpoints
- MathPlugin - Performs mathematical operations
- TextPlugin - Performs text manipulation
- TimePlugin - Gets time and date information
- WaitPlugin - Pauses execution for a specified amount of time
//dotnet add package Azure.AI.OpenAI --prerelease
//install-package Azure.AI.OpenAI -prerelease
//dotnet add package Microsoft.SemanticKernel --version 1.2.0
//install-package Microsoft.SemanticKernel -version 1.2.0
//dotnet add package Microsoft.SemanticKernel.Plugins.Core --version 1.2.0-alpha
//install-package Microsoft.SemanticKernel.Plugins.Core -version 1.2.0-alpha
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Plugins.Core;

namespace ConsoleApp1
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            var builder = Kernel.CreateBuilder();
            builder.Services.AddAzureOpenAIChatCompletion(
                "<YOUR_DEPLOYMENT_NAME>",   // for example, gpt35turbo16kdemo
                "<YOUR_ENDPOINT>",          // for example, https://rg1openailab.openai.azure.com/
                "<YOUR_API_KEY>",
                "gpt-35-turbo-16k");
            builder.Plugins.AddFromType<TimePlugin>();
            var kernel = builder.Build();

            // Invoke a built-in plugin function directly (no model call needed)
            var currentDay = await kernel.InvokeAsync("TimePlugin",
                "DayOfWeek");
            Console.WriteLine(currentDay);

            // Dev/lab only: disables TLS certificate validation; don't use in production
            System.Net.ServicePointManager.ServerCertificateValidationCallback =
                (senderX, certificate, chain, sslPolicyErrors) => { return true; };

            var result = await kernel.InvokePromptAsync(
                "Give me a list of breakfast foods");
            Console.WriteLine(result);
            Console.ReadLine();
        }
    }
}