The rise of prompt engineering

I have said this before – with the advent of large AI models, Prompt Engineering is critical and is the next challenge for us to master.

What is Prompt engineering?

Prompt engineering is the process of fine-tuning large models and often are written in natural language, outlining the intention of the user. Prompt engineering is a key element that allows the output to be accurate and reflect the needs of the user. Prompts should not be thought as the explicit one input to the model, instead are multiple tasks for the model.

We use large language models (#LLM) such as #GPT3, or #Text2Image models like #DALLE and #StableFusion using a text prompt. The prompt is a string and is our way to ask the model to do what it is meant to. It also is our way to provide hints and directions on what you need and ultimately help the model understand the patterns that are important for us and be represented in the output.

The way we write a prompt is important – including the phrases, orders of the words, hints, etc. Prompts also need to be in the context of the use-case (see screen shot below on GPT3 use case examples). For example, language generation prompts would be different from code generation or summarization, or image generation. The prompts are closely tied to the intended use cases.

Screenshot showing GPT3 example use cases.
GPT3 use case examples

Examples of prompt engineering

We start out with a couple of examples related to language generation. I figured what better way to show prompt engineering, by asking GPT3 about prompt engineering. 😇

In this first screenshot below, we use GPT3’s davinci model and ask a paragraph on prompt engineering. The first sentence is the prompt that was the input, and the text with the green background is what was generated.

Screenshot of the generated output of a GPT3 model
GPT3 screen shot showing a paragraph prompt

And in this second example, it is mostly the same prompt but we ask for a blog post instead of a paragraph. As we can see the output of course is quite different, but the essence of it is still quite the same.

Screenshot of the generated output of a GPT3 model
GPT3 screen shot showing a blog post prompt

And finally another example, same as before, but in this case we outline that be for a 5 year old child (ignoring the fact would a 5 year old understand the notion of AI, and models 😶).

Screenshot of the generated output of a GPT3 model
GPT3 screen shot showing a paragraph prompt for a child to understand

Even though the changes might seem subtle in the examples shown earlier – consider them as toy examples.

Small changes to the prompt can lead to significant changes on the output. To show an example, below are two examples #StableDiffusion – which is a open source image-to-text model. I used Harry Potter for inspiration and use Hogwarts and the dark forest where the first graders were forbidden to go.

For the first prompt example: a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, Diya Lamp architecture, atmospheric, sense of awe and scale

And for the second example, the prompts was: a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, atmospheric, sense of awe and scale.

The only difference between the two prompts was removing “Diya Lamp architecture“, resulting in dramatically different outputs. I am guessing this being image generation, the changes are more dramatic and easier to comprehend.

Prompts also are not universal and are very dependent to the models being used – what is considered a good example in one model (from one institution), won’t transpose to another model from another institution. For example the same prompt as above (a beautiful view of hogwarts school of witchcraft and wizardry and the dark forest, by Laurie Lipton, Impressionist Mosaic, atmospheric, sense of awe and scale), when used for OpenAI’s DALLE model generates the image shown below – which is very different of course.

And if I want to tweak the same prompt specifically for DALLE here is another example using the prompt: Beautiful view of Hogwarts school of witchcraft and wizardry and the dark forest with a sense of awe and scale, Awesome, Highly Detailed.

As a side note, I particularly like this one:

This also has created a number of tools that allow us to craft prompts. Given many of us don’t quite understand the options, and styles that can go in there. Some like promptoMANIA can cover multiple large models (images in this case) and can get very sophisticated themselves. And other simpler ones like this DALLE prompt generator by Adam Brown, and more like prompts.ai allow for tweaking and fine-tuning of prompts and effectively creating templates for GPT3.

Prompt engineering is a brand new and fascinating space for the industry and I for one am quite intrigued to see where it will lead us.

Nuget packages not found after installing Visual Studio 2022

I recently needed to install Visual Studio 2022 on one my existing machines to debug a new zeroshot model that has a dependency on our Speech SDK. The Speech SDK is one of our key #AI services in Cognitive Services (as part of #AzureAI). I already had VSCode running, but in this case I need the bigger brother.

After installing Visual Studio, I could not get any nuget packages to install; I could not even fetch anything and didn’t matter what I used – the package manager console in Visual Studio, PowerShell, etc.

Nuget would only point to the local offline package store, which in my case is available at: C:\Program Files (x86)\Microsoft SDKs\NuGetPackages\. I kept getting the error: Package X is not found on source 'C:\Program Files (x86)\Microsoft SDKs\NuGetPackages\'

I could not understand the behavior, and not seen this before. Turns out it is a known issue, albeit not very common. Seems like if either PowerShell’s Install-Module, or Chocolatey’s coco command was used before using nuget or Visual Studio for the first time, this will happen.

The solution is to add nuget.org as a package source, and the URL it should point to is: https://api.nuget.org/v3/index.json. Below is the screenshot on what this looks in Visual Studio for me:

Screenshot showing updated nutget package sources in Visual Studio 2022
Updated nuget package sources

If one is setting up a brand new dev box, the odds of you running into this is low; in my case given I was adding Visual Studio 2022 much later is what caused this. There is more details on this here too.

Podman error on Ubuntu – short-name did not resolve to an alias and no unqualified-search registries

I recently installed Ubuntu on one of the Pi’s are home and installed Podman – which I hadn’t heard of until recently and is a container engine, similar to docker but doesn’t have a daemon.

When trying to get a basic alpine test image running I got this error:

Error: error creating build container: short-name "python:3.7-alpine" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
Podam-compose error

This is because, shortnames it seems arent resolved by default – atleast not on the the Ubuntu (ARM) version. To fix this, the following needs to be added to the /etc/containers/registries.conf file:

unqualified-search-registries=["docker.io"]
Updating the registries.conf file

And once you save, these trying podcam-compose up should work as expected.

podman-compose up

How to run TeslaMate on Azure

If you have a Tesla, then you should absolutely check out TeslaMate which is data logger for your car(s) that one self-hosts. This uses the car’s API and gets all different kinds of telemetry of your drives, charging, batter conditions, acceleration, braking, parking, etc. I personally prefer this, over other online services (of which there are a few) – as it is giving away the keys to the kingdom – literally in this case (the Tokens used to authenticate and login).

I have been running TeslaMate at home on a couple of machines for a while and figured a cloud version would work out better. I had network issues on one of the machines, where no car telemetry was downloaded. It was a few days before I realized that the machine wasn’t online due to a separate DNS issue and those few days’ worth of car telemetry (drives and other data of course) wasn’t recorded.

In our example, we will deploy TeslaMate in a docker container running on Ubuntu – which is hosted on Azure. To help with isolation and managing this, I would recommend we use a resource group (RG) only for running TeslaMate. Of course, we need an Azure subscription, which I would assume you already have.

If you are not familiar with TeslaMate, before we get started here, I would highly recommend checking out the features, including some screenshots and the installation documentation to get an idea.

Web Interface
TeslaMate Overview (Credit: TeslaMate GitHub repro)

Step 1 – Creating new RG

We start by logging into the Azure portal and create a new resource group (RG) for TeslaMate; if you are not sure how to do this, the documentation here outlines the steps needed. Once you have an empty RG, it would look something like the screenshot below.

New Azure Resource Group

Step 2 – Creating new Ubuntu VM

Now that we have a new RG, we need to create a new Ubuntu virtual machine (VM) in that. We will choose the option to create resources as shown in the middle of the screen (see previous screenshot).

Clicking on “Create resources”, we see various menu options; the options you see might be a little different than the one shown below.

Create a resource – Azure

We need to create a Virtual Machine – the first choice under “Popular Azure services” and will click the “Create” link. This starts a wizard that allows you to go through the various settings and options.

The first step when creating a VM is to start with the basic details for machine we are creating – instance details, subscription details, admin user details, etc. I outline the steps and show screenshots to help those who are not comfortable with this level of tech, or new to Azure. If you are a more advanced user, a more efficient way would be via the Azure CLI. You can read up more details on VM’s on Azure here.

Step 3 – VM Basics

  • Subscription and resource group – Make sure you have the correct subscription and RG selected. If you haven’t created a new RG yet, you can do so using the “Create new” link under the RG option (see the screenshot below).
  • VM Name – You can give the VM any name – this is more for you to remember and manage.
  • Region – In terms of a region, in most cases it would make sense to pick a region that is physically close to the same area where you are based (and the car too of course).
  • Image – I use the latest Ubuntu LTS image which as of this writing is v20.04 Gen 2.
Create a new VM in Azure
  • Size – In terms of picking the size for the VM – we don’t need a very beefy machine, and needless to say – the bigger the machine, the higher the monthly costs. I keep the standard Size. This is not my main instance as I already have that running – this new instance is being setup as a demo that I will be deleting later.
  • Username – This is obvious and should be something you know and can remember.
  • Password – I choose password as the auth type, more so as this is for demo purposes for this post; ideally ssh keys are more secure and you would want to use that. If you do go down a password path, I cannot stress enough not to reuse passwords and create a strong password; it is always a good idea to use a password manager (e.g., I use LastPass).
Basic details required when setting up a new VM

I chose the simple SSD option; we don’t need a lot of advanced things.

Disk details when setting up a new VM

For the network options, you do want a public IP and, in most cases, just leaving the default would work. And I don’t show it in the screenshot, but we don’t need a load balancer and leave the default option of “None”. And we do want the ability to ssh into the machine to deploy and manage TeslaMate.

Networking details when setting up a new VM

For the Inbound port rules, by default only port 22 is enabled for SSH; to allow us to access the web server we also need to both ports 80 (http) and 443 (https) are enabled as shown in the screenshot below.

Inbound port rules – when setting up a new VM

For the next set of Tabs (Management, Advanced, and Tags) I didn’t change anything and went with the defaults. Once the validations are passed, and the final review shows the cost and other details you choose.

VM Creation – summary
VM Creation – summary
VM Creation – summary

And once you are happy with everything click the Create button on the bottom left corner.

VM Creation confirmation

Once the deployment of the VM starts, it can take a few minutes and you will see a similar progress as shown below.

VM deployment status screen

Once the VM is created, deployed and wired up (which can take a few minutes) – we will see the confirmation as shown below.

VM deployment confirmation

From the confirmation screen, clicking on “Go to resource” takes us to a screen where we see the different details of the VM. One of the details we are interested in at this point is the IP address and the ability to give the machine a DNS name. We need these to be able to connect to the VM over SSH (see screenshot below).

VM essential details

It might be worthwhile to also setup a DNS name that one can use in addition to the IP. This DNS name would be the fully qualified domain name (FQDN) that would be needed later when configuring the docker container. The DNS name allows us to connect to the machine using something like “https://my-car-details.cloudapp.azure.com (or similar). You can read more details on FQDN in the Azure docs here. If you are interested in using your own DNS server, you can read details on how to go about that here.

Click on the “Not Configured” for the DNS name (as shown in the image below) and you can set a unique name that is something memorable.

VM DNS name

The DNS name is tied to the region you have, and it must be unique.

VM DNS name label creating

And once this is setup, you can see the FQDN in your VM details as shown below.

VM DNS name

If for some reason you didn’t open ports 80 and 443 earlier, you can always configure them now. To do so, in the Azure portal, when you have the Ubuntu VM resource selected, click on Networking on the left, and you can update the Inbound port rules.

VM networking setting
VM adding inbound network rules

When you add both ports (you would need to give them unique names and priority orders), and the final results would look something like the screenshot shown below.

VM network inbound port rules

And finally, we can ssh into that machine using the credentials and the IP we configured earlier. This can be done using ssh (e.g. ssh user-name@IP-address of the machine).

Step 4 – Install Docker

The first thing we need to do once we ssh into the machine is to update the various packages installed. The first time you run this, it will take a few minutes. You do this by running the following commands.

# I prefer to run these separately - to get a handle on what is getting updated.
sudo apt update
sudo apt upgrade

# You can of course run them together if that is what you prefer
sudo apt-get update

This is pretty standard and should not cause any issues; below is the screenshot showing the output – there are too many packages being updated for me to show everything.

Installing docker on our Ubuntu VM isn’t terribly complex – the docker docs outline all the steps and the details. We will want to install from the repro and follow the steps outlined and be mindful of specific versions and drivers.

We setup the repository, and for that need to install the following prerequisites.

sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

In my case, with the latest Ubuntu image in Azure, we already had these:

Next we add docker’s GPG key using the following command:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

The output isn’t dramatic in case you were wondering.

Next we add docker’s repro to Ubuntu – this will allow us to find and install the packages.

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

At this time, we should run apt-get update command to update the newly added repository. We should check to ensure that docker is going to be installed from the docker repository, and not Ubuntu’s default. To do this we run the following command.

apt-cache policy docker-ce

This shows us that docker isn’t installed, but the candidate for installation is from docker.com and is for “focal” – which is the release name of Ubuntu v20.04. The list we see is long because it outlines the different versions of docker.

Now, we are finally ready to install docker using the following command and also choosing Yes on the prompts that confirm the installation.

sudo apt-get install docker-ce docker-ce-cli containerd.io

Once complete, you will see something like the output shown below.

At this time, docker should be installed running. We can also check its daemon is configured to run on booting up.

Whilst not needed, it is good practice to add the current username to the docker group created – this will ensure we don’t need to use “sudo” for every docker command. And using the groups command we can validate our current username (“amit” in my case) is in the docker group.

sudo usermod -aG docker ${USER}
su - ${USER} # this allows us to add the user without logging out

Woot! We have docker running.

The first thing one should do with any new docker installation is to run its equivalent of Hello World. This is done using the following command – which downloads a test image and runs it in a container, prints a message, and then exists the container – so a full life cycle.

sudo docker run hello-world

And yay, we validated that docker is up and running on our VM! Congratulations.

Before we get to configuring TeslaMate, we also need to install docker-compose, which is a tool that allows us to run multi-container docker applications (such as TeslaMate). We will install docker-compose using the following command with the result of that command shown after that.

sudo apt install docker-compose

Step 5 – Configure TeslaMate

Given we will be exposing TeslaMate to the internet directly we should not use the default TeslaMate docker installation, but the advanced version which uses Traefik as a proxy server and helps us secure the web server better and only expose the (Grafana) dashboards behind an authentication mechanism.

For this we will create a new folder for TeslaMate which will contain not only the docker compose file needed but other relevant configuration details. I like to keep this in a folder, to help manage – in this case it resides in ~/docker/teslamate

It is in this folder we will create the docker-compose yaml file that is needed; you would want to start with the one outlined in the TeslaMate instructions and tweak it for your needs.

This file needs to be called docker-compose.yml and my example is shared below. It is a good idea to always get the latest yaml file from TeslaMate’s docs – over time we would expect things will evolve and the file below might not be accurate down the road.

version: "3"

services:
  teslamate:
    image: teslamate/teslamate:latest
    restart: always
    depends_on:
      - database
    environment:
      - DATABASE_USER=${TM_DB_USER}
      - DATABASE_PASS=${TM_DB_PASS}
      - DATABASE_NAME=${TM_DB_NAME}
      - DATABASE_HOST=database
      - MQTT_HOST=mosquitto
      - VIRTUAL_HOST=${FQDN_TM}
      - CHECK_ORIGIN=true
      - TZ=${TM_TZ}
    volumes:
      - ./import:/opt/app/import
    labels:
      - "traefik.enable=true"
      - "traefik.port=4000"
      - "traefik.http.middlewares.redirect.redirectscheme.scheme=https"
      - "traefik.http.middlewares.teslamate-auth.basicauth.realm=teslamate"
      - "traefik.http.middlewares.teslamate-auth.basicauth.usersfile=/auth/.htpasswd"
      - "traefik.http.routers.teslamate-insecure.rule=Host(`${FQDN_TM}`)"
      - "traefik.http.routers.teslamate-insecure.middlewares=redirect"
      - "traefik.http.routers.teslamate-ws.rule=Host(`${FQDN_TM}`) && Path(`/live/websocket`)"
      - "traefik.http.routers.teslamate-ws.entrypoints=websecure"
      - "traefik.http.routers.teslamate-ws.tls"
      - "traefik.http.routers.teslamate.rule=Host(`${FQDN_TM}`)"
      - "traefik.http.routers.teslamate.middlewares=teslamate-auth"
      - "traefik.http.routers.teslamate.entrypoints=websecure"
      - "traefik.http.routers.teslamate.tls.certresolver=tmhttpchallenge"
    cap_drop:
      - all

  database:
    image: postgres:13
    restart: always
    environment:
      - POSTGRES_USER=${TM_DB_USER}
      - POSTGRES_PASSWORD=${TM_DB_PASS}
      - POSTGRES_DB=${TM_DB_NAME}
    volumes:
      - teslamate-db:/var/lib/postgresql/data

  grafana:
    image: teslamate/grafana:latest
    restart: always
    environment:
      - DATABASE_USER=${TM_DB_USER}
      - DATABASE_PASS=${TM_DB_PASS}
      - DATABASE_NAME=${TM_DB_NAME}
      - DATABASE_HOST=database
      - GRAFANA_PASSWD=${GRAFANA_PW}
      - GF_SECURITY_ADMIN_USER=${GRAFANA_USER}
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PW}
      - GF_AUTH_ANONYMOUS_ENABLED=false
      - GF_SERVER_DOMAIN=${FQDN_TM}
      - GF_SERVER_ROOT_URL=%(protocol)s://%(domain)s/grafana
      - GF_SERVER_SERVE_FROM_SUB_PATH=true

    volumes:
      - teslamate-grafana-data:/var/lib/grafana
    labels:
      - "traefik.enable=true"
      - "traefik.port=3000"
      - "traefik.http.middlewares.redirect.redirectscheme.scheme=https"
      - "traefik.http.routers.grafana-insecure.rule=Host(`${FQDN_TM}`)"
      - "traefik.http.routers.grafana-insecure.middlewares=redirect"
      - "traefik.http.routers.grafana.rule=Path(`/grafana`) || PathPrefix(`/grafana/`)"
      - "traefik.http.routers.grafana.entrypoints=websecure"
      - "traefik.http.routers.grafana.tls.certresolver=tmhttpchallenge"

  mosquitto:
    image: eclipse-mosquitto:2
    restart: always
    command: mosquitto -c /mosquitto-no-auth.conf
    ports:
      - 127.0.0.1:1883:1883
    volumes:
      - mosquitto-conf:/mosquitto/config
      - mosquitto-data:/mosquitto/data

  proxy:
    image: traefik:v2.4
    restart: always
    command:
      - "--global.sendAnonymousUsage=false"
      - "--providers.docker"
      - "--providers.docker.exposedByDefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.tmhttpchallenge.acme.httpchallenge=true"
      - "--certificatesresolvers.tmhttpchallenge.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.tmhttpchallenge.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.tmhttpchallenge.acme.storage=/etc/acme/acme.json"
    ports:
      - 80:80
      - 443:443
    volumes:
      - ./.htpasswd:/auth/.htpasswd
      - ./acme/:/etc/acme/
      - /var/run/docker.sock:/var/run/docker.sock:ro

volumes:
  teslamate-db:
  teslamate-grafana-data:
  mosquitto-conf:
  mosquitto-data:
Screenshot showing the docker-compose.yml file

Next we need to create a .env file. The environmental secrets (i.e. passwords) are not saved in the yaml file but are stored are stored in this .env file which we will create next.

We will enter the DNS name as the FQDN that you setup earlier for the VM; update the TM_TZ for the time-zone you are based out of. This is the TZ database name, and if you aren’t sure what it should be for your time-zone, check out the details here.

Like the yaml file, you should get the latest .env file template from TeslaMate, as the one shown below might change over time.

TM_DB_USER=teslamate
TM_DB_PASS=secret
TM_DB_NAME=teslamate

GRAFANA_USER=admin
GRAFANA_PW=admin

FQDN_TM=teslamate.example.com

TM_TZ=Europe/Berlin

LETSENCRYPT_EMAIL=yourperson@example.com

If you are not sure on how to create a new file in Ubuntu (or any other Linux distro for that matter) – you can use nano editor as shown below. You need to make sure this is in the same folder as where the docker-compose.yml file is (which is ~/docker/teslamate in our example).

Console screenshot showing how to create .env file

And finally, we need to create a .htpasswd file which is used to authenticate the website (see TeslaMate’s documentation for more details). I chose to create this locally after installing Apache tools, but you can also do it online. Note: this is *not* your Tesla login credentials but are the credentials you will use to access the site we are setting up now.

Console screenshot showing installation of Apache Utils

We can create a new file password as shown below. Given we are in the TeslaMate folder, we don’t have to provide a full path for the file.

htpasswd -c .htpasswd amit
Screenshot showing htpasswd usage

So, in the end we should have the following three files in the same folder:

Screenshot showing director listing

Step 6 – Starting TeslaMate

Now we are finally ready to start the docker container for TeslaMate. When this launches, go to the URL for the DNS name you setup, and login using your Tesla credentials. For the first time, I would recommend running the container attached to the console, so if there are any errors or issues you can see them. Normally you would want to run this detached (which is using the “-d“) option.

# don't forget the sudo command
sudo docker-compose up

The first time you run this, it will take a few minutes to pull all the images, and wire things up. During the process you will see the progress for each image in the various container app.

Console showing docker-compose progress

And finally, if everything is setup properly you should see the container running with the logs in the console of your terminal. This is a running log, and the process is active. You will see something like the screenshot below.

Console showing docker-compose logs

If you didn’t setup a DNS name earlier and thought you can try and use the IP name – that unfortunately will fail with the Traefik proxy server and in the logs, you will see an error to that effect. Let’s Encrypt doesn’t issue certificates for IP addresses as a policy.

Now if we browse the URL (also known as the FQDN) you had setup earlier, you will see an authentication challenge. This is great and shows that the proxy server is setup properly and working as expected.

Traefik http authentication

Once you enter the credentials you setup in the .htpasswd file earlier you will be able to login and see the TeslaMate’s Tesla login!

TeslaMate Tesla authentication screen

Congratulations! You have TeslaMate running on Azure.

Step 7 – Finishing up TeslaMate configuration

Now that you have TeslaMate running, you need to login to Tesla. The best way to do this these days is using existing tokens . There are a few ways to do this, and one of the easiest is using this tool – Tesla API Token.

Once you login, go to Settings and change the Dashboards URL – which would be your FQDN with “/grafana” appended. Remember the credentials you use for the dashboards (i.e. Grafana) are the ones you set in the .env file.

TeslaMate URL configuration

Now that everything is up and running, we can kill the docker-compose process, which is attached to the console, and re-run it detached from the console. To do this, go back to the ssh session we have connected to the Ubuntu VM and press CTRL + C. This will stop that container and you will see a similar output as shown below.

Console output

And now you can restart the container with the “-d” option, which is for detached.

sudo docker-compose up -d
docker-compose output

Congratulations, you have TeslaMate running on an Azure host Ubuntu VM via docker.

Below is a screenshot of my instance that has been running for some time.

TeslaMate Overview Dashboard

AI writing AI code🤐

It is 2021. And we have #AI writing #AI code. 🤪 It is quite interesting, but also can be quite boring once you get beyond the initial technology, and just think of it as one of the tools in your arsenal. And getting to that point is a good think.

As part of a think at work I recently started playing with GitHub Copilot, which is using GPT3 to be your pair programmer — helping write code. GPT3 has multiple models (called engines), and Copilot uses one of these family of engines called Codex. Codex is a derivative of the base GPT3 engine that is trained on billions of lines of code.

Using Copilot is quite simple; you install the Github Copilot extension, and it shows up in your IDE (VSCode in my example). We need to make sure we decompose the problem we are trying to solve – we should not think of this as helping write the complete program or all parts; but as it can help with different functions and pieces of code. To do this, we need to tell it what we are trying to do – these are done via prompts (code comments). For GPT models, prompt engineering is quite critical, and would be worth getting to details and understanding.

Starting simple, I create an empty python file and entered a prompt that outlines what I want to try and do. In this case as you can see in the screenshot below – I want to load an image from a file, and using our Vision Cognitive Services, run an image analysis, and auto-generate a caption for that image.

I started typing the definition of a function, and Copilot (via the add-in) understands the prompt I outlined, and the context of the code on what I am doing. Remember Codex builds on the base GPT3 and does have all that NLU capability.

Taking all of this in, it suggests completing the function for me. In terms of using this as an end-user (i.e. the developer) – the suggested code shows up as auto-complete and you can see it in the grey color. If I like that suggestion, I press tab and have it added to the file.

In this case you can see how it is reading the file from disk, calling a function called get_caption() and printing the caption to the stdout (console in this example).

There is also an option to cycle through different suggestions and then pick another one as shown in the screenshot below.

This variant of the suggested code is creating a function called image_caption() which takes the path to the image file to load. This also expects other required things for the Vision cognitive service to work – such as the subscription key to authenticate, the API end-point details to call, etc.

Typically, Copilot can synthesize up to 10 code options (Copilot calls these as Solutions), that one can cycle through and see if there is a better variant for the task at hand. The screenshot below shows this experience in VS Code.

The ask whilst simple, still involves a bit of code which needs to be written – reading from file, setting up the subscription details, and wiring that up to call this etc. And it is in these cases really where Copilot shines – it is your copilot picking up the ‘gunk work’ – freeing up your bandwidth, and cognition capacity on the more interesting and higher order bit of code and value to your business.

To get a flavor of our AI writing AI code, below is the full set of the nine suggestions the Copilot came back for what I was trying to do.

Synthesizing 9/10 solutions

=======

def generate_caption(image_path):
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from msrest.authentication import CognitiveServicesCredentials
    from azure.cognitiveservices.vision.computervision.models import TextOperationStatusCodes
    from azure.cognitiveservices.vision.computervision.models import TextRecognitionMode
    from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
    from msrest.exceptions import CognitiveServicesError
    from os.path import exists
    from os import remove
    from os import makedirs
    from os import listdir
    from os.path import isfile, join
    from time import sleep
    from sys import exit
    from random import randint
    from datetime import datetime

    # set up the connection to the cognitive services account
    subscription_key = "YOUR KEY HERE"
    endpoint = "https://YOUR ENDPOINT HERE"
    group_id = "YOUR GROUP ID HERE"
    client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

    # set up the directory where the images will be downloaded
    download_dir = "./downloads/"
    if not exists(download_dir):
        makedirs(download_dir)

    # set up the directory where the images will be saved
    save_dir = "./images/"
    if not exists(save_dir):
        makedirs(save_dir)

    # set up the directory where the images will be saved
    caption_dir = "./captions/"
    if not exists(caption_dir):
        makedirs(caption_dir)

    # get a list of the files in the download directory
    files = [f for f in listdir(download_dir) if isfile(join(download_dir, f))]
    if len(files) == 0:
        print("No files found in the download directory")
        exit(1)

    # get the first file in the list
    file_name = files[0]
    file

=======

def generate_caption(image_path):
    from os import environ
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from msrest.authentication import CognitiveServicesCredentials
    from azure.cognitiveservices.vision.computervision.models import TextOperationStatusCodes
    from azure.cognitiveservices.vision.computervision.models import TextRecognitionMode
    from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
    from msrest.exceptions import CognitiveServicesError

    # Add your Computer Vision subscription key to your environment variables.
    if 'COMPUTER_VISION_SUBSCRIPTION_KEY' in environ:
        subscription_key = environ['COMPUTER_VISION_SUBSCRIPTION_KEY']
    else:
        print("\nSet the COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.\n**Restart your shell or IDE for changes to take effect.**")
        sys.exit()

    # Add your Computer Vision endpoint to your environment variables.
    if 'COMPUTER_VISION_ENDPOINT' in environ:
        endpoint = environ['COMPUTER_VISION_ENDPOINT']
    else:
        print("\nSet the COMPUTER_VISION_ENDPOINT environment variable.\n**Restart your shell or IDE for changes to take effect.**")
        sys.exit()

    # Create an authenticated Computer Vision client.
    client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

    # Open the local image file.
    with open(image_path, 'rb') as image_file:
        # Read the image into a byte array.
        image_data = image_file.read()

    # Set the features to return.
    features = [VisualFeatureTypes.categories, VisualFeatureTypes.description, VisualFeatureTypes.tags]

    # Analyze the local image.
    try:
        result = client.analyze_image(image=image_data, visual_features=features)

=======

def generate_caption(image_file, subscription_key):
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from msrest.authentication import CognitiveServicesCredentials
    from PIL import Image
    from io import BytesIO

    client = ComputerVisionClient(ComputerVisionClient.SUBSCRIPTION_KEY_ENDPOINT,
                                  CognitiveServicesCredentials(subscription_key))

    image_stream = open(image_file, 'rb')
    image = Image.open(image_stream)
    image_bytes = BytesIO(image)

    analysis = client.analyze_image(image_bytes, ComputerVisionClient.DESCRIPTION)
    return analysis.description.captions[0].text

=======

def generate_caption(image_filename):
    import os
    import json
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from msrest.authentication import CognitiveServicesCredentials
    from PIL import Image
    from io import BytesIO
    import sys
    import requests

    # Set the environment variables for your subscription key and region
    # For example, for US West
    # "COMPUTER_VISION_SUBSCRIPTION_KEY=<subscription key>"
    # "COMPUTER_VISION_ENDPOINT=https://westus2.api.cognitive.microsoft.com/vision/v2.0"

    # For example, for China East
    # "COMPUTER_VISION_SUBSCRIPTION_KEY=<subscription key>"
    # "COMPUTER_VISION_ENDPOINT=https://eastasia.api.cognitive.microsoft.com/vision/v2.0"

    # Set your subscription key and region
    subscription_key = os.environ['COMPUTER_VISION_SUBSCRIPTION_KEY']
    endpoint = os.environ['COMPUTER_VISION_ENDPOINT']

    # Set your image path
    image_path = image_filename

    # Set your region
    region = "eastus"

    # Create an instance of the Computer Vision client
    computer_vision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

    # Load the image from disk
    with open(image_path, 'rb') as image_file:
        image_content = image_file.read()

    # Add the image to the list of images to analyze
    image_list = [image_content]

    # Make the analysis asynchronous
    analyze_result = computer_vision_client.analyze_image_in_stream(image_list, visual_features=['Categories', 'Description', 'Color'])

    # Get the analysis result
    analysis = analyze_result.result()

    # Display the image captions
    print("\nImage captions:")
    for caption in analysis.description.

=======

def generate_caption(image_path):
    # load the image
    image = open(image_path, "rb").read()
    # encode the image into base64
    image = base64.b64encode(image).decode("utf-8")
    # create a dict with the image
    body = {"image": image}
    # create a dict with the subscription key
    headers = {"Ocp-Apim-Subscription-Key": "0d5a5f7f1e534d05a8e8d6f9e6e9c6d3"}
    # create a url for the api
    url = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description,Tags"
    # send the request
    response = requests.post(url, headers=headers, json=body)
    # get the response
    response.raise_for_status()
    # get the caption
    caption = response.json()["description"]["captions"][0]["text"]
    # return the caption
    return caption

=======

def generate_caption(image_path):
    from os import environ
    from os.path import join
    from urllib.parse import urlparse
    from urllib.parse import parse_qs
    from http.client import HTTPSConnection
    from json import loads
    from base64 import b64encode

    # Load the image into a byte array
    image_data = open(image_path, "rb").read()
    image_data = b64encode(image_data).decode('utf-8')

    # Read the subscription key from the environment variables
    subscription_key = environ["AZURE_SUBSCRIPTION_KEY"]

    # Define the vision API endpoint
    vision_base_url = "https://westus2.api.cognitive.microsoft.com/vision/v2.0/"

    # Set the vision API endpoint to analyze the image
    vision_analyze_url = vision_base_url + "analyze"

    # Set the parameters
    parameters = {'visualFeatures': 'Categories,Description,Color'}

    # Prepare the request
    headers = {'Ocp-Apim-Subscription-Key': subscription_key,
               'Content-Type': 'application/octet-stream'}
    params = parse_qs(urlparse(vision_analyze_url).query)
    params['visualFeatures'] = 'Categories,Description,Color'
    params['language'] = 'en'

    # Send the request
    conn = HTTPSConnection('westus2.api.cognitive.microsoft.com')
    conn.request("POST", vision_analyze_url, image_data, headers)
    response = conn.getresponse()
    data = response.read()

    # Load the JSON response into a Python dictionary
    parsed = loads(data)

    return parsed

=======

def generate_caption(image_path):
    """
    Generate a caption for an image.
    :param image_path: path to an image
    :return: the caption
    """

    # Load the image
    from os.path import isfile
    from PIL import Image
    import requests

    if not isfile(image_path):
        raise FileNotFoundError(image_path)

    image = Image.open(image_path)

    # Encode the image
    image_data = open(image_path, "rb").read()

    # Computer Vision parameters
    params = {'visualFeatures': 'Categories,Description,Color'}

    # Computer Vision API URL
    url = "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze"

    # Connect to Computer Vision
    headers = {'Ocp-Apim-Subscription-Key': 'd3f3e1b8e7e14d0b9d7d3f8c8f7e0b6f', 'Content-Type': 'application/octet-stream'}
    response = requests.post(url, headers=headers, params=params, data=image_data)
    response.raise_for_status()

    # Load the response as JSON
    analysis = response.json()

    # Extract the word tags
    categories = [(category['name'], category['score'])
                  for category in analysis['categories']]
    print('Tags:')
    for category in categories:
        print(category[0])

    # Extract the description
    descriptions = [(description['text'], description['score'])
                    for description in analysis['description']['captions']]
    print('Descriptions:')
    for description in descriptions:
        print(description[0])

    # Display the color
    try:
        dominant_color = analysis['color']['dominantColorBackground']
    except:
        dominant_color = 'N/A'
    print('Dominant color: {}'.format(dominant_color))

    # Display the caption
    caption = '{}'.format(

=======

def generate_caption(filename):
    import os
    import sys
    import json
    import requests
    import cognitive_services_keys
    import cognitive_services_key_names

    # create the header for the request
    headers = {'Ocp-Apim-Subscription-Key': cognitive_services_keys.get_key(cognitive_services_key_names.BING_KEY)}

    # the URL for the bing endpoint
    bing_url = 'https://api.cognitive.microsoft.com/bing/v5.0/images/search'

    # the parameters for the request
    params = {'q': 'image', 'count': 1, 'offset': 0, 'mkt': 'en-us', 'safeSearch': 'Moderate'}

    # read the contents of the file
    with open(filename, 'rb') as f:
        data = f.read()

    # build the request
    request = requests.post(bing_url, headers=headers, params=params, data=data)

    # check for errors
    if request.status_code != 200:
        print('Error code: {}'.format(request.status_code))
        print(request.text)
        sys.exit(1)

    # process the response
    response = json.loads(request.text)
    if 'value' in response:
        caption = response['value'][0]['name']
        print('The caption is: {}'.format(caption))
    else:
        print('No caption found.')

=======

def generate_caption(image_path):
    import os
    import json
    import requests

    subscription_key = "e9e9c8a8d1d9400e8d0b7f2d8f8e7c6e"

    assert subscription_key

    vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/"

    analyze_url = vision_base_url + "analyze"

    image_data = open(image_path, "rb").read()

    headers    = {'Ocp-Apim-Subscription-Key': subscription_key,
                  'Content-Type': 'application/octet-stream'}
    params     = {'visualFeatures': 'Categories,Description,Color'}
    response = requests.post(
        analyze_url, headers=headers, params=params, data=image_data)
    response.raise_for_status()

    analysis = response.json()
    print(analysis)
    image_caption = analysis["description"]["captions"][0]["text"].capitalize()
    return image_caption

Reinforcement Learning – An Introduction

Reinforcement Learning is teaching by example – it is how most of us learn. Reinforcement Learning (#RL) is a different approach to ML – it is a set of techniques that allows AI algorithms to experiment and learn from experience. RL falls in between supervised and unsupervised learning – there isn’t any labeled data, but at the same time it isn’t unsupervised either. At its most simple form, RL is a computational approach for automating goal-oriented decision making and learning.

Inherent RL is the ability to operate in a dynamic uncertain environment. RL can be more formally defined as the study, science, and problem of intelligence in the form of an agent that interacts in an environment. At the end of the day, almost all RL problems can be formalized as MDP (Markov decision processes).

The problem is represented by an environment – such as a world where an agent is based in. The steps in RL are quite clear – the agent takes actions, that have some effect on the environment. The environment acts on those actions and gives back an observation to the agent – what it sees and senses.

One special signal the environment gives back to the agent is called a reward signal. This signal is what an agent used to figure out how well it is doing. The RL problem is to take actions over time, to maximize the reward signals. And this notion of maximizing is what the agent is learning from the environment, without any explicit supervision. This construct helps an agent achieve a goal, even in an uncertain environment, factoring in delayed and indirect consequences of actions.

Reinforcement Learning Overview
Reinforcement Learning Overview

An agent can have many actions (i.e., choices); it uses a ‘reward’ signal to determine which of those actions is considered ‘good’ vs. ‘bad’. Of course, this determination is in the context of the outcome that we want to achieve.

Some examples of rewards in different industries and use cases:

  • Maneuvering a UAV’s – positive for following a chosen trajectory; negative for deviating from that trajectory.
  • Managing an investment portfolio – positive for each dollar earned; negative for each dollar lost.
  • Controlling a power station – As one can imagine, this control would typically constitute a few things in the environment – a sequence of controls, motors, batteries, power sources, etc. In optimizing the throughput of a power station, we can think of positive rewards for producing power; negative for exceeding a safety threshold.
  • Playing a game – positive for increasing score; negative for decreasing score.

Core concepts that make up RL:

Agent – The ‘thing’ that is using and acting on behalf of a user or another program. This can be a program executing a business process, a embedded process, the arm of a robot, actuators on a self-driving car controlling the wheels, etc.

Policy – A policy outlines how an agent would behave at certain times and can be thought of as the problem we are trying to solve. This is an agent’s behavior function and is a mapping of the business outcome that we are after.

Reward – A reward is a feedback special signal and outlines what is considered good (or bad) and is correlated with the agents’ current action, and the current state of the environment. All goals can be described as to maximize the cumulative reward. The reward is not a binary number but is a scaler between 0 and 1 – with zero being ‘bad’ and one being the best reward attainable for that action.

Value function – A value function represents how good is it to be in a particular state and related actions. Where a reward signal is showing the specification of good in an immediate sense (current step), the value function is representing the notion of good overall. At an abstract level, when thinking about the prediction of rewards, a rewards function is the primary, we can think of value functions as the secondary. In the end, we are more concerned with getting higher-value functions to make decisions, and not as much as higher rewards.

Model – A model is an agent’s view of the environment and mimics its behavior. This allows us to make inferences on how the environment will behave and is often used for planning. Think of the model as the strategy to use in solving the problem at hand.

Taxonomy of RL Algorithms

There are many types of RL algorithms (as we can see in the figure below), but these can broadly be classified in the following two categories.

  • Model free: A model-free algorithm can be thought of as an explicit trial and error algorithm. In a model free approach, the agent doesn’t have or ignores the environment; instead, the agent uses experience and tries to optimize a Policy.
  • Model based: On the other hand, a model-based algorithm reflects how an environment works, and factors that the associated reward functions and tries to maximize that. Technically, this is the optimization of the transition probability distribution of the MDP.

The main difference between the two – in one the algorithm optimizes for the environment, and in the other for a policy gradient. There is no one right or wrong algorithm – a lot of it depends on the situation at hand and what one is trying to optimize for.

As we can see below each of these categories can be further broken down – we won’t go into the details of those quite yet, maybe that is for another post. One of the most important components of most RL algorithms is a method to efficiently estimate values – at the end of the day, this is all about value estimation.

Chart showing the taxonomy of RL algorithms.
Taxonomy of RL Algorithms

Exploration and Exploitation

There are two concepts of exploration, and exploitation which are at odds with each other and for a given situation we should aim to get a balance of some sorts. In simple terms, RL is sequential decision making – one selects actions to maximize future rewards, and we need to plan long term – rewards might be delayed and not immediate, and we cannot be greedy. Sometimes, we need to sacrifice the immediate reward to gain more (or better) longer term rewards.

This can be thought of trial-and-error learning loop – with stream of experiences that constitute loops of actions, rewards, and observation. At the end of the day, this loop is what matters.

Exploration finds more information about the environment, and in doing so gives up rewards. Exploitation on the other hand, exploits the information it already has to maximize rewards. If we don’t exploit, we might be stuck in a sub-optimal place, and how would be know if there is a better sense or rewards without trying?

When we are the trial-and-error loop we might be losing rewards, and the agent needs to discover a good policy to maximize the rewards – this is the tension at the opposite ends of a string pulling each other.

It is important to balance both exploring and exploiting.

GPT-3 vs other AI powered assistants

I been kicking the tires with Open AI’s #GPT-3. Based on the screenshot below, it might be easy to think “oh boy does the model think highly of itself”, but as with most things in life – devil is in the details.😃 The screenshot below was a forked version of davinci engine and follows the Q&A structure.

OpenAI's GPT3 answering questions when compared to other AI powered assistants.
GPT-3 vs other AI assistants

Using OpenAI’s API is quite simple; perhaps too simple! It is quite easy to unleash the beast as the code snippet shown below. If you are new to using GPT3, I would highly recommend you start with the use case model guidelines first.

In the context of a toy example, to get to a simple Q&A chatbot as the screenshot earlier shown is quite simple. The API is powerful, and simple to use, and getting started is easy as the code below shows.

import os
 import openai
 openai.api_key = os.getenv("OPENAI_API_KEY")
 response = openai.Completion.create(
   engine="davinci",
   prompt="I am a highly intelligent question answering bot. If you ask me a question that is rooted in truth, I will give you the answer. If you ask me a question that is nonsense, trickery, or has no clear answer, I will respond with \"Unknown\".\n\nQ: What is human life expectancy in the United States?\nA: Human life expectancy in the United States is 78 years.\n\nQ: Who was president of the United States in 1955?\nA: Dwight D. Eisenhower was president of the United States in 1955.\n\nQ: Which party did he belong to?\nA: He belonged to the Republican Party.\n\nQ: What is the square root of banana?\nA: Unknown\n\",
   temperature=0,
   max_tokens=100,
   top_p=1,
   frequency_penalty=0.0,
   presence_penalty=0.0,
   stop=["\n"]
 )

There are three core concepts when using GPT-3: Prompt, Completion, and Tokens.

To start using the API, we need to start giving it some prompts – this provide some context to the engine on what is expecting. Without the surface area is too broad and we get into nonsensical situations. This is part of the task-specific fine-tuning required.

Think of when giving examples as part of the prompt, we are essentially “programming” the model and providing guidance and providing some hints to context and pattern matching. Note the training data cut off in late 2019, so the model in production today doesn’t have access to data and events post that (e.g., Covid).

Completion is the output that GPT3 generates based on the prompt. To be clear, this is not the full text but is the predicted completions; think of it as “autocomplete” in Word, or Outlook or a search engine. The API has flexibility to return more than one predicted completion along with the probabilities of alternative tokens at each position (to me it seems just like the wave function when thinking of Quantum mechanics 🐼).

Finally, think of Token are the smaller Lego blocks that combine to make words. The API, which is nothing but wrappers around GPT-3 breaks up the text into tokens before processing it. The GPT-3 model understands the statistical relationships between these tokens and uses this to produce the next token in a sequence of tokens.

For example, if we are curious about Tokens, we can see in the screenshot below how the API “tokenizes” this paragraph and get the details of the tokens. This paragraph contains 207 characters and 43 tokens.

Token text that GPT-3 API converts to before using.
GPT-3 Tokens – Text
Token ID's that GPT-3 API converts to before using
GPT-3 Token – IDs

At a high level, think of one token == ~4 characters of text, which is ¾ of a word; so, 100 tokens ~= 75 words.

This is just dipping our toes in the beast that is GPT-3; the API’s which wrap up and expose the engines (more on that in another post) make it simple to use and without getting too much in the weeds of 175 billion parameters. 🙂

ML algorithm cheat sheet

A #ML algorithm cheat sheet – helping narrow down to a certain set of #algorithm grouping depending on the problem at hand and what we are trying to solve from a business perspective.

Cheat sheet showing different #ML algorithms to choose from depending on the task at hand
Figure 1

Figure 2 shows what additional characteristics we need to consider when choosing the right ML algorithm for your situation at hand. This is something that cannot be generic and is very situational.

Flow diagram showing how to select a ML algorithm and additional characteristics we need to consider as we select a ML algorithm
Figure 2 – Characteristics in selecting ML algorithms

If you find this useful, I would also recommend reading “How to select algorithms” which is detailed as part of Azure ML designer.

Auto-update PowerShell and nag-free

If you are like me and get annoyed with the big PowerShell upgrade ‘nag’ ‘reminder’ (see screenshot below); instead of trying to figure out what to download and install the update, there is a simpler way to get the latest update and address the nag. 🙂

bah ree 
powershell 7.e.2 
Copyright Cc) Microsoft Corporation . 
https :// aka. ms/powershell 
Type 'help' to get help. 
All rights reserved. 
A new PowerShell stable release is available: v7.€.3 
Upgrade now, or check out the release page at: 
https :// aka . 9.3 
Loading personal and system profiles took 1€89ms. 
+ ambahree@AMBAHREE-LAPTOP 
) iex 
$Cirm https :// aka.ms/install—powershell.psl) } —UseMSI" 
VERBOSE: About to download package from 'https 
e msi ' 
[€9 : 41]

You can just run the code below in an elevated prompt to get the latest release of PowerShell – it is easy-peasy. 🙂

iex "& { $(irm https://aka.ms/install-powershell.ps1) } -UseMSI"

Changing Window Terminal’s default directory

If you are like me, and don’t really have your work saved in the “%USERPROFILE%” it gets annoying after a time, to keep changing the directory.

If there is one specific folder that you prefer, it is an easy configuration change in the profile setting – add a setting called “startingDirectory” and point it to the path you want.

For example, I have a root folder called “src” where most of the code I am working on sits, and that’s where I wanted to default the terminal to.

To get to the profile, you can either use the shortcut CTRL+, or from the dropdown in the title bar, click settings (see below). This will open the settings.json in your default editor.

Terminal setting

In my case, I wanted the starting directory for all the shells, so I put it under “defaults” – you can choose different options for different shells, and then would have this in the appropriate shell’s settings and not the default block of course.

Below is what this looks like for me pointing this to “c:\src”. Also note, the escape characters need to be formatted properly to parse.

"defaults":
{
    // Put settings here that you want to apply to all profiles.
    "fontFace":  "CaskaydiaCove NF",
    "startingDirectory": "c:\\src",
},
setting.json screenshot

Once you save the file, it should automatically reload the terminal. And if the json didn’t parse – because of a typo or a syntax error then you would see an error similar to the one shown below.

Parsing error

In this example, I set the starting folder as “c:\src”; instead of “c:\\src”.

bfloat16 – how it improves AI chip designs

Floating point calculations are slow for computers (specifically CPUs); possibly representing the same struggle for many humans. 🙂

I remember a time when a FPU (floating point unit) was an upgrade and one had to pay extra to get one. Very useful when you needed that extra precision in computing – and in my head, it always seemed like the Turbo button. 🙂

For most #ML workloads and computations, precision isn’t the most important criteria; with every increasing data and parameters (looking at you GPT-3 with 45 TB of data and 175 billion parameters!), what most ML needs today is speed and dynamic range.

This is where bfloat16 (Brain floating-point format with 16 bits) – a new floating-point format comes handy and in the context of #AI improves on IEEE 754 – the current floating-point arithmetic standard.

As per IEEE 754, a floating point it will always take up 32 bits (see Figure 1 below) – irrespective of the size of the number. The exponent (8 bits) tells us how many numbers we shift (left or right) and place the decimal. The fraction (23 bits), also called the mantissa, holds the actual number – i.e. the data.

Figure 1 – IEEE 754 Floating point representation

bfloat16 truncates the data size in a third (see Figure 2) – with the fraction truncated from 23 to 7 bits. This of course means bfloat16 isn’t as precise. However bfloat16 has the same exponent bits as IEEE-754 it can represent a similar range (small to large), but more importantly are easier to convert between bfloat16 and IEEE 754.

Figure 2 – fbloat16 representation

Less precision doesn’t impact the matrix multiplication as much so in the context of ML training and inference these chips at scale are more efficient – not only they are faster, they also use less power, and memory bandwidth.

What is interesting in some neural nets such as a DNN, these less precision bfloat16 are more precise compared to IEEE 754! This is because the regularization and quantization weights cannot use the finer precision represented by IEEE 754 but adapt better with bfloat16. 🙂

Finally, bfloat16 is not a universal standard (yet); most AI chips support this. ARM, Intel, and, AMD have started adding support for this in their chipsets.

WSL2 +Ubuntu on Window 10 (2004)

One of the key advances in the latest version of Windows 10 (2004) is WSL2 (Windows Subsystem for Linux v2) – and whilst a version bump, it offers so much more. This allows us to run with near-native performance linux binaries (ELF64).

Before we get into the steps outlined to install WSL2, I also recommend installing Windows Terminal, and winget. Although not required, it does make it simpler to use and a better (dev) experience – especially when setting up a new workstation.

For WSL2 to work, you need to make sure you are on Windows 10 2004 Build 19041 or higher. If you don’t have this, run Windows update and see if that updates your OS. If that doesn’t offer a update, you could also try the Windows update assistant.

To get WSL2, whilst not complicated one needs to do the following steps, in this order – running the commands in an elevated prompt.

  1. Enable the Windows Subsystem for Linux optional feature.
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
  1. Enable the Virtual machine platform optional feature.
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
  1. Reboot
  2. Run Windows update (and reboot again if there are updates)
  3. Set WSL2 as your default option.
wsl --set-default-version 2
Administrator: Windows X v 
Administrator: powerf X 
PS C: Bahree> dism.exe /ontine /enabte—feature /featurename:Microsoft—Windows—Subsystem—Linux /att /norestart 
Deployment Image Servicing and Management toot 
Version: 
Image Version: 
Enabling feature(s) 
The operation completed successfully. 
PS C: Bahree> dism.exe /ontine /enabte—feature /featurename:virtualmachineptatform /att /norestart 
Deployment Image Servicing and Management toot 
Version: 
Image Version: 
Enabling feature(s) 
The operation completed successfully. 
PS C: Bahree> dism.exe /ontine /enabte—feature /featurename:virtualmachineptatform /att /norestart 
Deployment Image Servicing and Management toot 
Version: 
Image Version: 
Enabling feature(s) 
The operation completed successfully.
Enabling WSL2
  1. Install your Linux distro of choice. You can do this via Store, or via winget, such as Ubuntu using the following command.
winget install -e --id Canonical.Ubuntu
PS C: Bahree> winget install —e ——id Canonical. Ubuntu 
Found Ubuntu [Canonical . Ubuntu] 
This application is licensed to you by its owner . 
Microsoft is not responsible for, nor does it grant any Licenses to, 
Successfully verified installer hash 
Starting package install... 
100% 
Successfully installed . 
third—party packages .
Installing Ubuntu via winget

Note, when trying to set WSL2 as the default option above (Step 5) and you get a error 0x1bc, that most likely means you need to run Windows update and reboot.

PS C: wsI 
——set—default—version 
2 
Error: exlbc 
For information on key differences with 
WSL 2 please visit https://aka.ms/ws12 
PS C: wsI 
Windows Subsystem for Linux has no installed distributions. 
Distributions can be installed by visiting the Microsoft Store: 
https : / /aka. ms/wslstore
WSL Error

And here is my running Ubuntu and updating it.

Ubuntu 
Installing, this may take a few minutes. . 
Please create a default UNIX user account. The username does not need to match your Windows username. 
For more information visit: https://aka.ms/wslusers 
Enter new UNIX username: amit 
New password: 
Retype new password:
Installing Ubuntu
amitaambahree-laptap: 
$ sudo apt updaze sudo apt upgrade 
lamlcgamoanree-±apcop:•-• 
[sudo] password for amit: 
1 http://security.ubuntu . com/ubuntu focal-security InRe1ease [187 ka] 
et:2 http://archive.ubuntu . com/ubuntu focal InRe1ease [265 ka] 
http://security.ubuntu . com/ubuntu focal -security/main amd64 Packages [147 ka] 
http://archive.ubuntu . com/ubuntu focal -updates InRe1ease [111 ka] 
http://archive.ubuntu . com/ubuntu focal -backports InRe1ease [98.3 ka] 
http://security.ubuntu . com/ubuntu focal-security/main Translation-en [51.8 ka] 
http://security.ubuntu . com/ubuntu focal-security/main amd64 c-n-f Metadata [3432 B] 
http://security.ubuntu . com/ubuntu focal -security/restricted amd64 Packages [28.9 ka] 
http://security.ubuntu . com/ubuntu focal-security/restricted Translation-en [7664 B] 
http://security.ubuntu . com/ubuntu focal-security/restricted amd64 c-n-f Metadata [324 B] 
http://security.ubuntu . com/ubuntu focal-security/universe amd64 Packages [42.8 ka] 
http://security.ubuntu . com/ubuntu focal-security/universe Translation-en [22.6 ka] 
http://security.ubuntu.com/ubuntu focal-security/universe amd64 c-n-f Metadata [1768 B] 
http://security.ubuntu . com/ubuntu focal-security/multiverse amd64 Packages [1172 B] 
http://archive.ubuntu . com/ubuntu focal/main amd64 Packages [978 ka] 
http://security.ubuntu . com/ubuntu focal-security/multiverse Translation-en [548 B] 
http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 c-n-f Metadata [116 B] 
Get : 
Get : 3 
Get . 
Get . 
et. 
Set : 8 
Get . 
• 18 
• 11 
Get . 
• 12 
• 13 
iGet : 14 
Get . 
et. 
• 16 
Get . 
• 17 
et. 
• 18 
et. 
• 19 
Get . 
28 
Get . 
• 21 
et. 
• 22 
et. 
• 23 
Get . 
• 24 
Get . 
• 25 
Get . 
• 26 
et. 
• 27 
et. 
• 28 'Ittp• 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
http:/'archive.ubuntu . com/ubuntu 
. / 'archive. ubuntu . com/ubuntu 
focal/main Translation -en [SB6 ka] 
focal/main amd64 c-n-f Metadata [29. S ka] 
focal/universe amd64 Packages [8628 ka] 
focal/universe Translation-en [5124 ka] 
focal/universe amd64 c-n-f Metadata [265 ka] 
focal/multiverse amd64 Packages [144 ka] 
focal/multiverse Translation-en [184 ka] 
focal/multiverse amd64 c-n-f Metadata [9136 B] 
focal -updates/main amd64 Packages [312 ka] 
focal -updates/main Translation-en [116 ka] 
focal-u dates/main amd64 c -n-f Metadata 7756 B
Updating Ubuntu

So, what’s the big deal? This is where it gets quite interesting and one simple example is the windows interoperability with Linux – allowing one to run linux commands from within a command prompt.

Mixing Linux and Windows commands

Getting list of users from Microsoft Teams

I recently needed to get a list of users that belong to a specific Microsoft Teams team – and there isnt anything out of the box to get this using the Teams app. AFAIK, the only way to do this is using the Microsoft graph API – for which there are a few options.

For something quick (e.g. getting a list of users in a team), using the Graph explorer could be easy enough. On the other hand, if you need something more robust, you should program against the (REST) API.

Graph Explorer

Navigate to Graph explorer, sign in and authenticate yourself against the specific O365 tenant you are interested in – most folks would only have one.

Microsoft Graph Explorer

Once authenticated, on the panel on the left you see several sample queries and scroll down until you see the Teams.

Teams sample queries

To get members of a specific team, you need to get the team ID for that Team. This is unique ID (GUID) and doesn’t change over the lifetime of the team. If you have this, then go ahead to the next section – Getting team members.

Getting a list of Teams and Team ID

On the query panel in Graph explorer, select the “my joined teams” and run the query. You will get a JSON back that contains the list of teams that you are a member of. The “id” element represents the Team ID which you would need for any team related API calls. For example, I am interested in this specific #AI team: “#Reinforcement Learning and Decision AI”.

Get team details

Getting team members

Once you have the Team ID (the unique GUID that each identifies each team), you can get the members of the team using that option on the left. As shown on the screenshot below, you do need to pass in the team ID to the REST API and this would be something like this (and don’t worry what I am showing below is a fictious GUID):

https://graph.microsoft.com/v1.0/groups/f3f9ad1f-beea-4026-9b86-dd3788404999/members 
Member details for a specific Microsoft Team team

Programmatically getting Microsoft Team details

If you want something more robust and repeatable, then using the API (via code) or PowerShell might be better. If you are programming, you will need to register an app – which can authenticate using the Identify platform. This of course is quite powerful, but at times for simple things might be a bit too much.

In my simple task to get users from Teams, I prefer the PowerShell option. To get this going first you need to install the MicrosoftTeam module. This can be done using the command below.

Install-Module -Name MicrosoftTeams

Depending on your configuration you might get a warning as shown below.

PowerShell module installation

Once the Teams PowerShell module is installed, you can run PowerShell scripts against Teams and achieve the same result. I have two scripts below showing the same steps as with the Graph Explorer above. One of these gets details of the teams that a user is a member of. And the second script is to get members of a selected team.

Using PowerShell to get Team Details

The PowerShell script below to get a Team details is below; you can also get it from GitHub. Before you run this, there are two variables that need to be set.

  • One, the path where you want the team details to be exported (this is a csv file).
  • Two, set the email that you will use. This needs to be the same one that you authenticated against.

You will be prompted to sign in to authentic and this should be an experience that most folks would be familiar with. Note, each time you run the script, you need to authenticate – and this is irrespective of say if you are already logged into Teams of Office 365.

Authenticating user against Office 365

Assuming you have authenticated successfully, you should see an output like the one shown below; and a csv file in the path you configured will be created. This file will always be overwritten – without any prompts (of course this is assuming no other process is open that has a lock on that file).

#Set these variables, to what makes sense in your situation. The email here is the one that is the one connected to your teams account.
$exportLocation = "C:\temp\team-details.csv"
$emailAddress = "your-email@shouldbeputhere.com"

#Authenticate against teams
Connect-MicrosoftTeams

#Patience
Write-Host -ForegroundColor Blue "Successfully connected to Teams"
Write-Host -ForegroundColor Blue "Getting all team details for user: $($emailAddress)"
Write-Host -ForegroundColor Blue "Please be patient, if there are a lot of teams, this can take a while..."

# Get all of the team Groups IDs
# $GetUsersTeams = (Get-Team).GroupID
$GetUsersTeams = Get-Team -User $emailAddress

$Report = @()

# Will hold a basic count of user types and teams
$unavailableTeamCount = 0

# Loop through all teams that the user belongs to
$currentIndex = 1

ForEach($thisTeam in $GetUsersTeams) {
	# Show some output to the user
    Write-Progress -Id 0 -Activity "Building report from Microsoft Teams" -Status "$currentIndex of $($GetUsersTeams.Count)" -PercentComplete (($currentIndex / $GetUsersTeams.Count) * 100)
	
    # Attempt to get team details, throw error message if no access
    try {
        # Get team members
        #$users = Get-TeamUser -GroupId $thisTeam.groupID

		# Create an object to hold all values
        $teamReportObject = New-Object PSObject -Property @{
                GroupID = $thisTeam.GroupID
				TeamName = $thisTeam.DisplayName
                Description = $thisTeam.Description
                Archived = $thisTeam.Archived
                Visibility = $thisTeam.Visibility
				eMail = $thisTeam.MailNickName
            }

            # Add to the report
            $Report += $teamReportObject
       
    } catch [Microsoft.TeamsCmdlets.PowerShell.Custom.ErrorHandling.ApiException] {
        Write-Host -ForegroundColor Yellow "No access to $($team.DisplayName) team, cannot generate report"
        $unavailableTeamCount++
    }
	
	$currentIndex++
}
Write-Progress -Id 0 -Activity " " -Status " " -Completed

# Disconnect from teams
Disconnect-MicrosoftTeams

# Provide some nice output
Write-Host -ForegroundColor Green "============================================================"
Write-Host -ForegroundColor Green "                Microsoft Teams User Report                 "
Write-Host -ForegroundColor Green ""
Write-Host -ForegroundColor Green "  Count of All Teams - $($GetUsersTeams.Count)                "
Write-Host -ForegroundColor Green "  Count of Inaccesible Teams - $($unavailableTeamCount)         "
Write-Host -ForegroundColor Green ""

$Report | Export-CSV $exportLocation -NoTypeInformation -Force
Write-Host -ForegroundColor Blue "Exported report to $($exportLocation)"

Getting Team members using PowerShell

Now that you have the Team ID you are interested, you can run the other PowerShell script (also available on GitHub) to get a list of all the users in a specific team. Like the previous script, you would need set a couple of variables in the script:

  • The Team ID for the team you are interested in.
  • Path for the csv file with details to be saved.

Once you have authenticated and ran the script, the output will look like the one shown below. You get a summary of the team details, and details of the Teams users and their type (owner, member, or guest). And just like earlier, the file will be overwritten without a prompt, assuming no locks on it.

Members of a Microsoft Team
#Global variables to set:
#path of the file where to export
#specific ID of the team that you want the users for. 
$exportLocation = "C:\temp\RL-decision-AI-export.csv"
$TEAM_ID = "f3f9ad1f-beea-4026-9b86-dd3788404999"
            
$Report = @()

# counters
$ownerCount = 0
$memberCount = 0
$guestCount = 0

#connect to teams
Connect-MicrosoftTeams

$team = Get-Team -GroupId $TEAM_ID

#Patience, supposed to be a virtue
Write-Host -ForegroundColor Blue "Successfully connected to Team: $($team.DisplayName)"
Write-Host -ForegroundColor Blue "Getting all users in the team"
Write-Host -ForegroundColor Blue "Please be patient, if there are a lot of users, this can take a while..."

# Attempt to get team users, throw error message if no access
try {
	# Get team members
	$users = Get-TeamUser -GroupId $team.groupID

	# Loop through and get all the users
	$currentIndex = 1
	
	# foreach user create a line in the report
	ForEach($user in $users) {
		# Show some output to the user
		Write-Progress -Id 0 -Activity "Generating user report from Teams" -Status "$currentIndex of $($users.Count)" -PercentComplete (($currentIndex / $users.Count) * 100)
	
		# Maintain a count of user types
		switch($user.Role) {
			"owner" { $ownerCount++ }
			"member" { $memberCount++ }
			"guest" { $guestCount++ }
		}

		# Create an object to hold all values
		$ReportObject = New-Object PSObject -Property @{
			User = $user.Name
			Email = $user.User
			Role = $user.Role
		}

		# Add to the report
		$Report += $ReportObject
		
		$currentIndex++
	}
} 
catch [Microsoft.TeamsCmdlets.PowerShell.Custom.ErrorHandling.ApiException] {
	Write-Host -ForegroundColor Yellow "No access to $($team.DisplayName) team, cannot generate report"
}

#Complete progress
Write-Progress -Id 0 -Activity " " -Status " " -Completed

# Disconnect from teams
Disconnect-MicrosoftTeams

# Write out details for the user
Write-Host -ForegroundColor Green "============================================================"
Write-Host -ForegroundColor Green "                Microsoft Teams User Report                 "
Write-Host -ForegroundColor Green ""
Write-Host -ForegroundColor Green "Team Details:"
Write-Host -ForegroundColor Green "Name: $($team.DisplayName)"
Write-Host -ForegroundColor Green "Description: $($team.Description)"
Write-Host -ForegroundColor Green "Mail Nickname: $($team.MailNickName)"
Write-Host -ForegroundColor Green "Archived: $($team.Archived)"
Write-Host -ForegroundColor Green "Visiblity: $($team.Visibility)"
Write-Host -ForegroundColor Green ""
Write-Host -ForegroundColor Green "Team User Details:"
Write-Host -ForegroundColor Green "Owners - $($ownerCount)"
Write-Host -ForegroundColor Green "Members - $($memberCount)"
Write-Host -ForegroundColor Green "Guests - $($guestCount)"
Write-Host -ForegroundColor Green "============================================================"

$Report | Export-CSV $exportLocation -NoTypeInformation -Force
Write-Host -ForegroundColor Blue "Exported report to $($exportLocation)"

Of course, programming against the API is always more powerful, but sometimes quick and easy is what is needed. 🙂

Git and Code

I think this from xkcd sums up my afternoon quite nicely. Messed up a repo, and then was trying to ‘clean up’.

A huge thank you to Lily, on the team, for working with me to cleaning up my mess, and helping me show some of the ropes.

I know there are quite a few tutorials our there a couple of these that I found including one from Lily.

So go ahead, and setup a experiment repo and don’t be afraid to play and break things.

Maybe this needs to be updated to reflect Git, from REST. 🙂

Deleting Windows run history

If you have butter fingers like me, and over time end up with a lot of old commands with typos in your Windows run box that get annoying – deleting them is a simple. All you need to do it remove the following registry key.

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\RunMRU

Now every time one plays with regedit, it can be dangerous – you can also save this commend as a .cmd file, and then run it with admin privileges – essentially does the same thing.

reg delete "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\RunMRU" /f

You can also download the same thing from here.

%d bloggers like this: