Intro to Machine Learning using Transformers

A few months ago, I began to learn about machine learning and AI. Initially, I was expecting a steep learning curve with a lot of complex math. But after some exploring, I discovered a rich ecosystem of frameworks, libraries, and services which makes it relatively easy for a developer like me to use machine learning.

What is Machine Learning?

Machine learning is good at tasks that are easy for a person to do but difficult to implement in code, such as identifying the species of animal in a photo or determining whether a product review is negative or positive. Instead of writing code to do the task, in machine learning I write code to train the machine to do the task. For the photo example, training involves providing the machine with a set of animal photos, each labeled with the animal's name, so it can learn the distinguishing features of each animal and identify it the next time it is shown a different photo of that animal. Similarly, for the review example, the machine is given a set of product reviews labeled as negative or positive so it can learn what a negative or positive review sounds like and label a review it has never seen before. In machine learning, the thing being trained to do a task is called the model, and the data used to train it is called the dataset.

In this post, I will demonstrate how to train and use a model to classify reviews as positive or negative by training it on a Yelp reviews dataset using the Hugging Face Transformers framework. Along the way, I will discuss the relevant machine learning concepts.

Development Environment

The code for training models is written in Python. Training involves so much computation that it would take too long to complete on CPUs, so the code needs to run on GPUs. Notebooks provide a browser-based code editing environment where I can write and execute my code on GPUs. Most notebook services offer free plans with access to limited compute and storage resources and paid plans if more resources are needed. For this demo, I use Google Colab notebooks.
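
As a quick sanity check, I can confirm that the notebook actually has a GPU before training. This is a minimal sketch, assuming PyTorch, which Transformers uses as its backend in this post.


import torch

# Check whether the notebook has access to a GPU
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU detected; training will fall back to the CPU')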

First I need to install library dependencies in my notebook. transformers[sentencepiece] installs the transformers library, which provides functions for training and using models, along with the SentencePiece tokenizer dependency used by some models. datasets provides functions for downloading and processing the data used for training and validating models. evaluate provides functions for calculating metrics when training models. accelerate is also required for training.


!pip install transformers[sentencepiece] datasets evaluate accelerate

Hugging Face provides a Git-like repository for saving and sharing models called the Model Hub. To save the trained model from my notebook to the Model Hub, the notebook needs to authenticate with it. It can do this with a user access token with write permissions, which I create in the user settings section of the Hugging Face website and provide to the notebook by calling notebook_login.


from huggingface_hub import notebook_login

notebook_login()

Using Pre-Trained Models

If a machine learning app were a person, the model would be its brain. From a more technical perspective, you can think of the model as a function that performs a specific task, such as determining the sentiment, either positive or negative, of a given text or filling in the missing word in a sentence. Unlike in programming, where a function contains code that implements the desired functionality, in machine learning the function contains millions of numeric parameters called weights, which are configured to do the desired task through training. During training, the model's weights are initialized to random values and used to make predictions on the training data. Then the weights are adjusted based on how far off the predictions are from their expected values. This cycle of making predictions and adjusting weights is repeated until the accuracy of the model reaches an acceptable value. Going back to our analogy of the model as a person's brain, training a baby to classify a product review written in English as positive or negative would be more difficult than training an adult who already understands the English language. The equivalent of the adult brain in machine learning is the pre-trained model. This is a model that has already been trained to do a similar task in the target language. Since this pre-trained model already understands the target language, it will be easier to train, or fine-tune, it to do a new task.

There are a lot of pre-trained models to choose from. Since we are dealing with text, we need to pick a language model. There are three categories of language models, depending on the target task. Encoder models are good at understanding input text and are commonly used for tasks like sentence classification. Decoder models are good at generating output text and are commonly used for tasks like sentence completion. Encoder-decoder models are good at both and are commonly used for tasks such as summarizing a sentence.
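
As a quick illustration of the other categories (not used in the rest of this post), a decoder model such as gpt2 can be run through the same pipeline function to generate text. This is only a sketch; the model name and generation length are arbitrary choices.


from transformers import pipeline

# A decoder model generates a continuation of the input text
generator = pipeline(task='text-generation', model='gpt2')
generator('My favorite sport is', max_new_tokens=10)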

Since I want my model to classify reviews written in English, I need to choose an encoder model that has been pre-trained on English text which I can then train, or fine-tune, to do this. One common option is the BERT model, which was pre-trained in an unsupervised manner to fill in missing, or masked, words in English sentences. Unsupervised training means that the data the model was trained on was not labeled. For example, given a complete sentence, a random word is removed and the model is asked to predict the missing word. I chose to use the distilbert-base-uncased model because it performs similarly to BERT but is smaller and faster.


model_name = 'distilbert/distilbert-base-uncased'

The pipeline function is used to run the task the model was trained to do. Here I am using the distilbert-base-uncased model to fill in the masked word, specified using [MASK], in the sentence.


from transformers import pipeline

unmasker = pipeline(task='fill-mask', model=model_name)
unmasker("My favorite sport is [MASK].")

[{'score': 0.08831855654716492,
  'token': 5742,
  'token_str': 'swimming',
  'sequence': 'my favorite sport is swimming.'},
 {'score': 0.07627089321613312,
  'token': 3455,
  'token_str': 'basketball',
  'sequence': 'my favorite sport is basketball.'},
 {'score': 0.06569057703018188,
  'token': 2374,
  'token_str': 'football',
  'sequence': 'my favorite sport is football.'},
 {'score': 0.06273501366376877,
  'token': 21383,
  'token_str': 'archery',
  'sequence': 'my favorite sport is archery.'},
 {'score': 0.06088290363550186,
  'token': 4715,
  'token_str': 'soccer',
  'sequence': 'my favorite sport is soccer.'}]

Preparing Datasets

Models cannot process raw text input directly. First the text needs to be converted to a format the model understands using a tokenizer. Generally, during tokenization the text is split into a sequence of tokens and each token is mapped to its numeric id from a vocabulary created during pre-training of the model. Often special tokens and metadata are added to provide additional information to the model, for example to denote the start or end of sentences. Since tokenization differs across models, it is important that the tokenizer used when running or fine-tuning a model matches the tokenizer used during pre-training. When using the model with the pipeline function, the function automatically determines the correct tokenizer to use and tokenizes the input text before passing it to the model. But when fine-tuning the model, the text in the training data needs to be tokenized manually.
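
As a quick illustration of what a tokenizer produces, here is a small sketch using the same AutoTokenizer that is loaded later in the post; the exact token ids depend on the model's vocabulary.


from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)

encoding = tokenizer('The wonton soup is delicious!')
print(encoding['input_ids'])                                   # numeric token ids, including special tokens
print(tokenizer.convert_ids_to_tokens(encoding['input_ids']))  # e.g. ['[CLS]', 'the', ..., '[SEP]']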

Datasets available for training models can be found on Hugging Face Hub. The load_dataset function is used to load the yelp_review_full dataset.


from datasets import load_dataset

original_datasets = load_dataset('yelp_review_full')
original_datasets

DatasetDict({
    train: Dataset({
        features: ['label', 'text'],
        num_rows: 650000
    })
    test: Dataset({
        features: ['label', 'text'],
        num_rows: 50000
    })
})

During training, the model should learn the features that make a review negative or positive so that it can later label reviews it has not seen before. But if a model is trained for too long or on too small a dataset, it may become too specific, or overfit, to the data it was trained on. When this occurs, the model will only be able to accurately label reviews it was trained on, not reviews it has never seen. If the dataset used to train the model were also used to validate its accuracy, it would be difficult to tell whether a high accuracy was due to a well trained model or an overfitted one. But when a different dataset is used for validation, a well trained model's accuracy will still be high, while an overfitted model's accuracy will be low because it fails to label the reviews it has not seen before. Therefore, to detect overfitting, the training dataset must always be different from the validation dataset. The yelp_review_full dataset is already split into a train dataset and a test dataset which can be used for validating the model.
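
The yelp_review_full dataset already ships with separate train and test splits, but if a dataset did not, the datasets library can create one. A small sketch (the 10% test size here is an arbitrary choice):


# Carve a validation split out of the training data
split_datasets = original_datasets['train'].train_test_split(test_size=0.1, seed=42)
split_datasets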


original_train_dataset = original_datasets['train']

original_train_dataset[1:3]

{'label': [1, 3],
 'text': ["Unfortunately, the frustration of being Dr. Goldberg's patient...",
  "Been going to Dr. Goldberg for over 10 years..."]}

original_train_dataset.features

{'label': ClassLabel(names=['1 star', '2 star', '3 stars', '4 stars', '5 stars'], id=None),
 'text': Value(dtype='string', id=None)}

Each item in the original dataset has a text field containing the text of a review and a label field containing the number of stars, from one to five, given to the subject of the review. But I want my model to label reviews as positive or negative, not by number of stars. So for training purposes, I will treat a review as positive if it has three or more stars and negative otherwise. Since the model learns the labels present in the data it is trained on, before using this dataset for training I need to convert the labels from number of stars to positive or negative. I also need to tokenize the review text using the model's tokenizer. Both can be done using the map method of the dataset.

The map method's parameters are the function containing the mapping logic and the batched boolean keyword argument specifying whether the mapping should be done in batches. When batched is true, the function is passed a dictionary where the keys are the fields of the dataset and the values are batch-sized lists of the fields' values. The function also returns a dictionary. If a key in this dictionary matches a field in the dataset, the key's values update the field's values. Otherwise, a new field is added to the dataset for the key and its values.

I define a function named label_text which uses a Python list comprehension to update labels greater than or equal to two, which is equivalent to three stars because the label is zero based, to one for positive and everything else to zero for negative. The label key is used to reference the updated labels in the returned dictionary so that the map method I pass this function to knows to update the label field instead of adding a new field.


def label_text(batch):
  # label >= 2 means three or more stars, so mark it positive (1); otherwise negative (0)
  return {'label': [1 if label >= 2 else 0 for label in batch['label']]}

labeled_datasets = original_datasets.map(label_text, batched=True)
labeled_datasets

DatasetDict({
    train: Dataset({
        features: ['label', 'text'],
        num_rows: 650000
    })
    test: Dataset({
        features: ['label', 'text'],
        num_rows: 50000
    })
})

labeled_datasets['train'][1:3]

{'label': [0, 1],
 'text': ["Unfortunately, the frustration of being Dr. Goldberg's patient...",
  "Been going to Dr. Goldberg for over 10 years..."]}

To tokenize the text field in the datasets, I need to use the same tokenizer that was used during pre-training of the model. I can get it by passing the model name to the from_pretrained method of the AutoTokenizer class. I define a function named tokenize_text which uses the tokenizer to tokenize the text. The truncation boolean keyword argument of the tokenizer is set to true so that it truncates any text that is longer than the maximum length supported by the model. The tokenizer returns a dictionary containing the input_ids and attention_mask keys, which are added to the datasets as new fields. The input_ids key references a list of lists containing the token ids of each text after it has been tokenized. During training, data will be passed to the model in batches. Since the model requires the lists of input_ids in each batch to be of the same length, they are padded to the length of the longest one in the batch using a padding token id, which the model ignores. To let the model know which indexes in input_ids to pay attention to, the attention_mask key references a list of lists with the same dimensions as input_ids, where each index contains a one if the corresponding index in input_ids contains a real token and a zero if it is padding.


from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize_text(batch):
  # truncate any review longer than the model's maximum input length
  return tokenizer(batch['text'], truncation=True)

tokenized_datasets = labeled_datasets.map(tokenize_text, batched=True)
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['label', 'text', 'input_ids', 'attention_mask'],
        num_rows: 650000
    })
    test: Dataset({
        features: ['label', 'text', 'input_ids', 'attention_mask'],
        num_rows: 50000
    })
})

tokenized_datasets['train'][1:3]

{'label': [0, 1],
 'text': ["Unfortunately, the frustration...", "Been going to..."],
 'input_ids': [[101, 6854, 1010, 1996, ...], [101, 2042, 2183, 2000, ...]],
 'attention_mask': [[1, 1, 1, 1, ...], [1, 1, 1, 1, ...]]}

Training the Model

The tokenized datasets do not contain any padding yet. This is because the amount of padding for each item is dependent on the batch it belongs to and the batches are assembled later during training. The class responsible for assembling the batches and padding is named DataCollatorWithPadding. It is initialized with the tokenizer so it knows what token to use for padding.


from transformers import DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
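
To see the padding in action, the collator can be called directly on a few tokenized samples. This is just a quick check and not part of the training flow; the text field is dropped because the collator only handles the numeric fields.


samples = [tokenized_datasets['train'][i] for i in range(3)]
samples = [{k: v for k, v in s.items() if k != 'text'} for s in samples]

batch = data_collator(samples)
{k: v.shape for k, v in batch.items()}  # input_ids and attention_mask padded to the same length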

A model contains layers of weights. The layers near the beginning are trained to identify general features of the input, while the layers near the end, also known as the head, are trained to use those features to do a specific task on the input. For example, in a language model, the layers near the beginning might identify the parts of speech in a sentence, while the head uses this information to classify the sentence. To train a pre-trained model to do a different task, I need to replace its current head with a new head that supports the new task. The weights of the new head are initialized to random values and will be optimized for the new task during training. Since I want to train the distilbert-base-uncased model to label reviews, a task also known as sequence classification, I need to get an instance of this model with a sequence classification head. I can do this by calling the from_pretrained method of the AutoModelForSequenceClassification class. The method's parameters are the name of the pre-trained model, the number of labels the model will use, and the mappings between each numeric label and its human readable name.


id2label = {0: 'negative', 1: 'positive'}
label2id = {'negative': 0, 'positive': 1}

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2, id2label=id2label, label2id=label2id)

During training, the dataloader creates batches from the training dataset using the data collator. The model makes its predictions on a batch. A measure of how far off the predictions are from their expected values, also known as the loss, is calculated. To determine how to change the weights to reduce the loss, the gradient of the loss is calculated. Then the weights are updated based on the gradient. This process is repeated until the model has seen all of the batches in the dataset, which is known as an epoch. At the end of the epoch, performance metrics for the trained model are computed from the predictions it makes on the validation dataset. The Trainer is a high level class which implements these steps when provided with the model, the training and validation datasets, the tokenizer, the data collator, and a function defining how to compute the metrics.
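
To make these steps concrete, here is a minimal sketch of roughly what one epoch would look like if implemented by hand with PyTorch, using the objects defined earlier in the post. The Trainer takes care of all of this (plus evaluation, checkpointing, and more); the optimizer choice, learning rate, and batch size here are arbitrary.


from torch.optim import AdamW
from torch.utils.data import DataLoader

# The dataloader assembles padded batches using the data collator;
# the text column is dropped because the model only needs the numeric fields
train_dataloader = DataLoader(
    tokenized_datasets['train'].remove_columns(['text']),
    batch_size=8,
    shuffle=True,
    collate_fn=data_collator,
)

optimizer = AdamW(model.parameters(), lr=5e-5)

model.train()
for batch in train_dataloader:      # one pass over all batches is an epoch
    outputs = model(**batch)        # forward pass: predictions and loss for the batch
    loss = outputs.loss
    loss.backward()                 # compute the gradient of the loss
    optimizer.step()                # adjust the weights based on the gradient
    optimizer.zero_grad()
    # (in practice the model and each batch would also be moved to the GPU)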

To determine how good my trained model is at labeling reviews, I configure the Trainer to report its accuracy at the end of each epoch, which is the percentage of reviews from the validation dataset that the model labeled correctly. To specify how to compute accuracy, I define a function named compute_metrics which I will provide to the Trainer. This function is passed a tuple containing the predictions and labels for the validation reviews and is expected to return a dictionary containing the accuracy key and its computed value. The first element of the tuple contains the predictions as a list of logits, where each logit is a list of raw scores, one per label. The second element of the tuple contains the actual labels. The Evaluate library provides modules for computing various metrics. To use it to compute accuracy, I load the accuracy module by name and call its compute method with the predicted and actual labels as arguments. Since the compute method expects the predictions argument to be a list of labels, not logits, I have to map each logit to the index of its highest score. The compute method returns a dictionary containing the accuracy key and its computed value, which I return from the compute_metrics function.


import evaluate
import numpy as np

def compute_metrics(valid_set_preds):
    metric = evaluate.load('accuracy')
    logits, labels = valid_set_preds
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

The Trainer is configured using TrainingArguments. Here I configure it to compute metrics for the model and upload the model to Model Hub under my_model_name at the end of each epoch. By default, the model is trained for three epochs.


my_model_name = 'distilbert-base-uncased-finetuned-yelp'

from transformers import TrainingArguments

training_args = TrainingArguments(
    my_model_name, eval_strategy='epoch', save_strategy='epoch', push_to_hub=True
)

The Trainer is created using the arguments prepared earlier.


from transformers import Trainer

trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

To train the model, call the train method of the Trainer. Ideally, loss should decrease and accuracy should increase after every epoch.


trainer.train()
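
Once training finishes, the final metrics on the validation dataset can also be retrieved directly with the Trainer's evaluate method, shown here as an optional check.


trainer.evaluate()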

After training is complete, the final version of the model needs to be manually pushed to the Model Hub using the push_to_hub method of the Trainer.


trainer.push_to_hub()

Using the Model

The model's page on Model Hub contains an inference widget for trying out the model on some input.

To use the model in code, create a pipeline function using the model's fully qualified name and call it with the text to classify.


from transformers import pipeline

classifier = pipeline(task='text-classification', model='vinhanguyen/distilbert-base-uncased-finetuned-yelp')
classifier('The wonton soup is delicious!')

[{'label': 'positive', 'score': 0.9996329545974731}]

classifier('The fried rice is nasty.')

[{'label': 'negative', 'score': 0.971501886844635}]
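
The classifier also accepts a list of texts, which is convenient for labeling several reviews in one call; it returns one result per input.


classifier([
    'The wonton soup is delicious!',
    'The fried rice is nasty.',
])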


Developer's Guide to PostgreSQL Admin on Ubuntu

When developing an app which uses a database, I prefer to have a local database instance to develop against instead of using a shared database on the network or provisioning one in the cloud. This allows me to make changes to the database without interfering with others. But this setup does require some basic database administration knowledge. At a minimum, you should know how to install and set up the database server and how to create and remove users and databases. In this post, I will show you how to do this for PostgreSQL on Ubuntu Linux.

Install

PostgreSQL can be installed using the package manager.


sudo apt install postgresql

This command does the following:

  1. Installs the PostgreSQL server (postgres)
  2. Installs PostgreSQL client apps (psql, createuser, createdb, etc.)
  3. Creates a PostgreSQL user named postgres (with superuser status)
  4. Creates a PostgreSQL database named postgres
  5. Creates a Linux user named postgres

Setup

For local connections, PostgreSQL client apps default to connecting as the database user, and to the database, whose names match the username of the Linux user that ran the app.

To get a PostgreSQL interactive terminal for executing SQL queries as a superuser, use sudo -u to execute psql as the postgres Linux user.


sudo -u postgres psql

To do this without sudo, create a PostgreSQL user with superuser status and a database, both with the same name as your Linux user.

Using SQL:


sudo -u postgres psql

CREATE USER username SUPERUSER;
CREATE DATABASE username OWNER username;
\q
  • username is your Linux username
  • \q quits psql

Using client apps:


sudo -u postgres createuser -s $USER
createdb -O $USER $USER
  • -s gives the user superuser status
  • $USER is an environment variable which holds your Linux username
  • -O specifies the owner of the database

Now you can use psql and the other client apps without sudo.


psql

Create a Database and User

It is common to create a separate database and user for each project.

Using SQL:


psql

CREATE USER username PASSWORD 'password';
CREATE DATABASE dbname OWNER username;
\q
  • username is the database user
  • password is the database user's password
  • dbname is the database name

Using client apps:


createuser username -P
createdb dbname -O username
  • username is the database user
  • -P causes createuser to prompt for the database user's password
  • dbname is the database name
  • -O specifies the owner of the database

Remove a Database and User

Using SQL:


psql

DROP DATABASE dbname;
DROP USER username;
\q
  • dbname is the database name
  • username is the database user

Using client apps:


dropdb dbname
dropuser username
  • dbname is the database name
  • username is the database user

Access a Database

Using psql:


psql -h localhost -U username -W dbname
  • -h specifies the hostname of the PostgreSQL server (localhost)
  • -U username specifies the database user to connect as
  • -W causes psql to prompt for the database user's password
  • dbname is the name of the database to connect to

Using Node.js:


import { Client } from 'pg';

async function hello() {
  const client = new Client({
    host: 'localhost',
    database: 'dbname',
    user: 'username',
    password: 'password',
  });
  await client.connect();

  try {
    const res = await client.query(`SELECT 'Hello world!' as message`);
    console.log(res.rows[0].message);
  } catch (e) {
    console.log(e);
  } finally {
    await client.end();
  }
}

hello();

  • dbname is the database name
  • username is the database user
  • password is the database user's password

Building a Tennis Scorekeeper App

Background

When watching my nephew’s junior tennis matches, I often forget what the score is. After looking at existing tennis scorekeeper apps and not seeing any with the features I wanted, I decided to build my own. In this post, I will talk about the app I built and the design decisions behind it.

Most junior tennis events I've watched play one-set matches, where the first player to win 4 games (with a tiebreak at 3-3) or 6 games (with a tiebreak at 5-5) is the winner. The players spin a racket at the start of the match to determine who serves first. At an event, a player usually plays 3 or 4 sets, each against a different opponent. This differs from pro tennis, where matches are best of 3 sets (or 5 at grand slams) and each set is first to 6 games with a tiebreak at 6-6.

Design

To support both junior and pro style scoring in my app, the user is given the option to play a tiebreak at the start of each game and can start a new set at any time. To support one-set matches, the user is allowed to change the player serving first at the start of each set, which defaults to the receiver from the previous game. Since the user has more options, it is possible to make mistakes, for example starting a new set at 4-0 when playing a 6 game set or giving a point to the wrong player. To handle these cases, I included an undo function. To persist the match state across page reloads and screen locks, all updates to the match state are saved to browser storage.

Implementation

I chose to make the app a client-side web app to leverage the browser's built-in IndexedDB storage for persisting match state. I used React to build the app.

Demo!

The app is available here.

Implementing OpenID Connect in React

Building authentication for a web app from scratch is not a trivial task. A good implementation may require support for complex authentication methods, account management functionality, and secure storage of user credentials. If the app calls a REST API for backend services, a secure mechanism for passing the identity of the user in requests to the API is also required. OpenID Connect Implicit Flow solves these challenges by delegating authentication to a third party and using tokens to encode and transport user identity. In this post, I will explain how it works and demonstrate how to implement it in a React app.

How it Works

In OpenID Connect Implicit Flow:

  1. The app requests that the user authenticate by redirecting the user to a trusted third party, called the Auth Provider.
  2. The user authenticates by any method supported by the Auth Provider.
  3. If successful, the Auth Provider redirects the user back to the app with a signed id_token encoding a set of claims about the user's identity, such as the user's email.
  4. The app includes this id_token in requests to the REST API for authentication.
  5. When the REST API receives a request, it verifies the id_token's signature using the Auth Provider's public key and looks at the id_token's claims to determine the user associated with the request.

The location at the Auth Provider where the app sends the auth request (step 1) is called the authorization_endpoint. Each auth request includes the client_id, which the Auth Provider uses to determine the source of the auth request. The location at the app where the Auth Provider sends the id_token (step 3) is called the redirect_uri. The authorization_endpoint can be found in the Auth Provider's Discovery Document. During registration with the Auth Provider, the app owner sets the redirect_uri and the app is assigned its client_id.

Implementation

In the React app, the oidc module contains functions for sending the auth request (sendAuthReq) and handling the id_token from the auth response (handleAuthResp). It uses the authorization_endpoint, client_id, and redirect_uri defined in the oidc.config module. The TokenProvider component manages the token state and provides it to its child components, which are rendered by the RouterProvider according to the routes defined in its assigned router. The App component, rendered at the index route, contains a login button which triggers authentication by calling sendAuthReq from its click handler. The Callback component, which is mapped to the redirect_uri, calls handleAuthResp to get the id_token and setToken to update the token state. The Claims component gets the token and displays its claims about the identity of the authenticated user.

OIDC module


export const config = {
  authorization_endpoint: 'REPLACE_WITH_AUTHORIZATION_ENDPOINT',
  client_id: 'REPLACE_WITH_CLIENT_ID',
  redirect_uri: 'REPLACE_WITH_REDIRECT_URI',
};
src/auth/oidc.config.ts

import { decodeJwt } from "jose";
import { nanoid } from "nanoid";
import { config } from "./oidc.config";

export function sendAuthReq() {
  const {pathname} = new URL(window.location.href);
  localStorage.setItem('location', pathname);

  const nonce = nanoid();
  localStorage.setItem('nonce', nonce);

  const {authorization_endpoint, client_id, redirect_uri} = config;
  
  const params = new URLSearchParams({
    client_id, 
    response_type: 'token id_token',
    scope: 'openid profile email',
    redirect_uri, 
    nonce
  }).toString();
  const {href} = new URL(`${authorization_endpoint}?${params}`);
  
  window.location.href = href;
}

export function handleAuthResp() {
  const {hash} = new URL(window.location.href);
  if (!hash) {
    throw new Error('No fragment');
  }
  const fragment = new URLSearchParams(hash.substring(1));

  const id_token = fragment.get('id_token');
  if (!id_token) {
    throw new Error('No id_token');
  }
  
  const {nonce: received} = decodeJwt(id_token);
  const sent = localStorage.getItem('nonce');
  if (sent !== received) {
    throw new Error('Nonce mismatch');
  }

  return id_token;
}
src/auth/oidc.ts

Since sendAuthReq can be called from anywhere in the app, it saves the current location to localStorage for the Callback component to retrieve and navigate to after authentication. To protect against replay attacks, a nonce is included in the request, which the auth provider returns in the response as a claim in the id_token. The nonce sent is saved to localStorage for handleAuthResp to retrieve and verify that it matches the nonce received. Since the Auth Provider may support multiple authentication flows, the response_type is set to 'token id_token' to specify that this is a request for Implicit Flow. To request claims about the user's profile and email, the scope is set to 'openid profile email'. In handleAuthResp, the id_token is retrieved from the hash property of the redirect_uri.

TokenProvider


import { decodeJwt } from "jose";
import { createContext, useContext, useEffect, useState } from "react";
import { sendAuthReq } from "./oidc";

export const TokenContext = createContext<null|string>(null);
export const SetTokenContext = createContext((token: null|string) => {});

export default function TokenProvider({children}: any) {
  const [token, setToken] = useState<null|string>(null);

  // restore token from localStorage
  useEffect(() => {
    const stored = localStorage.getItem('token');
    if (stored) {
      const {exp} = decodeJwt(stored);
      if (exp && exp*1000 > Date.now()) {
        setToken(stored);
      }
    }
  }, []);

  // sync token with localStorage
  useEffect(() => {
    if (token) {
      localStorage.setItem('token', token);
    } else {
      localStorage.removeItem('token');
    }
  }, [token]);

  // schedule auth when token expires
  useEffect(() => {
    if (token) {
      const {exp} = decodeJwt(token);
      if (exp) {
        const expiresIn = exp*1000-Date.now();
        const id = setTimeout(sendAuthReq, expiresIn);
        return () => clearTimeout(id);
      }
    }
  }, [token]);

  return (
    <TokenContext.Provider value={token}>
      <SetTokenContext.Provider value={setToken}>
        {children}
      </SetTokenContext.Provider>
    </TokenContext.Provider>
  );
}

export function useToken() {
  return useContext(TokenContext);
}

export function useSetToken() {
  return useContext(SetTokenContext);
}
src/auth/TokenProvider.tsx

The token state and its setter are provided to TokenProvider's children using Context. The useToken and useSetToken custom Hooks are created to encapsulate the use of Context and simplify access to this state. To prevent the user from having to re-authenticate after reloading the app, the first two Effects sync the token state with localStorage. The third Effect schedules re-authentication when the token expires.


import React from 'react';
import ReactDOM from 'react-dom/client';
import { RouterProvider } from 'react-router-dom';
import TokenProvider from './auth/TokenProvider';
import './index.css';
import { router } from './router';

const root = ReactDOM.createRoot(
  document.getElementById('root') as HTMLElement
);
root.render(
  <React.StrictMode>
    <TokenProvider>
      <RouterProvider router={router} />
    </TokenProvider>
  </React.StrictMode>
);
src/index.tsx

The TokenProvider wraps the RouterProvider to make its Context available to the components rendered by the RouterProvider.

Routing


import { createBrowserRouter } from "react-router-dom";
import App from "./App";
import Callback, { loader as callbackLoader } from "./auth/Callback";
import Claims from "./Claims";
import Root from "./Root";

export const router = createBrowserRouter([
  {
    path: '/',
    element: <Root />,
    children: [
      {
        index: true,
        element: <App />
      },
      {
        path: 'callback',
        element: <Callback />,
        loader: callbackLoader
      },
      {
        path: 'claims',
        element: <Claims />
      }
    ]
  }
]);
src/router.tsx

The Callback component handles the auth response so its path must match the redirect_uri.


import { Outlet } from "react-router-dom";
import Nav from "./Nav";

export default function Root() {
  return (
    <>
      <Nav />
      <Outlet />
    </>
  );
}
src/Root.tsx

import { NavLink } from "react-router-dom";
import styles from './Nav.module.css';

export default function Nav() {
  return (
    <ul className={styles.nav}>
      <li>
        <NavLink 
          to={'/'} 
          className={({isActive}) => isActive ? styles.active : ''}
        >
          App
        </NavLink>
      </li>
      <li>
        <NavLink 
          to={'/claims'} 
          className={({isActive}) => isActive ? styles.active : ''}
        >
          Claims
        </NavLink>
      </li>
    </ul>
  );
}
src/Nav.tsx

Login and Logout


import './App.css';
import { sendAuthReq } from './auth/oidc';
import { useSetToken, useToken } from './auth/TokenProvider';

export default function App() {
  const token = useToken();
  const setToken = useSetToken();

  function login() {
    sendAuthReq();
  }

  function logout() {
    setToken(null);
  }

  return (
    <div>
      {token ? (
        <button onClick={logout}>Logout</button>
      ) : (
        <button onClick={login}>Login</button>
      )}
    </div>
  );
}
src/App.tsx

The user is logged in if the token is not null. To log in, call sendAuthReq from the login button's click handler. To log out, set the token to null from the logout button's click handler.

Callback component


import { useEffect } from "react";
import { useLoaderData, useNavigate } from "react-router-dom";
import { handleAuthResp } from "./oidc";
import { useSetToken } from "./TokenProvider";

export default function Callback() {
  const {id_token}: any = useLoaderData();
  const setToken = useSetToken();
  const navigate = useNavigate();

  useEffect(() => {
    setToken(id_token);
    navigate(localStorage.getItem('location') || '/');
  }, [id_token]);

  return null;
}

export function loader() {
  const id_token = handleAuthResp();
  return {id_token};
}
src/auth/Callback.tsx

Because handleAuthResp contains side effects (window.location, localStorage), it has to be called from the Callback's loader instead of its render function. In the Effect, the token state is set to the id_token from the auth response and the app navigates to the location before authentication.

Claims component


import { decodeJwt } from "jose";
import { useToken } from "./auth/TokenProvider";

export default function Claims() {
  const token = useToken();

  let claims: any = {};
  
  if (token) {
    claims = decodeJwt(token);
  }

  return (
    <>
      {token ? (
        <>
          <table border={1}>
            <thead>
              <tr>
                <th>Claim</th>
                <th>Value</th>
              </tr>
            </thead>
            <tbody>
              {Object.entries(claims).map(([name, value]: any) => (
                <tr key={name}>
                  <td>{name}</td>
                  <td>{value}</td>
                </tr>
              ))}
            </tbody>
          </table>
        </>
      ) : (
        <p>No token</p>
      )}
    </>
  );
}
src/Claims.tsx

The useToken hook is used to get the token. The decodeJwt function from the jose module is used to get the token's claims.

Demo!

Redux: My Initial Thoughts

When building non-trivial apps where more than a few components need access to shared state, I like to use a state management framework to keep my state and state update logic in a central and standardized location outside of any component. This allows my components to directly access state instead of having to read and update it through chains of input params and callbacks.

In Angular, my preferred state management framework is NgRx. When searching for a similar framework to use with React, I initially looked at React's built-in useReducer, but it lacked a mechanism for handling side effects similar to Effects in NgRx, so I ended up choosing Redux. In this post I will share my initial thoughts on Redux from the perspective of an NgRx user, after reading the official docs and refactoring my Timesheet app from my previous post to use Redux.

Setup

Project setup was very easy using create-react-app with the redux-typescript template. It installed Redux Toolkit, which provides some abstractions and helpers to make it easier to work with Redux. The included sample Counter feature was helpful, as I used it as a reference for implementing my own features.

Reducers

Creating Reducers in Redux was similar to NgRx. Each feature has its own Reducer, referred to as a Slice in Redux, which encapsulates the state, actions, and state update logic for that feature of my app. I liked how Redux is able to infer the Actions and their payloads from the method signatures of my Reducer functions, so I do not have to manually define them like in NgRx.

Side Effects

NgRx and Redux differ the most in how they implement side effects such as fetching data from the server. In NgRx, the Effect function containing the side effect logic and the Action which triggers it are separate entities, while in Redux both concerns are combined into a single entity in the form of a Thunk function. In my opinion, NgRx's implementation is a bit more intuitive because it allows me to think of everything in terms of Actions instead of Actions or Thunks, although the tradeoff for decoupling the Action and Effect in NgRx is writing a bit more code. I also think the learning curve for NgRx is higher because it uses RxJS to write Effects, although this might not be an issue for Angular developers.

Selectors

Selectors are implemented similarly in Redux and NgRx.

Final Thoughts

Learning Redux seemed daunting at first given the amount of documentation available on it, but I was able to get up to speed pretty quickly by going through the tutorial in the official docs, perhaps because I had already used NgRx. Overall, I like the framework and will be using it in my future React projects.

My Journey to React and Initial Reactions

React is one of the most popular front end web frameworks and is something I've wanted to learn for a while now. I recently had some down time and decided to finally learn it. I started my journey to React about two months ago with the goal of learning core React and building a simple app to validate my knowledge. In this post, I will describe my journey and initial reactions from the perspective of an Angular developer.

My first task was to determine how to learn React. After some preliminary research, I discovered that there are two ways to build components in React: the class method and the functional method. The former is the older, more established way of building React components. The latter is relatively new but appears to be the preferred method going forward. I chose to learn the functional method. There are not many books on building React apps with functional components to choose from. I picked Learning React from O’Reilly. The book started off well, but I got a bit lost about halfway through. I decided to switch to the official React beta documentation, which turned out to be excellent! Even though it was beta, it appeared mostly complete. The challenges at the end of each section were also very helpful for applying the concepts in practice.

For my first React app, I chose to re-implement a timecard app I built in Angular a few years ago, with a small enhancement. I originally built the app to help me track my time on various hourly jobs. The app allows me to punch in and out of jobs and has a cool feature which displays updated totals for time and money every second. The initial version used the browser’s localStorage for data storage. For the React version, I plan to use IndexedDB, which is a key/value style NoSQL database inside the browser.

Project setup was very easy with Create React App. This tool generated my project directory with a placeholder App component and start and build scripts. It is similar to running ng new in Angular CLI. I wanted to use TypeScript in my React project so I added the typescript template parameter when running the create-react-app command. I was able to open the project in my VS Code editor and use syntax highlighting and auto completion without additional plugins.

React feels simpler, less verbose, and a bit more low-level and less abstracted than Angular. There is no two-way data binding, forms abstraction, or dependency injection. Each component is defined using a plain JavaScript function that encapsulates state and event handlers and returns a view written in an HTML-like syntax called JSX. The life cycle of a component is also very simple: whenever the state changes, the view re-renders itself. Side effects like making HTTP requests or invoking browser APIs cannot occur during rendering. They can only occur in event handlers or special functions called effects, which can be configured to run on component load or when any subset of the component’s state changes. React does include built-in support for reducers to centralize the management of application state and state changes, but I did not find them very useful because they lack support for triggering side effects in response to actions, like NgRx effects do. React does have a very useful third party component library similar to Angular Material called MUI, which I used to make the app’s UI look nice.

I finished building the app in about two weeks and deployed it using Azure's Static Website service. The app is live here.

I enjoyed building my first React app and am happy to add React to my front end development toolbox. I plan to use it in future projects.
