<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[thoughtsre]]></title><description><![CDATA[thoughtsre is the technical imprint of Binghao, serving as a nexus for deep-dive technical blogs and the code that powers them]]></description><link>https://blog.thoughtsre.com</link><image><url>https://substackcdn.com/image/fetch/$s_!wPFO!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2900c87-3d1b-4b1e-83ca-b8f9a13baef5_256x256.png</url><title>thoughtsre</title><link>https://blog.thoughtsre.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 05 May 2026 13:07:55 GMT</lastBuildDate><atom:link href="https://blog.thoughtsre.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Binghao Ng]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[thoughtsre@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[thoughtsre@substack.com]]></itunes:email><itunes:name><![CDATA[Binghao Ng]]></itunes:name></itunes:owner><itunes:author><![CDATA[Binghao Ng]]></itunes:author><googleplay:owner><![CDATA[thoughtsre@substack.com]]></googleplay:owner><googleplay:email><![CDATA[thoughtsre@substack.com]]></googleplay:email><googleplay:author><![CDATA[Binghao Ng]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Make working with git worktrees easier using direnv and sops]]></title><description><![CDATA[Stop copy-and-pasting env vars or setting static paths to secret files]]></description><link>https://blog.thoughtsre.com/p/make-working-with-git-worktrees-easier</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/make-working-with-git-worktrees-easier</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 21 Mar 2026 08:44:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!f9AD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f9AD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f9AD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 848w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 1272w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f9AD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:737416,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/191646442?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f9AD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 848w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 1272w, https://substackcdn.com/image/fetch/$s_!f9AD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f6c2868-3c2e-42e3-999e-17723e7019a3_2752x1536.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated by Gemini Nano Bananas</figcaption></figure></div><p>I love working with Claude using <a href="https://git-scm.com/docs/git-worktree">git worktrees</a>. They allow me to work on multiple features concurrently on the same code base. In fact, worktrees are the recommended way of working on multiple things in parallel with code. But I quickly run into problems with persisting environment variables and allowing my code to find secrets, both of which I do not commit to git. My default response would have been to copy-and-paste the <code>.env</code> files into the worktree and to hard code a path to some secrets file stored separately (to wit you will start seeing ugly things like <code>Users/binghao/&#8230;/secrets.yaml</code> in the codebase. Yuck!). If I were feeling a bit fancier, I&#8217;d probably run a secrets vault or use a secrets manager on AWS for my secrets. But that&#8217;s just additional infrastructure I would have to run and can be a hassle for small hobby projects.</p><p>This is where Claude led me to <code>direnv</code> and <code>sops</code> which help me overcome both problems, respectively. In this post, I will talk about what the above two things are, how I use them and what are some of the gotchas. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.thoughtsre.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h1><code>direnv</code></h1><h2>What it is</h2><p><code>direnv</code> is a tool that allows me to <a href="https://direnv.net">automatically load environment variables</a> whenever I enter a directory. It also <em>unloads</em> the environment variables when I leave the directory so that I don&#8217;t get stuck with stray environment variables when I go and do other stuff within the same shell.</p><h2>How I use it</h2><p>Once <code>direnv</code> is <a href="https://direnv.net/docs/installation.html">installed</a>, everything starts with a <code>.envrc</code> file in your repository root. In my case, I only have a very simple directive within the <code>.envrc</code> file, which is as follows,</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">dotenv_if_exists &lt;dir_of_dotenv_file&gt;/.env</code></pre></div><p>This states that whenever I enter the directory, if the dotenv file exists at the stated path, load it.</p><p>Then of course, I will have my dotenv files with my environment variables such as database ports, etc. listed in there.</p><p>Now each time I enter my git worktree, the environment variables will be automatically loaded.</p><h2>Gotchas</h2><ul><li><p>When you first enter a newly created worktree, <code>direnv</code> won&#8217;t be able to load the environment variables until you type <code>direnv allow</code> to explicitly allow the loading of the <code>.env</code> file.</p></li><li><p>In a deployment situation, you would still need to enable your CI/CD framework to find the environment variables and set it on deploy. The <code>direnv</code> documentation gives an example of how it can be done with <a href="https://direnv.net/docs/github-actions.html">GitHub Actions</a>.</p></li></ul><h2>Other thoughts</h2><ul><li><p>I really like the fact that the environment variables are loaded and unloaded <em>automatically</em> without me having to manually source certain files or add in code to load the <code>.env</code> file. It saves me the time and frustration needed to ensure that the environment variables are properly loaded with each new worktree.</p></li><li><p><code>direnv</code> includes <a href="https://direnv.net/man/direnv-stdlib.1.html">many other directives</a> that can enable much richer behaviour than what I&#8217;ve explained here.</p></li></ul><div><hr></div><h1><code>sops</code></h1><h2>What it is</h2><p><code>sops</code> (aka <strong>S</strong>ecrets <strong>OP</strong>eration<strong>S</strong>) is <a href="https://github.com/getsops/sops">tool that is used to edit encrypted files</a> (JSON, YAML, ENV, etc.). It supports encryption with &#8220;AWS KMS, GCP KMS, Azure Key Vault, HuaweiCloud KMS, age, and PGP&#8221;.</p><h2>How I use it</h2><p>Basically, <code>sops</code> allows me to store my secrets file in the git repo in a way such that the secret names are visible but the secrets themselves are encrypted. In this way, I can have quick reference of the secret key names while I&#8217;m working on the code but at the same time have the secret safe enough to commit to git. An example of the secrets YAML file that is encrypted by <code>sops</code> is shown below.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">aws:
    bearer_token_bedrock: ENC[AES256_GCM,data:gIibPeeTGYY4npuXKl7Ogjz9+r//mNzu5u0GvAZm9bghFgn24+XTqud3wyz+76OB2NRgkYRm3BruMDFdC4tTwTWeiCg0zQ5k5Tl+4ZG+/rZsoqs1Xr110F7ZQ10STq0LxxIrRpQSKqgaOZURiDJKXNBXwyzqJNdSzu8WRDis3wyqDDw6,iv:lycm17nnQAQDCmv2ePV7sF6oF/Tg0y9GMUjhNJWrMDs=,tag:C7oHCfV+DxfhcCsUnZ1YpA==,type:str]
    region: ENC[AES256_GCM,data:PDulQJpkP3Ir9nw3aA0=,iv:PMiFy9KGAU0ofW7ilF20iRD1zHS8baUZU1ki81rvOY0=,tag:opBsmKCbKfB8lVhl8RykCQ==,type:str]
    bedrock_model_id: ENC[AES256_GCM,data:u8dnWmGXO2yrWzYALTJx5a4Cq/WqUX3ALD6sEcZ2DDa2CZQOUeRIfWpLrUxr4uU=,iv:M9uJJTxooRo7lAuhmpPkp5of6QzuiNC3n0i3kfVRH4I=,tag:qRt/d1HLgoXMjGIMApm8bw==,type:str]</code></pre></div><p>As can be seen, I can still see how to reach the secrets as the variable <em>names</em> are still visible to me but the values are now encoded gibberish.</p><p>In my case, I use <code>age</code> to encrypt my secrets as it is a local development project. I start by creating a key file in the same location as my <code>.env</code> file, say <code>~/.config/projectyXYZ/age-key.txt</code>, by issuing the following command. This step also <strong>outputs a public key which needs to be noted down.</strong></p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">age-keygen -o ~/.config/projectyXYZ/age-key.txt
chmod 600 ~/.config/projectyXYZ/age-key.txt</code></pre></div><p>Then I set the environment variable <code>SOPS_AGE_KEY_FILE</code> in my <code>.env</code> file which gets loaded by <code>direnv</code> each time I enter the repository. This tells <code>sops</code> to use this key file to decrypt my secrets file.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">SOPS_AGE_KEY_FILE=~/.config/projectXYZ/age-key.txt</code></pre></div><p>In my repo root, I have another <code>.sops.yaml</code> file that looks like this.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;yaml&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-yaml">creation_rules:
  - path_regex: secrets/.*\.enc\.yaml$
    age: &lt;public key from key creation step&gt;</code></pre></div><p>This tells <code>sops</code> that any file with the path name according to the regex pattern should be decrypted with the <code>age</code> key using the <code>SOPS_AGE_KEY_FILE</code> and the public key.</p><p>To edit your secret values, you can&#8217;t just open the file and type. As mentioned, the values are encrypted. You need to use <code>sops &lt;path_to_secrets_file&gt;.yaml</code>, which will use <code>sops</code> to decrypt the file and open in the default editor. Once you save and exit, the values will be decrypted again.</p><p>To simply see the secrets, type <code>sops -d</code> <code>&lt;path_to_secrets_file&gt;.yaml</code>.</p><p>I also have a script (courtesy of Claude), called <code>with-secrets</code>, that would take the secrets file, decrypt it, reformat the key names (replace &#8220;.&#8221; with &#8220;_&#8221; and all uppercase) and exports the keys and values as environment variables, shown below.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:null}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">#!/usr/bin/env bash
#
# Decrypt SOPS-encrypted secret files and exec a command with the
# decrypted values exported as environment variables.
#
# Usage: scripts/with-secrets &lt;secret-file&gt; [&lt;secret-file&gt;...] -- &lt;command&gt; [args...]
#
# Each YAML file is decrypted via `sops -d` (never touches disk) and
# flattened: nested keys become UPPER_SNAKE_CASE env vars.
#   aws.access_key_id  -&gt;  AWS_ACCESS_KEY_ID
#   aws.region         -&gt;  AWS_REGION
#
# Requirements: sops, yq, age key configured via SOPS_AGE_KEY_FILE

set -euo pipefail

die() { echo "error: $*" &gt;&amp;2; exit 1; }

# --- Preflight checks -------------------------------------------------------

command -v sops &gt;/dev/null 2&gt;&amp;1 || die "sops is not installed (brew install sops)"
command -v yq   &gt;/dev/null 2&gt;&amp;1 || die "yq is not installed (brew install yq)"

if [[ -z "${SOPS_AGE_KEY_FILE:-}" ]]; then
  die "SOPS_AGE_KEY_FILE is not set. Add it to your env vars."
fi

if [[ ! -f "$SOPS_AGE_KEY_FILE" ]]; then
  die "Age key file not found: $SOPS_AGE_KEY_FILE"
fi

# --- Parse arguments ---------------------------------------------------------

secret_files=()
while [[ $# -gt 0 ]]; do
  case "$1" in
    --)
      shift
      break
      ;;
    *)
      [[ -f "$1" ]] || die "Secret file not found: $1"
      secret_files+=("$1")
      shift
      ;;
  esac
done

[[ ${#secret_files[@]} -gt 0 ]] || die "No secret files specified"
[[ $# -gt 0 ]]                  || die "No command specified after --"

# --- Decrypt and export ------------------------------------------------------

for secret_file in "${secret_files[@]}"; do
  # Decrypt to stdout, flatten nested YAML to KEY=value lines.
  # yq outputs "section.key = value"; we convert dots to underscores
  # and uppercase everything.
  while IFS='=' read -r key value; do
    # Skip empty lines
    [[ -z "$key" ]] &amp;&amp; continue
    # Trim whitespace
    key="$(echo "$key" | xargs)"
    value="$(echo "$value" | xargs)"
    # Convert dots to underscores, uppercase
    env_name="${key//./_}"
    env_name="${env_name^^}"
    export "$env_name=$value"
  done &lt; &lt;(sops -d "$secret_file" | yq -r 'to_entries | .[] | .key as $section | .value | to_entries[] | "\($section).\(.key)=\(.value)"')
done

# --- Exec the command (replaces this shell process) --------------------------

exec "$@"</code></pre></div><p>I use it as such, <code>with-secrets secrets/secrets_file.yaml -- scripts/app_you_want_to_run</code>. The front half decrypts the secrets and exports the environment variables and the second part is taken as an argument and run in the last line of <code>with-secrets</code>.</p><h2>Gotchas</h2><ul><li><p>In my case, the age key file is the key to the secrets, if I lose it, then the secrets cannot be recovered. Although, if I do lose the key file, I should probably reinstantiate all the secrets anyway.</p></li><li><p>You can use external key management systems like AWS KMS, but that will be an external infrastructure you have to manage. And if you are indeed using something like AWS KMS, you might as well use it directly. For a local project though, <code>sops</code> works well.</p></li><li><p>There is no automatic rotation for the secrets values (as what AWS KMS does for you) as the secrets resides within a file, except that outsiders cannot view it.</p></li></ul><h2>Other thoughts</h2><ul><li><p>I really like the fact that I can see the structure of the key file while coding without worrying that the secrets will be leaked. For example, I can know that my AWS Bedrock token can be reached at <code>secrets[&#8220;aws&#8221;][&#8220;bearer_token_bedrock&#8221;]</code> by looking at the encoded secrets file.</p></li><li><p>I think it works will for a small-medium sized individual project as it provides the right level of security with minimal setup and infrastructure.</p></li></ul><div><hr></div><p>Both tools are of course much more flexible and powerful than what I give them credit for in this post. I hope you found this post useful and are encouraged to dive deeper on how <code>direnv</code> and <code>sops</code> can be integrated into your workflows to make your life easier.</p><p>Please feel free to post your comments/thoughts or any questions in the comments section of this blog post.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[/think before you /plan]]></title><description><![CDATA[I'm really enjoying the thinking partner mode in Claude Code]]></description><link>https://blog.thoughtsre.com/p/think-before-you-plan</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/think-before-you-plan</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 10 Mar 2026 01:00:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2E-c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2E-c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2E-c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 848w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 1272w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2E-c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:577367,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/190169505?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2E-c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 848w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 1272w, https://substackcdn.com/image/fetch/$s_!2E-c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1aa2211c-be14-4d3f-a007-05f4df7d6bf4_2752x1536.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated with Gemini Nano Bananas.</figcaption></figure></div><p>AI coding has shifted the emphasis in software development from the <em>act of writing code</em> to <em>designing software applications</em>. This means that the focus is on clear headed thinking and well crafted application design. Not that good design wasn&#8217;t important before, it&#8217;s just that with the execution of coding now largely taken out of our hands by the likes of Claude, the center of gravity when it comes to developing a &#8220;good&#8221; application lies in the thinking. In the past, even if you had a great design, your application might still turn out bad due to less-than-desirable execution from a less-than-skilled developer.</p><p>Now when using Claude Code, you will often hear people say, &#8220;Always start with the Plan Mode&#8221;. I don&#8217;t disagree. But I find that there is a <code>/think</code> step before the <code>/plan</code> step that helps me to understand my work a lot more and prevents me from giving up ownership of my project (ever so slightly) to Claude.</p><p>The thing is, Claude&#8217;s <code>/plan</code> Mode, assumes that there is a task at hand and once you plan, the next step is to execute. But sometimes you don&#8217;t want to do that. Sometimes you want to ruminate a bit. Sometimes you want to question whether what you are doing is right. Or just explore multiple possibilities.</p><p>In Claude&#8217;s <code>/plan</code> mode, after Claude asks you the questions it wants to ask, it just want to proceed to the execute phase. Even if you are not in <code>/plan</code> Mode, the default mode sometimes is still eager to <em>do</em> things. </p><p>This is why I have a <code>/think</code> skill (see below) in my <code>.claude/skills</code> folder, which I use mainly in three scenarios.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;7f032dce-6fec-406a-a7e9-d3c50980a80f&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown">---
model: opus
---

# Thinking Partner Mode

You are now in thinking partner mode. This is a discussion session &#8212; optionally grounded in the actual codebase.

## Rules

- You MAY use read-only tools: Read, Glob, Grep, and Task (Explore agents only)
- DO NOT use write or execute tools (Edit, Write, Bash, NotebookEdit, etc.)
- DO NOT write or modify any code
- Explore relevant code when it would inform the discussion

## What to do

- Ask clarifying questions to understand the problem
- Offer frameworks, mental models, and structured thinking
- Challenge assumptions and explore tradeoffs
- When the discussion involves code, read relevant files to ground your thinking
- Help organize and refine ideas
- Suggest options without committing to implementation

## Exiting this mode

When the user says "let's implement", "now code", "start coding", or similar - acknowledge the mode switch and begin using tools as needed.</code></pre></div><div><hr></div><h2>Scenario 1: Greenfield project ideation</h2><p>This is the most obvious scenario. You have an idea that&#8217;s bubbling in your head and you want to make it real. You create a new repo and enter <code>/think</code> mode and just start talking and sculpting out your idea.</p><p>The thing is, this mode won&#8217;t keep asking you to create a <code>CLAUDE.md</code> or ask you whether it can edit files, etc.. It will just talk and reflect on your conversation and push back on your ideas to make you think twice or suggest ways of doing things that you haven&#8217;t thought of before.</p><p>Personally, this works very well for me as I&#8217;m the type who likes to &#8220;talk&#8221; out my thinking.</p><p><em><strong>What happens after </strong></em><code>/think</code><em><strong>-ing:</strong></em><strong> </strong>I&#8217;ll ask Claude to summarise our conversation and update <code>CLAUDE.md</code> or the documentation in the <code>docs</code> folder. Usually at the start of a greenfield project, I&#8217;ll just put everything in <code>CLAUDE.md</code> until it reaches around 500 lines or when I&#8217;m done and ready to start implementing, then I&#8217;ll split the information into separate documents in the <code>docs</code> folder.</p><h2>Scenario 2: Speccing MVPs or Milestones</h2><p>When I have the overall idea of the project sculpted out, I break it down into MVP stages or milestones. These are big chunks of work that bring the idea through various stages of maturity. I don&#8217;t do all the MVPs that take the project to finish all at once. I do them one stage a time. Giving myself time to reflect and improve after each stage.</p><p>In this scenario, Claude is armed with the high-level thinking from Scenario 1. <code>/think</code> mode then helps me identify parts of the MVP specs that are not in line with the overall design or whether I&#8217;m being too ambitious (or not ambitious enough). </p><p><em><strong>What happens after </strong></em><code>/think</code><em><strong>-ing:</strong></em> I&#8217;ll ask Claude to create a GitHub issue that will contain the  MVP specs and act as the parent issue to the later development tasks. Claude will also figure out the key phases of the MVP development. After which, I can continue to refine the MVP specs or ask Claude to divvy up the work into GitHub sub-issues.</p><h2>Scenario 3: Understanding and refining development tasks</h2><p>Once I have the fine-grained GitHub issues that specify individual development tasks, I still use <code>/think</code> to ask questions about the task and test my understanding of why certain things are done.</p><p>In this way, I don&#8217;t entirely lose control of the project (at least I don&#8217;t <em>feel</em> like I lose control) and it helps me understand the code that was written when I review it.</p><p><em><strong>What happens after </strong></em><code>/think</code><em><strong>-ing: </strong></em>I proceed with <code>/plan</code> mode as recommended and implementation.</p><div><hr></div><p>I know it seems like the things I do with <code>/think</code> mode seems like what can be done with the default Claude mode. But as I mentioned, in <code>/think</code> mode, Claude is not so eager to get to the <em>doing</em>. Also, as can be seen in the skill markdown, <code>/think</code> mode can be embedded with a personality that helps with ideation.</p><p>At first, I constructed the <code>/think</code> mode because I got irritated with Claude constantly asking me whether it can start implementing things. But over time, I found myself naturally starting in <code>/think</code> before I do anything. I really enjoy the way the <code>/think</code> mode pushes my thinking, challenges my assumption and allows me to fill up more details in my project design before batting down execution plans. In fact, even when I do have the task specifications, I find myself chatting for a little bit in <code>/think</code> before starting so that Claude can have slightly more context beyond the immediate task or make some last minute adjustments before starting.</p><p><code>/think</code> &#8212;&gt; <code>/plan</code> &#8212;&gt; <code>/implement</code>&#8230;</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Thoughts on AI at the end of 2025]]></title><description><![CDATA[At this moment, the AI revolution feels more like Google Maps than the invention of the wheel]]></description><link>https://blog.thoughtsre.com/p/thoughts-on-ai-at-the-end-of-2025</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/thoughts-on-ai-at-the-end-of-2025</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Thu, 25 Dec 2025 09:18:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vah6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vah6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vah6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!vah6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!vah6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!vah6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vah6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png" width="1456" height="618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5669066,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/182549084?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vah6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 424w, https://substackcdn.com/image/fetch/$s_!vah6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 848w, https://substackcdn.com/image/fetch/$s_!vah6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!vah6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a7f4f2-aaef-4919-8f32-15a8ca114a45_3168x1344.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated by Gemini + Nano Bananas</figcaption></figure></div><p>Let me start by saying that I am using the term &#8220;AI&#8221; like most people would in 2025. To more specific, AI in this article refers to a mash-up of Large Language Models (LLMs), generative AI (GenAI) not just for language but also for images, videos and audio as well as Agentic AI mediated by transformer-based LLMs. (Read the last one as &#8220;Agentic AI mediated by transformer-based LLMs&#8221; not just &#8220;Agentic AI&#8221; since the term &#8220;Agentic AI&#8221; can mean a lot more that what we are seeing). Let me also start with the end, which is that the AI revolution to me, at this moment, feels more like a &#8220;Google Maps moment&#8221; rather than an &#8220;invention of the wheel&#8221; event. </p><p>For the most part of 2025, I've been consciously using AI in my work and daily life. I&#8217;ve used Gemini for research for work and holiday research. Nano Bananas to generate images for my blogs. ChatGPT to construct a <a href="https://thoughtsre.substack.com/p/constructing-a-chatgpt-learning-companion?r=2idqpb">learning companion</a> for my daughter. NotebookLM to do <a href="https://thoughtsre.substack.com/p/agent-dexter?r=2idqpb">deep dives</a> into technical topics. Taking the agentic AI for research use case further, I&#8217;ve also designed a Claude Code agent to extract financial data to conduct market analysis. GitHub Copilot and Claude Code were my tools of choice which I used extensively for coding tasks, both personal and at work. I&#8217;ve even tried my hand at writing <a href="https://thoughtsre.substack.com/p/agent-dexter?r=2idqpb">my own AI agent</a>, just for kicks.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Throughout my experiments I was on a lookout for what makes this latest technology revolution special. I was interested in it because I wanted to find out what would be its impact on not only my career but also how it would impact my daughter&#8217;s future. </p><p>The first thing I noted is that AI, like any tool, needs to be learnt and practised with in order to derive benefit. I had to learn how to prompt it and how to release information to it step by step. At times, it felt like I was coaching an intern. This is also remarked by many others. Like any intern, they are also prone to missing details and steps. For example, my market analysis agent is able to performs most of the task but keeps missing minor steps here and there.</p><p>Next, I found AI a fantastic learning partner <em>on topics that I have some knowledge on</em>. I was trying to learn Active Inference and Variational Message Passing. Solely reading papers was not working very well for me as sifting through and deciphering information from various authors required holding a great deal of information in my head. I was able to use AI as a sparring partner and a knowledgeable tutor. It&#8217;s quite another story for things that I&#8217;m not too familiar with though. An example is the financial market analysis agent that I was trying to come up with. I was not able to ask questions about whether the analysis that AI was doing is correct. When AI came up with a suggestion, I can&#8217;t help but feel suspect.</p><p>Another related observation is that AI is great at boilerplate work, especially for coding work. AI saved me a lot of work in navigating the peculiarities of software libraries and frameworks. It got me to a starting point for a new software project really quickly, like setting up the project scaffold and writing tedious API specs and schemas. It was also tremendously helpful in helping me debug infrastructure issues like cloud configurations or deployment script errors. One thing that slightly caught me off guard is how much additional context engineering that is required to get consistent results from AI for software development.</p><p>However, when it came to work that required creative thinking or iterative experimentation such as data analysis, I found AI a little lacking. Not lacking in the sense that AI can&#8217;t generate outputs. But &#8220;lacking&#8221; in the sense that, 1) I don&#8217;t feel like I&#8217;m any wiser after the fact since AI did the thinking for me and 2) for tasks where I didn&#8217;t have any &#8220;mechanical issues&#8221; like for data analysis where I know the tooling and syntax quite well, I felt like I had to do a lot to get the results I would have gotten really quickly. In short, for tasks that the main obstacle was realising the idea I have in my head with tooling that I&#8217;m decently well-versed in, AI was not much help for me.</p><p>Lastly, the latest AI revolution has <a href="https://thoughtsre.substack.com/p/the-real-change-is-at-the-man-machine?r=2idqpb">changed the man-machine interface</a> and has automated a lot of work that was previously impossible, this has shifted the emphasis of value creation heavily to critical thinking. To paraphrase a big-data age saying, <em>garbage thinking, garbage output</em>. In the software development space, it means more effort is spent designing the infrastructure architecture, application functional modules and evolving the software development lifecycle to help product owners come up with task specifications faster and incorporating a more rigorous review process. AI doesn&#8217;t take away effort in thinking and learning fundamental concepts, it <em>emphasises</em> it since the effort of execution has been removed from you.</p><p>As I reflect on the points above, the closest historical analogy to the current AI that came to my mind was Google Maps. For those who remember the days pre-Google Maps, people had to call, ask around or fiddle around with paper maps, if they could get their hands on one. Even then, people got lost a lot and waited around for others a lot. Google Maps changed that. Nowadays, I can drop into any decently developed city and navigate my way around. Google Maps also enabled a whole slew of new economic activities like deliveries and small business advertising. It became my go-to tool for finding out where things are and how to get to places. It brought me a great deal of convenience and it would be hard to go back to a state where I don&#8217;t have Google Maps. However, it did not simply automate my life away or make the world come crashing down. It did however diminish my ability to navigate without preplanned routes and instructions.</p><p>Now, I&#8217;m not saying AI is the same as Google Maps (I said &#8220;closest analogy&#8221;). However I do believe that when the dust settles, people might think of 2025&#8217;s AI like how I think of Google Map in 2025. I know it doesn&#8217;t feel like it given the frenzy surrounding AI. But like what Howard Marks says, people often alternate between perceiving reality as being extremely hot or cold when it&#8217;s more often than not somewhere in between. The current AI is not fully autonomous or deterministic. Neither is it creative without humans. It is however very good at doing well-scoped tasks that were previously &#8220;un-automatable&#8221; due to pattern recognition requirements that used to be considered &#8220;human-level&#8221;. Lots of new economy will be built on top of this. Lots of old economy might be destroyed. It will in one shape or another become an indispensable part of everyone&#8217;s daily toolkit (though how long this will take is another important question, especially for the markets) and it will shape our behaviour. </p><p>But it does not make us less human. In fact, it emphasises the part of us that is uniquely human, our ability to think and create. Yes, there inherently a danger that AI in its current form, wielded by big Tech, might take away people&#8217;s ability to think for themselves, case in point, social media. That will be true. But it will also be true that it will force the conversation on the meaning of work and the meaning of creativity on society. And as always, we end up somewhere in the middle. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[A thought experiment on vibe-coding and the future of personal computing]]></title><description><![CDATA[Let me first state up front that this is not a post to make a case for or against vibe coding.]]></description><link>https://blog.thoughtsre.com/p/a-thought-experiment-on-vibe-coding</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/a-thought-experiment-on-vibe-coding</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 21 Oct 2025 01:00:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FHVr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FHVr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FHVr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FHVr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2183394,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/175925877?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FHVr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!FHVr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f71aaf-3bb4-4bd3-9899-7912c220b01c_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Image generated by Gemini Nano Bananas. Prompt used: &#8220;Create a title image depicting a future where AI agents mediate between humans and software services that provide actual goods and services. Make the image comic like.&#8221;</figcaption></figure></div><p></p><p>Let me first state up front that this is not a post to make a case for or against vibe coding. In my mind, the cat is already out of the bag. Vibe-coding is here to stay, like it or not. The question is, where will things end up? In this post, I conduct a little thought experiment of a potential future world where vibe-coding continues to exist and what that might signify for personal computing. </p><p>As with all thought experiments, this post says nothing about the <em>plausibility</em> of such a world materialising within any given time horizon (although I think that the future imagined here is actually not that far away) and it will emphasis certain aspects of the dynamics while downplaying others. If you are still interested, read on&#8230;</p><h1>Let&#8217;s start with some definitions&#8230;</h1><p>I asked Gemini for a concise definition of vibe-coding and personal computing. Its responses are quite reasonable from my point of view. So here they are&#8230;</p><blockquote><p><strong>Personal computing</strong> is the use of a <strong>general-purpose computer</strong> designed for <strong>direct, individual control</strong> by one person. The devices, such as <strong>desktops, laptops, tablets, and smartphones</strong>, allow a single end-user to execute various tasks and applications without requiring a centralised operator or time-sharing with other users.</p><p>&#8212; Google Gemini 2.5 Flash</p></blockquote><blockquote><p><strong>Vibe coding</strong> is an AI-assisted software development approach where a user <strong>generates functional code</strong> by providing natural language prompts and high-level descriptions to a Large Language Model (LLM) or AI agent, often with <strong>minimal or no direct review of the generated code.</strong></p><p>&#8212; Google Gemini 2.5 Flash</p></blockquote><p>These are the definitions that I will be going with in this post. Note the bolded parts, we&#8217;ll come back to them later.</p><h1>The incompatibility of vibe-coding with the current software delivery model</h1><p>There has been a myriad of complaints about vibe-coding. Many of them revolve around the consistency and robustness of AI generated code. Some of them are true. Some are due to the injured pride of software developers. Some of them will be remediated as AI gets better. Some are due to the fundamental flaws of Large Language Models (LLMs) as a &#8220;next-token generation engine&#8221;.</p><p>However, if we look beyond the surface of the complaints, one of the key source of these complaints about robustness are due to our current model of software delivery, which takes two main forms: Software-as-a-Service (SaaS) and Software-as-a-Product (SaaP).</p><p>In SaaS, a software provider builds the software and users access the software via a common portal. An example is Gmail. In SaaP, a software provider builds teh software and delivers it to user devices. Examples include Adobe Photoshop or Microsoft Office. Of course, these are two end of a spectrum. There are a million shades of grey in-between, like how Google develops a Gmail app and it is being used on users&#8217; phones or how Microsoft allows for on-premise deployment of its SaaS products.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CjFx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CjFx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 424w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 848w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 1272w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CjFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png" width="1456" height="746" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:746,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:177622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/175925877?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CjFx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 424w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 848w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 1272w, https://substackcdn.com/image/fetch/$s_!CjFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3418e99d-64c2-4883-817c-492c3f85c4ff_2794x1431.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Either way, users interact <em>directly</em> with the software that is being built by the software provider. The users <em>conform</em> to the way the software provider has decided how the software functionality is to be delivered. Moreover, each software functionality is <em>confined</em> within the software product built.</p><p>In this way, the complaints around the lack of robustness of code generated by AI is well-justified (for now). After all, if the software provider vibe-coded and delivered the software with massive flaws to users, the blast-radius would be huge. Any security flaw, once exploited, could expose millions of users.</p><p>But what if the paradigm for software delivery changes?</p><h1>A potential future: Software-as-Tools and N=1 Software</h1><p>Let us now imagine that all user interactions with software is now intermediated by autonomous agents. After all, as I mentioned in my <a href="https://thoughtsre.substack.com/p/the-real-change-is-at-the-man-machine?r=2idqpb">previous post</a>, the real change that the GenAI revolution has brought about is a paradigm shift in the man-machine interface. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l1As!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l1As!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 424w, https://substackcdn.com/image/fetch/$s_!l1As!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 848w, https://substackcdn.com/image/fetch/$s_!l1As!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 1272w, https://substackcdn.com/image/fetch/$s_!l1As!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l1As!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png" width="1456" height="867" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:867,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:257801,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/175925877?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l1As!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 424w, https://substackcdn.com/image/fetch/$s_!l1As!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 848w, https://substackcdn.com/image/fetch/$s_!l1As!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 1272w, https://substackcdn.com/image/fetch/$s_!l1As!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1646196e-ca7a-47a1-b316-871bb0568e95_2591x1543.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Users will </strong><em><strong>all</strong></em><strong> vibe-code</strong>. They will simply tell the autonomous agents what they want, and the agents will flexibly select the <em>specific functionality</em> they require to meet their users&#8217; demands &#8212; meeting the definition of vibe-coding. Recall that currently, software functionality from a user&#8217;s perspective is &#8220;boxed&#8221; up in particular software applications. Yes, software providers have provided APIs for <em>other developers</em> to access fine-grained functionalities and compose those functions into other applications. But this fine-grained access is currently mostly limited to software developers &#8212; most of the world population aren&#8217;t software developers. If we look again at the definition of personal computing, we will see that being able to access fine-grained functions via API would not constitute &#8220;direct, individual control by one person&#8221; or personal computing.</p><p><strong>The agents will develop </strong><em><strong>N=1 software</strong></em><strong> to do what the users need </strong><em><strong>at that moment</strong></em>. The agents will take the fine-grained functionalities provided by software providers to construct dashboards, infographics, websites, micro-apps, etc. for the user on the spot. This presentation of information and performance of action will be <em>for the user only</em>, hence &#8220;N=1 software&#8221;. From this vantage point, the lack of robustness of software becomes less of a concern, since there is only 1 user. What we now think of as software prototypes/demos will in fact become the user application (if the word &#8220;application&#8221; still means anything) in the future.</p><p><strong>Software providers will no longer take care of end-to-end user journeys</strong>. Software providers will simply provide small piece-meal functions, which in today&#8217;s agentic AI parlance are called <em>tools</em>, to be consumed by autonomous agents. Software providers as we know them today will retreat into the background just like how IT networking has retreated into the background of web-application development. This is where software development practices will continue to live on. </p><h1>Requisite Conditions</h1><p>I know that even at the time of writing, we are already able to catch glimpses of the future proposed above. More and more software providers are providing Model Context Protocol (MCP) interfaces to be used by AI agents. However, from what I can see, a lot of these functionalities are provided by a small set of companies and to a small set of users (usually software developers or very tech savvy people). In order for the above scenario to materialise in the mainstream, I believe a set of additional conditions need to be satisfied.</p><h2>A trust mechanism for software</h2><p>In the new world described, one can no longer say, &#8220;XYZ company was at fault for not taking care of this or that thing that caused me to lose money/get scammed/not be able to get my stuff on time, etc.&#8221;. Everyone in fact becomes a producer of software of sorts. Everyone is like a chef (or rather everyone has a chef) that takes raw ingredients (tools) and makes a dish to the person&#8217;s liking (N=1 software). If something goes wrong, then do we know if it is the ingredient that is at fault? Or is the chef (AI agent) a bad chef?</p><p>One way to circumvent this problem is to have all software tools openly reviewed and scanned reviewed by an independent (or maybe decentralised) body and the results published on a decentralised ledger. The execution of the code should then reference a repository where the code version/provenance could be transparently reviewed. On the point of transparent execution of code, the reader might want to check out the <a href="https://www.unison-lang.org">Unison programming language</a> on content-addressed code.</p><h2>Smaller models capable of running on personal devices</h2><p>Going back to the analogy of AI agents being personal chefs, an underlying point about that is that the personal chef <em>knows</em> your personal likes/dislikes. In the case of N=1 software, it would be knowing what&#8217;s the best way to present information so that you can consume it easily, knowing your intellectual blind spots and taking the effort to help you cover them or understanding what you might be interested it given what your recent experiences. This cannot happen if everyone is using AI models that are designed for everyone else as well.</p><p>My personal take is that the current transformer-based large AI models will not be the paradigm that takes us into the age where AI is ubiquitous. There should be a different AI approach, perhaps something like the <a href="https://github.com/SamsungSAILMontreal/TinyRecursiveModels">Tiny Recursive Model</a> (TRM) combined with <em>modular AI</em> (see below).</p><h2>Modular AI</h2><p>Just like how software development went from monolithic applications to modularised plug-and-play libraries, the same could happen to AI. In the new world of personal computing, there is a need for access to knowledge (via large models, like how the internet has Wikipedia), personalisation (via small local models) and actuation (via tools). All three can be built and owned by different parties. All three are required for AI to fulfil its full promise for individuals, in my opinion.</p><p>For that to happen, the current way of running AI models has to change. AI models have to be reconfigurable and be able to &#8220;hot-swap&#8221; modules. Say I have an LLM that has general knowledge and the ability to &#8220;speak&#8221;. Now I want to inject embeddings that encodes my preferences or add in a new sensing modality like the ability to &#8220;see&#8221;, there should be a way to do it. Or it may even be the other way around where the LLM simply provides the embeddings and everything is orchestrated locally by my personal model.</p><p>Now, if any of the models change, for example the embedding size changes, how will the other models cope with it? Currently, the way this is overcome, by virtue of the advent of LLMs, is via natural text. Like how LLMs choose which tools to use. But natural language is a very low bandwidth way of communicating information and what happens if it is visual information? If the information to be conveyed cannot be conveyed as text, then the current way of communicating is via actual neural network layers that are run on the same chip (or in the same AI model).</p><p>To this, I do not have an answer yet. </p><h2>Decentralised ownership of AI models and storage of personal data</h2><p>Together with the point above, there will be a need to store the personalised models and data in a decentralised manner. This could perhaps again be achieved via a decentralised ledger, where everyone owns their own data which is used to train their own model (the personal &#8220;chef&#8221; or AI agent) and is only accessible by the individual.</p><p>Of course, there is no guarantee that such a decentralised ownership model will materialise. If we look at what is happening now, the large AI models, by virtue of their size, are owned by large corporations. Those same corporations also happen to be in control of our data. However, I cannot imagine this to be good if AI were to become ubiquitous and the power these companies will wield over society.</p><h2>Widespread support for provision of services via agents</h2><p>This goes without saying, for the &#8220;N=1 software&#8221; future to become reality, more service providers need to provide their services as tools. I&#8217;m not just talking about those names that you wouldn&#8217;t be surprised to see in Silicon Valley or tech startups claiming that they would revolutionise the world. I&#8217;m talking about everyday services like restaurants, plumbers, etc.. Because having to jump between the old (phones, websites, apps) and new way (AI agents) of doing things simply won&#8217;t lower the inertia of change that people have. This tells me that that future may not come as soon as tech billionaires want us to think.</p><h1>What does it mean? </h1><h2>General public</h2><p>For the general public, as I&#8217;ve said before, this means that the way humans interact with machines/software will complete change. Everyone will have their own personalised view of things (whether that is socially desirable is another question). On top of that, everyone will need &#8220;people&#8221; management/communication skills. The reason why I say this is because anyone who is using AI agents extensively now will realise that dealing with AI agents is very much like dealing with a person. And the better your communication skills are, the better you can get things done with AI agents.</p><h2>Software developers</h2><p>As for software developers, I actually don&#8217;t think that this will be the end of software development. Just like how many computing-related activities such as networking, sort of retreated into the background, so will software development as we currently know it. Software developers will still be required to develop <em>tools</em>, which are the fine-grained functionality that probably requires software development rigour. Of course, the skillset involved will be very different from today and it will also be AI-assisted.</p><p>There will be also be other areas that require software engineering that will grow, such as in decentralised ledgers (see later section).</p><h2>Service providers (Sellers)</h2><p>Service providers (or software providers) will also have to get used to a very different world. Think Uber which takes care of the end-to-end user journey of getting people from point A to point B. You book a ride, see when your ride is arriving, pay, lodge complaints, etc., all in one app. And that user journey is designed and engineered by one company with a particular philosophy or approach. </p><p>In the new world, service providers not only need to deal with the fact that there will no longer be a consistent representation of their services to the end-user, they also need to contend with the fact that there is an additional layer of mediation which is the AI agents between them and their users. This is true regardless of whether the AI agents are owned by large corporations or individuals. The result of this is that service providers will be &#8220;out of sight-out of mind&#8221; for their customers. Just like how when AliPay or WeChat became the default mode of payment in China, Chinese banks only saw 2 transactions each month from their customers: one when the salary came in and another when the salary got transferred to AliPay/WeChat. This means that they will no longer be able to influence their customers or lock them in like they can now. </p><h2>Decentralised ledgers / Blockchains</h2><p>With the advent of generative and Agentic AI, the need for proof of authenticity is never higher. And as mentioned before, decentralised ownership and storage of personal data might require a mechanism like a decentralised ledger. Maybe there are other ways to do this, but for now blockchains seem like a good candidate to store model certifications and ownership proofs.</p><h2>New economy of tool aggregators and AI trainers</h2><p>This new world of personal computing will also spawn new economies of tool aggregators and AI trainers. Just like how the internet age brought us AirBnB and Bookings.com, the new world will have its own aggregators for AI agents to discover new tools.</p><p>There will also be AI trainers to help people train their personal AI. Or maybe even AI trainers that train specialised AI agents that can be replicated infinitely to anyone who needs a specialised task done. For example, a 10x software developer in Rust could train an AI agent that codes as well as he/she does. Put it on the blockchain and market it via one of the tool aggregators.</p><h2>Taking into account radical uncertainty</h2><p>The impact of AI on personal computing expressed here is by no means exhaustive and certain (remember, this is only a thought experiment). To borrow the term <em>radical uncertainty</em> by Mervyn King, there are many unknown unknowns for which we cannot &#8220;price&#8221; in the effects that they might have.</p><h1>Conclusion</h1><p>This is a thought experiment on how the future of computing will look like in the age of AI. I do not purport that the world proposed is a plausible one, only a possible one. And for sure, there are certain elements that are far-fetched, certain elements oversimplified and certain elements that would require fundamental changes that are harder to come by than I can imagine. But I do believe that some aspects of this new world will materialise in the very near future and it is rather exciting to think about it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Agent Dexter]]></title><description><![CDATA[My little attempt at writing my own local AI agent]]></description><link>https://blog.thoughtsre.com/p/agent-dexter</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/agent-dexter</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 27 Sep 2025 17:10:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uu6M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uu6M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uu6M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 424w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 848w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 1272w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uu6M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png" width="1065" height="511" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:511,&quot;width&quot;:1065,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/174669596?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uu6M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 424w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 848w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 1272w, https://substackcdn.com/image/fetch/$s_!uu6M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F668a0622-6a2f-44bb-af8c-c0f6dbeb8e7e_1065x511.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I was inspired by <a href="https://gemini.google/overview/gemini-live/">Gemini Live</a>, which allows you to <em>speak</em> to an AI agent, and <a href="https://claude.com/product/claude-code">Claude Code</a>, where user interactions takes place via a terminal, and I thought to myself, &#8220;hmm&#8230; Shouldn&#8217;t be too hard to build something simple which can do both myself, right?&#8221; As you can probably guess by now, the answer is &#8220;wrong!&#8221; But you can also guess that I was foolhardy and naive enough to not realise this and to go along with the idea of building my own AI assistant, <strong>Dexter</strong>. Otherwise I would not be writing this post. So here goes&#8230;</p><h1>Before we go further&#8230;</h1><p>In this post, I will be going through what Dexter can and cannot do from a user&#8217;s perspective as well as some of the key learnings and challenges encountered while writing Dexter.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><strong>For more technical details</strong>, see the code repo <a href="https://gitlab.com/thoughtsre/dexter-agents">here</a>.</p><p><strong>Regarding the use of AI</strong>, this post is written entirely &#8220;by hand&#8221;. During the process of building Dexter, AI (Gemini 2.5 and Claude Sonnet 4) were <em>consulted</em>. I would say &gt;95% of the code was &#8220;hand-written&#8221; (see <a href="https://gitlab.com/thoughtsre/dexter-agents#ai-use">here</a> for more details). You might ask why not use AI to write the code? Isn&#8217;t that what everyone is doing? Well, this is a hobby project for me to learn and sharpen my skills. Where&#8217;s the fun if everything is done my &#8220;someone&#8221; else?</p><h1><em>Who</em> is &#8220;Dexter&#8221;?</h1><p><strong>Agent Dexter</strong>, is an AI agent that runs in my own computer. It has a terminal frontend so that I can type my questions and supports voice input so that I can wake it up with the wake word &#8220;Dexter&#8221;, ask my question verbally. My queries are then be handled by an AI agent which uses a Large Language Model (LLM) to decide whether to answer my question directly or use tools available to it to answer my queries.</p><blockquote><p>While Dexter runs on my computer, the key intelligence comes from the LLM which resides on AWS&#8217;s servers that are queried using AWS Bedrock. So, if you intend to play around with it, know that not everything takes place on your computer. </p></blockquote><blockquote><p>Another thing to note is that, at present, Dexter only runs on a Mac. To make &#8220;him&#8221;<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> run on a Windows or Linux machine, you&#8217;d probably have to make some changes to the code or accept that you cannot interact with Dexter via voice commands.</p></blockquote><h1>What can Dexter do?</h1><p>Dexter has two operation modes: <strong>voice</strong> and <strong>full</strong>. See instructions on invoking the two modes <a href="https://gitlab.com/thoughtsre/dexter-agents#usage">here</a>.</p><h2>Voice mode</h2><p>In voice mode, you can wake up &#8220;him&#8221; up by saying &#8220;Dexter&#8221; and then asking your question.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gy24!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gy24!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 424w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 848w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 1272w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gy24!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png" width="1261" height="547" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:547,&quot;width&quot;:1261,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121541,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/174669596?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Gy24!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 424w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 848w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 1272w, https://substackcdn.com/image/fetch/$s_!Gy24!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2243c180-72d4-4aed-8e2b-e1791e81410e_1261x547.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;dd16a3b8-cf95-4b84-a45e-45840a73e82b&quot;,&quot;duration&quot;:null}"></div><h2>Full mode</h2><p>In Full mode, you can either wake Dexter up as in voice mode or you can directly type your queries into the terminal.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yChK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yChK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 424w, https://substackcdn.com/image/fetch/$s_!yChK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 848w, https://substackcdn.com/image/fetch/$s_!yChK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 1272w, https://substackcdn.com/image/fetch/$s_!yChK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yChK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png" width="1091" height="681" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:681,&quot;width&quot;:1091,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:142878,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/174669596?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yChK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 424w, https://substackcdn.com/image/fetch/$s_!yChK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 848w, https://substackcdn.com/image/fetch/$s_!yChK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 1272w, https://substackcdn.com/image/fetch/$s_!yChK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c552820-1e2c-4d7d-990c-9aecc8a4156b_1091x681.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;6600b118-48a1-4818-97c8-46a0a9d44cf5&quot;,&quot;duration&quot;:null}"></div><p>The user experience in full mode is still a little laggy as I&#8217;ve not implemented streaming inputs and outputs.</p><h1>Key Learnings &amp; Challenges</h1><h2>AI agents enable flexibility in user interactions</h2><p>One thing that was obvious to me is that AI agents enable user interactions that can be quite fluid and flexible. I could sort of imagine how if I were to build functionalities like searching for news and performing general tasks like adding 3 to 5 in the past, I&#8217;d have to do so with some buttons or drop-downs or tabs. Now, AI agents powered by language understanding from LLMs allow me to just speak and get things done.</p><p>Of course whether this flexibility is desired in the first place, depends on what you are building. I can imagine cases where you don&#8217;t want too many free flow conversations, and you just want to guide the user down a &#8220;narrow path&#8221;.</p><h2>Tools make agents useful but effort required in orchestration and prompting </h2><p>Another nice thing with AI agents is that you can put tools in &#8220;their hands&#8221; and it would immediately be &#8220;integrated&#8221;. For example, in this little project, I wrote a simple <code>search_news</code> tool that queries Perplexity. All I had to was to tell Dexter that this tool is now available and it would be used the next time I start Dexter up. </p><p>In a separate project that I did concurrently, I also tried using MCP servers integrated to the Strands Agent framework that I used for Dexter. The integration process was rather painless too.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>That being said, again speaking with experience from other projects involving AI-assisted coding, while effort maybe reduced when it comes to modifying code to integrate new functionalities, effort is need to prompt the AI agent properly so as to ensure that the agent&#8217;s behaviour and way of doing things is consistent over time.</p><h2>Conventions are starting to emerge in the AI agent world</h2><p>I&#8217;m also noticing that conventions are starting to emerge in the AI agent world over time. For example, dictionaries with a more or less standardised format (such as <code>{&#8220;role&#8221;: &#8220;user&#8221;, &#8220;content&#8221;: &#8220;&#8230;&#8221;}</code>) in keeping a record of the conversation between the AI agent and the user are being used by different frameworks. </p><p>Though I wouldn&#8217;t call this phenomenon a &#8220;standard&#8221; yet as I feel that things will change very quickly in the AI world.</p><h2>Building Dexter&#8217;s speech-to-text capabilities was fun</h2><p>I started out wanting to use <a href="https://github.com/KoljaB/RealtimeSTT">RealtimeSTT</a> for the STT engine, as it included wakeword detection and voice activity detection (VAD) out of the box. However, it does not support Mac GPUs and running it on CPU was just too slow. That&#8217;s why I had to go with <a href="https://github.com/ml-explore/mlx-examples">MLX Whisper</a> and implement wakeword detection and VAD myself.</p><p>The experience of going through researching different frameworks and working it into an application was rewarding.</p><h2>Overcoming terminal race conditions</h2><p>If you look at <a href="https://gitlab.com/thoughtsre/dexter-agents/-/blob/main/src/dexter_agents/modes/voice/voice_chat.py?ref_type=heads">the way voice mode is run</a>, you would see that everything is run in a single Python program with queues handling data sharing.</p><p>However, when I tried to do the same for full mode, where the user can also type into the terminal (like Claud Code), I ran into a whole lot of trouble as a result of race conditions coming from the various running processes competing for the terminal&#8217;s &#8220;attention&#8221;.</p><p>In fact, this was the key challenge that took up half the development time, by my estimation. I had to give up in the end and go for the approach where I have 3 separate processes communicating over HTTP.</p><p>This is the key learning for me on the coding front this time round.</p><h1>Final Reflections</h1><p>Dexter was a fun little hobbyist project to dip my toes into the world of writing AI agent-enabled applications. While I have to admit it took more time than I thought it would and struggling with the terminal race conditions was a huge pain, it was fun figuring out how to integrate STT with Dexter. And maybe Dexter will become a platform for me to explore other Agentic AI functions such as integration with a knowledge graph or a <a href="https://mem0.ai">persistent memory layer</a> for AI agents.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>You would notice that I refer to Dexter with the pronoun &#8220;him&#8221; in quotes. This is because agentic AI has brought about the phenomenon where people deal with AI as if it were a person. But I&#8217;m still cautious about over anthropomorphising AI.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I didn&#8217;t use MCP servers for Dexter but I don&#8217;t think they will be hard to implement under the Strands Agents framework. Do not that there are potential security concerns with MCP servers.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[On using NotebookLM for learning]]></title><description><![CDATA[If your brain were a computer, it&#8217;s like having a massive amount of RAM&#8230;]]></description><link>https://blog.thoughtsre.com/p/on-using-notebooklm-for-learning</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/on-using-notebooklm-for-learning</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 12 Aug 2025 01:00:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vyR1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vyR1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vyR1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 424w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 848w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 1272w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vyR1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png" width="1456" height="670" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:670,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1027255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/170428511?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vyR1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 424w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 848w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 1272w, https://substackcdn.com/image/fetch/$s_!vyR1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf0014fa-ce91-474c-b9df-2da0ab09c4d8_3840x1767.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://notebooklm.google/">NotebookLM</a> burst onto the GenAI-enabled learning scene with its AI-generated podcast that summarises any topic for you based on digital materials you have on hand. For those who are not familiar, NotebookLM is a web application by Google that allows you to upload text sources like PDFs or websites and then allows you to ask questions about the sources, which it then proceeds to &#8220;read&#8221; through all of them to find the answers you are looking for. As the name suggests, it&#8217;s a tool that is touted for learning.</p><p>I&#8217;ve been using it for a while for my own learning on areas that are technical, such as Active Inference and more general, like Universal Basic Income. In this post, I summarise my reflections on my use of NotebookLM, tips on using it better and things I wished NotebookLM had.</p><h1>General reflections</h1><p>If I were to summarise what it felt like using NotebookLM, I would say this:</p><div class="pullquote"><p>If your brain were a computer, then using NotebookLM for learning is like having a massive amounts of RAM.</p></div><p>Of course, the AI-generated podcast wowed me initially. But it was the ability to peruse a <em>curated</em> set of information, hold it in memory and answer your question coherently that made me feel that NotebookLM was really a useful tool for learning.</p><p>I was struggling with learning Active Learning which required a very different way of thinking about machine learning that I was used to and involved knowledge in highly mathematical topics like variational learning or Partially Observed Markov Decision Processes (POMDPs). On top of that, the papers were written by neuroscientists and biologists which have a different way of expression compared to mathematicians and computer scientists <em>and</em> were writing about the application of Active Inference in neuroscience, an area which I had very little knowledge or intuition about. <em>There was no way I could hold multiple papers in my head and simultaneously summarise and synthesize the information in them.</em> </p><p>That&#8217;s when NotebookLM came to the rescue. I uploaded all the papers I could find on Active Inference and I was able to ask questions that perplexed me in some papers to which answers could be found in other papers. The Active Inference community is not very big and the author sometimes assumed that you knew what they were talking about. A memorable moment for me was when NotebookLM demystified the difference between state, policy and parameter inference for me in the Active Inference framework. NotebookLM synthesised information from multiple papers, filled in the gaps from one paper with information from another and presented the information in a coherent manner. </p><p>Another really nice feature of NotebookLM is that it is able to generate mind maps for you. Take the mind map generated from my &#8220;Universal Basic Income&#8221; notebook. It organises the main threads of information for you to explore <em>across all your sources</em>. It is also able to generate other documents like FAQs, briefing docs and timelines. I can imagine a feature like timelines to be useful to people like lawyers.</p><p>The latest feature includes generating a video summary. Something I believe visual learners would find useful.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CMGi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CMGi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 424w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 848w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 1272w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CMGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png" width="1456" height="927" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:927,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:884604,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/170428511?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CMGi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 424w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 848w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 1272w, https://substackcdn.com/image/fetch/$s_!CMGi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93b113f2-11ab-44d6-84d4-6c73318f3a5c_2855x1818.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I also didn&#8217;t feel like I was a passive recipient of knowledge while I was using NotebookLM, you know, like how you feel like you&#8217;re not really using your brain if ChatGPT is simply giving you all the answers. I was an active participant. I had to find the sources of information and I had to think of the questions to ask. It is a product which lives up to its purpose as a study companion.</p><h1>Usage Tips</h1><h2>Include a &#8220;foundational&#8221; source for technical topics</h2><p>NotebookLM can only give you answers from the sources you upload. That&#8217;s fine for many cases. However, when you are trying to learn a technical subject and all you have are academic papers, it may not be able to explain things as well to you as you would want it to, simply because it does not have the knowledge. Academic papers often assume understanding in certain foundational topics that you might not have. </p><p>For example, I uploaded a copy of <a href="https://www.microsoft.com/en-us/research/publication/pattern-recognition-machine-learning/">Bishop&#8217;s seminal book</a> &#8220;Pattern Recognition and Machine Learning&#8221; when learning Active Inference, and as a result, NotebookLM was able to explain the use of factor graphs in Active Inference better. This is because it had access to the foundational knowledge in Bishop&#8217;s book.</p><h2>Be balanced with your input sources</h2><p>Again, stemming from the fact that NotebookLM can only know what you give it, take extra care to be balanced with your sources. If all your sources portray a particular topic negatively, then you will only get answers that paint the topic in a negative light, vice versa. </p><p>This is especially important when using the &#8220;Discover&#8221; feature to collect input sources, where you can ask NotebookLM to find information sources for you with a text prompt. For example, when I was using NotebookLM to discover sources on Universal Basic Income, I said &#8220;I want to learn about the <em>pros and cons</em> of Universal Basic Income&#8221;, instead of &#8220;I want to learn <em>what is good about </em>Universal Basic Income&#8221;.</p><h2>Understand that it&#8217;s ability to answer you is limited by the LLM that supports it</h2><p>At the end of the day, the responses you get are still powered by the LLM sitting behind NotebookLM. So do note that there will be some things it cannot do. I realised this when I tried to ask it to perform a step-by-step derivation of an equation. I guess I was thinking too highly of NotebookLM at that point in time.</p><h1>Feature Wish List</h1><h2>Live data or API access</h2><p>I wish there were some way to update my sources automatically either by subscribing within NotebookLM or API access. I use Perplexity to keep up with news and I just kept thinking that it would be great if I could pipe some of these articles into notebooks in NotebookLM for me to query later. This is useful if you are keeping track of an on-going event like the Russia-Ukraine war (and you want to use the timeline feature).</p><h2>Video or image sources</h2><p>Currently NotebookLM deals predominantly with text sources. Given the progress in multimodal LLMs, it would be great if it could also parse images or videos. In this way, if I see a great talk or lecture, I could upload it and query it.</p><h2>Better math formatting</h2><p>Currently when NotebookLM saves its replies as a note, the math is shown as raw LaTeX. I don&#8217;t think it&#8217;s too much to ask for the math to be properly displayed, is it?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_HBL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_HBL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 424w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 848w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 1272w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_HBL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png" width="614" height="445.192917054986" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1073,&quot;resizeWidth&quot;:614,&quot;bytes&quot;:454752,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/170428511?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_HBL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 424w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 848w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 1272w, https://substackcdn.com/image/fetch/$s_!_HBL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F636a98d8-49f1-408b-a0f1-3d93697cb859_1073x778.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Handwriting support on tablets</h2><p>I&#8217;m someone who learns better by writing. It&#8217;d be great if I could write notes on my IPad in NotebookLM and the LLM can parse my handwritings as well. I know it&#8217;s a tall order, but hey, it is a wish list after all&#8230;</p><h1>Conclusion</h1><p>I thoroughly enjoyed using NotebookLM. Nowadays, when I find myself having too much information to process about a particular topic, NotebookLM is my go-to tool. I hope it keeps on improving as I honestly think that it is useful not just for students but also for professionals.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Objectivity. A lost virtue in the age of GenAI]]></title><description><![CDATA[I was just reading an old 2018 paper on computer vision and a thought suddenly struck me.]]></description><link>https://blog.thoughtsre.com/p/objectivity-a-lost-virtue-in-the</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/objectivity-a-lost-virtue-in-the</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 03 Jun 2025 01:00:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tOXr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tOXr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tOXr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tOXr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg" width="574" height="574" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:574,&quot;bytes&quot;:1707725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/164858499?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tOXr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tOXr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F587c7d14-034e-4725-829d-2a60ebd8b99f_2048x2048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated with Gemini</figcaption></figure></div><p>I was just reading an old 2018 paper on computer vision and a thought suddenly struck me. </p><p>Objectivity is lost in the age of GenAI.</p><p>Or at least, the facade of objectivity is lost. Depending on your personal philosophical leanings.</p><p>Take for example, Structure from motion (SfM). It used to be that people had to calculate <a href="https://thoughtsre.substack.com/p/codewalkthrough-image-stitching-with?r=2idqpb">homography matrices</a> and perform triangulation. It was cumbersome and complicated. But if you had the right information like the camera&#8217;s focal length and all, you&#8217;d arrive at a physically sensible solution. In other words, you would know the camera&#8217;s pose (rotation, position,etc.) and you&#8217;d have a measurement of the object&#8217;s distance based on hard math. Nowadays with <a href="https://segment-anything.com/">segment anything</a> and <a href="https://depth-anything.github.io/">depth anything</a> AI models, you&#8217;d be able to get a depth map of the objects in the picture with little to no work. The catch is, everything is now <em>relative</em>. The depth map (or the distance each image pixel is from the camera) is what the AI model <em>thinks</em> it is. The change in depth distance between 2 pixels is what the AI model estimate it to be <em>based on the images it has seen before.</em></p><p>The origin of this thought was the <a href="https://arxiv.org/abs/1807.03877">Deep Structured Generative Models</a> paper, which led me to think about how I would do the same with GenAI (perhaps more on this in a later post). It then led me to think about how I would use a combination of image generation with background removal to generate assets, use LLMs to generate semantic information and video generation to generate different views of the digital assets, etc..</p><blockquote><p>&#128161;</p><p>Actually I started reading the paper because I had problems with pure image generation via ChatGPT when creating thoughtsre&#8217;s logo.</p></blockquote><p>Anyway, it then occurred to me that each of these steps feels like I am asking a person to do something for me. I didn&#8217;t have to know how to calculate homography matrices anymore. Each step was essentially a black box.</p><p>This may be convenient for me. Just like how a company outsources certain functionality so that it can focus on its core business. We can take this analogy further and think about what happens when a company unwittingly outsources its core business functions in pursuit of efficiency. History is replete with bad examples. Another story for another day&#8230;</p><p>Coming back to the thesis of this article, when one asks another person to do things, the outcome is at least partially determined by the inherent biases and capabilities of the person. This is why we keep comparing between models of different companies and how they perform on different benchmarks. It&#8217;s like how we compare the design proposals of various interior designers before we decide to engage one.</p><p>Setting success criteria and measurement metrics help. But think about it, if it were a human, there is always some element of subjectivity that is due to the human. Increasingly this will be true for machines as well. The output will be colored by its training data (or experiences). If we blindly accept its outputs, then we are also accepting its subjectivity, so to speak. Just like how if you employ a director to make a movie, you are buying into his artistic direction and vision. </p><p>Going back to the analogy of interior designers, it&#8217;s like judging the designer&#8217;s capabilities by their use of space or adherence to your requirements. Two designers might use space equally efficiently but one is just better at understanding your asks, perhaps because his life experiences and tastes are just more similar to yours.</p><p>To try and define success exhaustively would be to make the work of metrics definition an industry unto itself. Then where&#8217;s the productivity gain? In any case, it&#8217;s a fool&#8217;s errand. AI&#8217;s capability will far exceed human capability of monitoring. See <a href="https://www.notion.so/AI-2027-1cc194a154ea8181bda8ddf58f260dd4?pvs=21">AI 2027</a> or <a href="https://www.axios.com/2025/05/23/anthropic-ai-deception-risk">recent incidents</a> on AI models employing blackmail to avoid shutdown.</p><p>One might argue that this is human society works. We deal with this kind of subjectivity all the time. All humans hold subjective opinions. It can work with machines as well. To wit, I say maybe. </p><p>Note that, despite the progress made, human society is still filled with suffering and unfairness. Can we assume that a human society mixed with machines will be better? Moreover, there are many more humans each developed with their unique experiences. The diversity averages things out. And human society has had a lot of time to derive common ground, albeit through painful experiences.</p><p>Can we say the same of large AI models in the hands of a few?</p><p>Others might also argue that objectivity has never existed. That is not true in select fields like mathematics. But to be honest, mostly true in everyday lives. We have, however, found ways in human society to deal with it. We have laws and cultural norms that are distilled over time.</p><p>Importantly, we have, as a species, valued objectivity and at least tried our best to uphold it. This is why we have laws that require proof beyond doubt and chain of evidence to convict crimes. That is why we pursue knowledge in hard science. That is why it&#8217;s different when we meet face to face, instead of via Zoom.</p><p>With GenAI&#8217;s ability to create realistic content, it is hard and mostly too tiring to find out what is true anymore. We will be inundated by generated content delivered to us via the well-hone machinery of social media that constantly sways our thinking and judgement.</p><p>At the end, what makes humans want to work with others is trust, which comes from cultural similarity or some other binding force like the law or familial bonds. And I think there are a few ways where we might be able to salvage objectivity. </p><p><strong>One, devise a legal framework to treat AI as persons.</strong> We urgently need to adapt our current legal framework to treat AI as persons with legal liabilities. Fortunately, we already have what we need. A company is a legal person that is not a human. Hence, just like a company needs to be capitalized and there may be humans (up to a certain point) that are liable for its actions, the same goes for AI. </p><p><strong>Two, adapt the use of blockchain as a record of AI actions and human outputs.</strong> With the proliferation of GenAI, authenticity becomes a prized commodity. This is where AI actions and human outputs need to be recorded in a globally immutable records like a blockchain. On top of that, with the help of the legal framework, the outputs of AI have to be watermarked, like <a href="https://deepmind.google/science/synthid/">SynthID</a>.</p><p><strong>Three, use small models but lots of them.</strong> One of the key dangers in the current GenAI revolution lies in the fact that only a handful of operators own and are responsible for training these models. The reason for that is that these models (even the so-called resource efficient DeepSeek models) are so expensive to run and train. There are some that would advocate stopping or slowing the development of AI. I personally don&#8217;t think that will work. The cat&#8217;s already out of the bag. In fact, I feel that the research and engineering community needs to urgently work out a way to have the same capabilities but in a much much smaller form factor. In other words, find a way to enable AI models to have high intelligence but lightweight enough to be deployed on mobile devices. Importantly, these AI models should be able to learn quickly, easily and <strong>independently,</strong> such that they are truly personal assistants to people. There should be at least as many independent AI models as there are people. To do this, we would need to shift the current AI paradigm away from the transformer-based models that are super heavy and achieve quantum leaps in AI hardware. In this way, we achieve a parallel &#8220;machine society&#8221;, where hopefully diversity of independent AI agents helps to average things out and we can seek some form of objectivity within.</p><p>That being said, it is unclear to me whether any of these methods will work in the long run. Maybe it will only serve to delay the inevitable. Take for example the third suggestion of creating a parallel machine society. How we would ensure that the resultant AI agents would really be diverse? Even if they were, would humans be seen by the machines as a cohabitant or a parasite? </p><p>I&#8217;d love to hear what you think. Leave a comment and let me know your thoughts.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Constructing a ChatGPT Learning Companion for my daughter]]></title><description><![CDATA[A minor exploration into using LLMs to help my daughter learn Chinese]]></description><link>https://blog.thoughtsre.com/p/constructing-a-chatgpt-learning-companion</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/constructing-a-chatgpt-learning-companion</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Mon, 05 May 2025 23:01:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0bd6d442-d5fb-4159-8ef8-45bb9b41d69e_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In <em>trying</em> to tutor my daughter in her school work, I&#8217;ve come to realise that teaching at the primary school level has less to do with knowledge than delivery. Young children require a lot of patience and the teacher to be able to see things from their point of view and communicate in a certain way. </p><p>Unfortunately, I am short on patience. Also, I believe the situation is aggravated when teaching one&#8217;s child because you inadvertently bring in your expectations for the child in your communication. That creates extra pressure during the tutoring sessions. Every frown, every micro-expression, is taken in by the child and can be construed as disappointment. Over time, it can generate a lot of stress.</p><p>I was then inspired by a conversation I had with a friend and I started to think about ways to alleviate this using AI. Perhaps there is a way for me to help my daughter learn, but taking my lack of communication skills and impatience out of the picture.</p><h1>A ChatGPT Chinese Learning Companion</h1><p>What I did was to construct a custom GPT on ChatGPT to help my daughter revise her Chinese spelling. The prompt used to generate the GPT, which I call &#29081;&#24072;, is shared <a href="https://gitlab.com/thoughtsre/gpt-chinese-tutor">here</a>. </p><p>Basically, I uploaded a copy of my daughter&#8217;s primary school Chinese curriculum and asked chatGPT to generate explanations, example sentences and simple multiple choice questions to test her understanding.</p><p>As this GPT is targeted for use by young children, I can&#8217;t share it publicly via ChatGPT&#8217;s marketplace as per OpenAI&#8217;s policy. You will have to set it up in your own chatGPT subscription. Instructions are shared in the <a href="https://gitlab.com/thoughtsre/gpt-chinese-tutor">repository</a> as well.</p><h1>My daughter&#8217;s initial reactions</h1><p>My daughter was really amused when I told her that I had &#8220;coded&#8221; a Chinese spelling revision spelling app for her. When she used it, she took to the interface quite naturally.  The ChatGPT function where the text is read out by a friendly voice helped ease the initial adoption challenges as well.</p><p>The fact that my daughter can ask ChatGPT to keep repeating till she understood also made it easier for my daughter to learn. ChatGPT has infinite patience.</p><p>Also, she really appreciated the ability for ChatGPT to explain things to her in English when she had a hard time understanding. Of course, I had prompted ChatGPT to only give explanations in English if needed and not by default. And the questions are not in English so she still had to read the questions in Chinese.</p><h1>What I liked about it</h1><p>The first thing I was surprised by was how well ChatGPT read the Chinese curriculum PDF that I uploaded. All the Chinese spelling words were read accurately and segmented properly by the lesson of the week. No formatting or special instructions on how to read the PDF needed. This goes to show how far AI has gotten in document reading in the last few years. See the image below to see how all the lessons are properly enumerated (I asked ChatGPT to enquire which lesson to revise before starting).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rltf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rltf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 424w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 848w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 1272w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rltf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png" width="840" height="620" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:840,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:75192,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161802256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Rltf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 424w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 848w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 1272w, https://substackcdn.com/image/fetch/$s_!Rltf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06d3f4ef-dec1-4fa9-a58c-faa1736b68a9_840x620.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On the whole, ChatGPT was able to generate relevant content for each of the Chinese spelling phrases. It&#8217;s not perfect but much better than I expected.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EFRI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EFRI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 424w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 848w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 1272w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EFRI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png" width="544" height="423" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93788112-619e-42ee-8aac-f051dd2ea919_544x423.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:423,&quot;width&quot;:544,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:65641,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161802256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EFRI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 424w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 848w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 1272w, https://substackcdn.com/image/fetch/$s_!EFRI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93788112-619e-42ee-8aac-f051dd2ea919_544x423.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As instructed in the prompt, the ChatGPT learning companion always took a very encouraging tone. It was pretty encouraging for me to see the positive response my daughter had to ChatGPT&#8217;s encouragement.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mTLX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mTLX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 424w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 848w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 1272w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mTLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png" width="790" height="202" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:202,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33574,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161802256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mTLX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 424w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 848w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 1272w, https://substackcdn.com/image/fetch/$s_!mTLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8808f9a8-8b29-4bb0-8c09-1fc20311aa1f_790x202.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The text-to-speech feature made the encouragement feel more real as well. The voice was personable and gave my daughter another medium to receive information other than reading the words on screen. </p><h1>What I didn&#8217;t like about it</h1><p>I guess ChatGPT&#8217;s model isn&#8217;t too well trained on the Chinese language. (When I say &#8220;isn&#8217;t too well trained&#8221;, it probably is still better in Mandarin than the average Singaporean Chinese person.) some of the sentences constructed by ChatGPT for the multiple choice questions didn&#8217;t quite make sense. I guess it could be an artifact of ChatGPT trying to construct sentences with incorrect usage of the spelling phrases. After all, it&#8217;s still an autoregressive model that predicts the next token given the previous ones.</p><p>My next gripe is with the text-to-speech function. While it provided a higher level of personal interactivity, the voice generation took quite a bit of time and as such the lesson didn&#8217;t flow too well if we used the speech generation at every step due to the waiting. Also there was limited control over the speech generation in ChatGPT. For example, I found the speech generated to be quite fast and there was no way for me to slow it down.</p><p>Lastly, because the behaviour of the custom GPT is entirely controlled via the instructions given to it during its construction (see <a href="https://gitlab.com/thoughtsre/gpt-chinese-tutor#instructions">&#8220;Instructions&#8221; section</a> in the repo), changing the prompt instructions can result in unexpected change in the flow of the lesson. </p><p>For example, I tried to support two lesson modes. One for revision of the entire curriculum where only 5 questions are asked for a short exercise and another just for the spelling test of the week where all 10 phrases should be revised. I found that changing the instructions in one would result in changes in the way the other is presented.</p><p>That being said, I see the above difficulties/shortcomings as engineering challenges that can be overcome by current techniques and by OpenAI deciding to open up more features. They are not insurmountable. It&#8217;s simply a matter of enable more features for the public.</p><h1>What I thought would have liked to have</h1><p>Following from the last shortcoming mentioned, I would have liked to be able to split the lesson into small independent parts that I can prompt engineer separately but yet bring together under one coherent agent. This is will overcome the problem of changing the prompt in one part affecting behaviour in other parts and is possible using techniques like <a href="https://www.anthropic.com/news/model-context-protocol">Model Context Protocols (MCP)</a> and <a href="https://google.github.io/A2A/">Agent2Agent (A2A)</a> for agents to communicate with each other. And of course, OpenAI to open up these features to the public.</p><p>I would also have loved the ability the specify buttons for standard responses, such as &#8220;Can you explain it to me again?&#8221;. My daughter was spending quite a bit of time typing.</p><p>Lastly, I would have loved the ability to introduce other medium of communication like writing and drawing because learning Chinese requires the student to learn how to write the Chinese characters. This would of course be more involved in terms of software engineering and would require ChatGPT to quickly be able to read my daughter&#8217;s handwriting.</p><h1>Looking into the future and my personal reflections</h1><p>As I reflect on this short foray into using ChatGPT for education, the first thing I realise as I was committing my &#8220;code&#8221; to the repo is that the &#8220;code&#8221; is just the prompt. Of course, I understand that this is just a hobbyist level prototype and nowhere near production grade. But heck, this would have been impossible a few years ago. The ability to generate natural sounding content with high-level instructions is quite phenomenal.</p><p>Combined with non-text content generation like videos, I think AI really has a chance to make a difference in education. There will be debate on whether it makes our children lazy or take away their ability to think critically. To that, I&#8217;m still not quite sure yet personally. Maybe that will be true. Maybe the effect will be different at different levels of learning.</p><p>However, in my case, my personal take is that I&#8217;m not asking AI to solve problems for my daughter. I&#8217;m not asking ChatGPT to think <em>on her behalf</em>. What I&#8217;m asking ChatGPT to do is to be a companion to my daughter, helping her think and learn while at the same time not having to deal with my shortcomings as a teacher.</p><p>I can imagine how this might be helpful for students who may not have access to private tuition. Or how AI learning companions can be equipped with the latest/best teaching methodologies that achieve the best learning outcomes for our students.</p><p>For now, some of the shortcomings such as not being able to practice <em>writing</em> Chinese, will stop me from using this for my daughter&#8217;s Chinese learning. Also, she has enough homework already. </p><p>Maybe I&#8217;ll use it when we are on a holiday, just to keep up a low level of engagement with schoolwork and prevent too much backsliding. </p><p>Just maybe&#8230;</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[What does it mean to play in the AI space?]]></title><description><![CDATA[What does AI land look like anyway?]]></description><link>https://blog.thoughtsre.com/p/what-does-it-mean-to-play-in-the</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/what-does-it-mean-to-play-in-the</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 12 Apr 2025 16:02:34 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1yPZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Everyone needs to <em>play in the AI space</em> these days. But what does that actually mean? How can we decide <em>how </em>to play, if we don&#8217;t know the rules of the game or the size and shape of the field? What does this magical <em>AI land</em> where nothing is impossible actually look like? </p><p>Here&#8217;s how I think about AI land and how it looks like. </p><p>Broadly speaking, I see AI <em>land</em> being made up of three interconnected <em>realms</em>: <strong>algorithms</strong>, <strong>hardware</strong> and <strong>applications</strong>, within which developments in what we generically call <em>AI land</em> exist.</p><p>Each of these realms have unique characteristics. But the thing is, activities within each realm affect each other, sometimes constructively and sometimes destructively.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1yPZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1yPZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1yPZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png" width="464" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:464,&quot;bytes&quot;:2855953,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161163888?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1yPZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1yPZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18085f2d-f8ee-4221-adb9-bd1451ae7e58_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Three realms of AI land (generated by ChatGPT)</figcaption></figure></div><h2>Realm of Algorithms</h2><p>The realm of algorithms is inhabited by citizens such as the Convolutional Neural Network (CNN), Long Short-term Memory (LSTM), Graph Neural Networks, Bayesian Optimization, the Transformer (who in the last few years has become an all important inhabitant of the algo-realm), etc.. </p><p>The work carried out here is in discovering new algorithms, improving existing algorithms and constructing the next AI paradigm. The publication of the <a href="https://arxiv.org/abs/1706.03762">Transformer architecture and attention</a> is a work in creating new algorithms. Extending the context window of large language models (LLMs) using <a href="https://arxiv.org/abs/2309.00071">YaRN</a> is about improving an existing algorithm. Work on <a href="https://sites.google.com/view/sumiowatanabe/home/singular-learning-theory">Singular Learning Theory</a> and <a href="https://www.activeinference.institute/">Active Inference</a> might be considered as constructing the next AI paradigm.</p><p>Those seeking to enter the realm of algorithms need to have enough data to train their models and the requisite mathematical prowess. The journey into the realm is not long but also not for the faint-hearted.</p><h2>Realm of Hardware</h2><p>The realm of hardware is a strongly fortified one which is extremely hard to enter. The entrance fees are high in terms of capital expense and chip making expertise and capabilities. Unless you come with deep pockets to play the game over a long time, I suggest you do not make the journey. </p><p>Its main output are computing infrastructure and chips on which AI models are run. Examples include the <a href="https://groq.com/why-ai-requires-a-new-chip-architecture/">GroqChip</a>, <a href="https://aws.amazon.com/ai/machine-learning/inferentia/">AWS Inferentia</a> and, who could forget, Nvidia GPUs.</p><p>That being said, although the journey is arduous, great rewards are promised to those who dare and can. The barrier to entry protects those who are already in. Case in point, Nvidia&#8217;s dominating position in AI.</p><h2>Realm of Applications</h2><p>The realm of applications is like a trade and business hub where money, people and resources flow through freely. It doesn&#8217;t take much to enter but it is also potentially very lucrative to those who can gain enough scale and volume.</p><p>Like a trade hub that is responsible for combining multiple semi-finished inputs into final products to be delivered to end-users, the realm of applications seek to take existing AI techniques and apply them to solve real-world problems. Examples include <a href="https://cloud.google.com/vision/docs/detecting-landmarks">Google&#8217;s landmark detection</a> and <a href="http://v0.dev">v0.dev</a> for AI-assisted frontend development.</p><p>The key to being successful in this realm is to have access to customers who are willing to buy your product, domain expertise to adapt AI models to become truly useful for industry participants and the engineering excellence to deliver products quickly.</p><h2>Interactions of the three realms</h2><p>The characteristics of the three realms are summarised below. The thing to note is that the realms are not isolated and are constantly interacting with each other.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GKAS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GKAS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 424w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 848w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 1272w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GKAS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png" width="1345" height="743" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:743,&quot;width&quot;:1345,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185387,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161163888?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GKAS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 424w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 848w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 1272w, https://substackcdn.com/image/fetch/$s_!GKAS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3552eed0-adc8-4843-9c02-bc404c16352d_1345x743.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Overview of the three realms</figcaption></figure></div><p>Better hardware enables scale for AI applications. While innovative application of AI in real-world problems drives the need for hardware to keep up. For example, the rise of AI fuelled the creation of specialised chips like GroqChip.</p><p>The desire to embed AI into end-user applications will drive the need for better algorithms while better algorithms allows developers to be more innovative with the applications they develop. Using the LLM revolution as an example, the rise of LLMs engendered a slew of innovation in the way people do graphic design (<a href="https://www.adobe.com/ai/overview.html">Adobe AI</a>), coding (<a href="https://github.com/features/copilot">GitHub CoPilot</a>), consume news (<a href="http://perplexity.ai">Perplexity</a>) and many more.</p><p>Algorithms in turn drive demand for AI hardware. Just look at how the demand of Nvidia&#8217;s GPU drove the meteoric rise of Nvidia&#8217;s stock price. And better AI hardware of course allows faster or more effective training of AI models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AQIb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AQIb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 424w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 848w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 1272w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AQIb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png" width="1121" height="891" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:891,&quot;width&quot;:1121,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60268,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161163888?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AQIb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 424w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 848w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 1272w, https://substackcdn.com/image/fetch/$s_!AQIb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221eff4a-d9c0-4b8d-b41b-ff7eaa5ec64a_1121x891.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Of course, it doesn&#8217;t mean the inhabitants of each land are necessarily friends, while they often are. For example, DeepSeek, an upcoming citizen of the the realm of Algorithms, has negatively affected Nvidia, a Hardware realm royalty, by making LLMs extremely cost and resource efficient.</p><h2>Realm overlaps and their prominent inhabitants</h2><p>The three realms not only interact, they also overlap. See the diagram below which shows the prominent citizens of each realm. Note that the list is by no means exhaustive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j7jh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j7jh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 424w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 848w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 1272w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j7jh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png" width="1248" height="801" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:801,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96959,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/161163888?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j7jh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 424w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 848w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 1272w, https://substackcdn.com/image/fetch/$s_!j7jh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F884d2a51-9920-40d9-bfaf-3a3fef63fbec_1248x801.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I call the space where algorithms and applications overlap the land of <em><strong>Innovative Players</strong></em>. These are inhabitants who take new algorithms and apply them to solve new problems. Like how OpenAI took the world by storm with LLMs. Of course, as the novel applications become mainstream, innovative players move to become rooted residents of the realm of Applications.</p><p>The overlap between algorithms and hardware is where <em><strong>Demonstrators</strong></em> live. Often these are hardware vendors seeking to demonstrate the relevance of their products to the new algorithms. An example is how <a href="https://developer.nvidia.com/blog/next-generation-of-flashattention/">Nvidia optimized Flash Attention on CUDA</a>.</p><p>Then you have the space where residents of the realm of Applications are forced into the hardware realm because of the need to deploy AI applications (or help others deploy) at scale. These are <em><strong>Large-scale Operators</strong></em> like AWS and Google.</p><p>Lastly, there is a space where all three realms meet and I don&#8217;t have a name for. The only resident I can think of is Google Deepmind. I must stress that this is probably due to my limited knowledge rather than an objective truth.</p><h2>Conclusion</h2><p>In conclusion, I see the developments in AI broadly fall into the three categories of algorithms, hardware and applications. While there are significant overlaps in how AI industry participants play in each of the three categories, there are unique characteristics of each category that should inform how we <em>play in the AI space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[The real change is at the man-machine interface]]></title><description><![CDATA[LLMs and GenAI are but an opening act...]]></description><link>https://blog.thoughtsre.com/p/the-real-change-is-at-the-man-machine</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/the-real-change-is-at-the-man-machine</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 29 Mar 2025 07:49:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!16k3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Introduction</h1><p>When I think about changes that are considered &#8220;revolutionary&#8221; or &#8220;phase-changes&#8221; from the past, I often find it useful to frame my thinking in terms of <em>interfaces</em>. It is important to notice such changes because when they occur, it often ignites a series of cascading changes that opens up new, not to mention unexpected, ways of working and living and the world is changed beyond recognition at the end of it.</p><h1>Interfaces&#8230;</h1><p>For example, the mechanization of agriculture was a phase change at the <em>interface of man and his environment</em>. The result was that man can produce much more food with much less effort. What used to be an all-consuming, back-breaking, do-or-die activity for all humankind became an activity which produces so much that in much of the rich world there is an excess of food. Of course the spoils of production is obviously not evenly distributed. But that is not what I am talking about. The release of manpower to other activities drove innovation not only in agriculture but in other areas as well. The ready provision of food allowed man to focus on higher order needs and wants. </p><p>Another interface is that of man and his fellow man, the <em>man-man interface</em>. The rise of social networks changed the way people interacted with each other. But it also gave rise to new business models like online advertising and allowed people to participate in the lives of others. Not to mention, along with social networks also came social engineering and other more insidious developments. Safe to say, the post-social network world is also a very different one from before, for those old enough to remember a time when the best one could do to stay in touch is to pick up the phone.</p><h1>Focusing on the man-machine interface</h1><p>There are many of such interface changes in history if we care to look but last interface, which is the one I&#8217;d like to draw your attention to, is the <em>man-machine interface</em>. The internet represents a change in the man-machine interface. A person can now access information and communicate digitally and remotely via computers. And talking about cascading change, the internet also spawned social networks (a change in the man-man interface). </p><p>The iPhone also represents a change in the man-machine interface. Instead of typing on keyboards, people started to interact with machines in a tactile manner. Now, almost all personal electronic devices follow the same interaction model.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!16k3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!16k3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!16k3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!16k3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!16k3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!16k3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3470001,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thoughtsre.substack.com/i/159827748?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!16k3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!16k3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!16k3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!16k3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F531aa1c1-ed0f-4035-b7c6-b32e42720dab_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Generated using ChatGPT</figcaption></figure></div><p>In my mind, another sea change in the man-machine interface is currently getting started and LLMs/GenAI is but the opening salvo. For the first time, humans can interact with machines in one of the most natural medium possible, speech. I&#8217;m not talking about speech recognition, but machines that can understand the nuances of our language.</p><p>Of course, whether the future of this man-machine interface will still be transformer-based LLMs or Large Concept Models (LCMs) or &#8220;Agentic AI&#8221; or whatever new-fangled names people come up with, I don&#8217;t know. I personally doubt it will be what we see today. But what I am watching out for is what this change will mean in the longer term and how it might change the way we live and work. I&#8217;m pretty sure that this will engender new ways of working, interacting with each other (man-man interface) and novel business models that we cannot even begin to imagine. </p><p>If we consider also how the fields of AR/VR and wearables are advancing, does it mean that we will be able to speak our minds and let machines take on a lot of the more mechanical work? When we think about this, then we realize how the Google Glass was ahead of its time. While it was an innovation in the hardware space, it lacked the power of understanding that LLMs are now beginning to demonstrate and the interaction with the Google Glass was still unnatural. Now that the software side has caught up, we are seeing a resurgence in interest in AR/VR devices like the Meta Orion. I believe this is an iPhone-like moment where in 10-20 years time, the devices will be somehow wearable. The iPhone-like form-factor of mobile devices will go the way of the Nokia 3210. </p><p>The combination of increasingly competent (I won&#8217;t use the word &#8220;intelligent&#8221;) machine learning models with robotics also offers an exciting outlook. At present, the recent explosion of interest in LLMs/GenAI is constrained to the virtual world. But make not mistake, the field is constantly pushing the boundaries and trying to realize itself in the physical world. That will be through robots. Increasingly, we will be seeing robots, aided by flexible and powerful models, which can function autonomously in complex environment. On top of that, building on the foundations of telecommunications, machines can be almost omnipresent (think the Internet-of-Things). Brings to mind <em>Skynet</em>. True. But let&#8217;s not be too hasty in judging. Remember, it is almost impossible to foresee the changes that will be brought about by such <em>phase changes</em>. Oftentimes, we can only live it and then discuss it after the fact.</p><p>What I&#8217;ve discussed is just on the capability side of things. What about new business models <em>applying</em> these new capabilities that might be spawned out of this? After all, this is where rubber meets the road, where actual change in our lives really happen.</p><p>Already we are seeing the creative industry being disrupted. Copywriters, website designers, digital artists and many more creative vocations are being revolutionized as we speak. The shock that I had 6-9 months ago was that it was an industry that I had not foreseen been disrupted by AI until LLMs/GenAI. I had always thought, like many others, that it was the more mundane and repetitive work that will be disrupted. It was a lesson to me never to be complacent.</p><p>Manufacturing and logistics are also not spared as machines become more competent.</p><p>And what about the healthcare or home care industry? What used to be a high touch industry that is expensive for consumers may become increasingly accessible via personal assistants that people can <em>talk to</em>, remote assistance by human operators who can take care of multiple clients at once via AR/VR or even autonomous general purpose robots. Of course, we&#8217;d have to tune our definition of &#8220;touch&#8221; and &#8220;contact&#8221;. But that could lead to greater accessibility of crucial services.</p><p>Not all is well and good, no doubt. Man has the unique ability of using tools for the noblest means and his basest desires. Like how the internet drove the rise of online pornography. I&#8217;m sure that the recent advances in LLMs/GenAI will lead to nefarious use-cases that we cannot even begin to imagine. An example is the rise of deepfake scams, made almost indistinguishable from the &#8220;real&#8221; thing in the virtual world.</p><h1>Let&#8217;s be careful about this&#8230;</h1><p>This is why I believe regulation is important here. Just like how we have regulation to protect our children on the internet. Regulation, constructed by an inclusive  and mature community, is an avenue via which the world can take a stand against unbridled innovation. There are a lot of nuances and caveats on this point alone and it is a discussion for another time. </p><p>Those who know me would know that I am generally an advocate for innovation. But this is different. </p><p>We stand at a point where the minimum capability requirements are being satisfied across various technological (and non-tech) domains at the man-machine interface that would unleash a torrent of change. We don&#8217;t need to look far for an example of how such a thing has happened before. The whole big data revolution occurred because minimum requirements in data production (the internet) and data processing capabilities (data centers, efficient CPUs/GPUs and big data frameworks like Apache Hadoop and Apache Spark) were satisfied allowing people to process data on an unprecedented scale. I believe stand at that point right now. We need to be prepared. This is what Bezos would call a <a href="https://www.businessinsider.com/jeff-bezos-on-type-1-and-type-2-decisions-2016-4">Type 1</a>-like decision point.</p><h1>Overlapping changes</h1><p>As Vaclav Smil described in his book &#8220;<a href="https://mitpress.mit.edu/9780262539685/growth/">Growth</a>&#8221;, changes don&#8217;t happen linearly and often overlap. They also have a fractal nature to them. I.e. you will have growth cycles within growth cycles. It&#8217;s just like how the iPhone man-machine interface change is somehow embedded in the internet revolution (another man-machine interface change) and is an aider and abettor to the social network man-man interface change.</p><p>So if you&#8217;ve read up to this point, please do not think I&#8217;m saying that the world is a clean order sequence of changes. They are all interlinked and embedded within each other. Where the world ends up depends on a balance of forces between these changes. This is why such changes always changes the world in a way we cannot foresee.</p><p>Along the way, there are still many technical challenges to overcome. There will be many hiccups. It will seem as if a lot of resources are wasted. What is clear that the recent LLM/GenAI developments have shown that it is <em>possible</em> for machines to understand us. Just like how once man landed on the moon, going to outer space and colonizing other planets is no longer a pipe dream. The next phase of development will go beyond what we see now.</p><h1>Cascading changes</h1><p>This change in man-machine interface kicked off by LLMs/GenAI will trigger other changes. One of the changes is advancement in computational efficiency, as witnessed by LLM developments from DeepSeek and also novel computing architectures like those from <a href="https://corticallabs.com/">Cortical Labs</a>. Another change, when we think of how general purpose robots now have a viable market, is in battery technology where you can have high energy density and safe batteries to support home use robots. These are what Christensen would call <a href="https://www.christenseninstitute.org/theory/disruptive-innovation/">sustaining innovation</a> that would reinforce the LLM man-machine interface change.</p><p>There will also other related changes. One that I can think of is the need for authenticity guarantees with the proliferation of machine generated content. This is where distributed ledgers and NFTs might make a comeback. I can imagine a future where human generated content will be signed on a global blockchain. It will also result in human interaction and performances becoming highly valued.</p><p>As machines acquire more and more &#8220;thinking&#8221; abilities and are able to interface with our lives in a very physical or real way, we might also need to start considering them as &#8220;persons&#8221; with liabilities. This might spur a change in our legal framework which is a core piece of the man-man interface.</p><h1>Look beyond LLMs, GenAI or Agentic AI</h1><p>It might seem easy to dismiss LLMs as less then useful, overwhelmed by all the new models and benchmarks coming out on a daily basis or think that agentic AI&#8217;s lack of robustness makes it useless. But that is to see things from today&#8217;s perspective and to focus on the wrong things, in my opinion.</p><p>The way I see it, LLMs, GenAI and Agentic AI are only the <em>opening act</em> to a broader change in the man-machine interface. And that is what we need to pay attention to rather than specific developments in specific technologies.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[SuperGlue: An AI model for visual feature matching]]></title><description><![CDATA[Using AI to improve a fundamental process in Computer Vision]]></description><link>https://blog.thoughtsre.com/p/superglue-an-ai-model-for-visual</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/superglue-an-ai-model-for-visual</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 15 Feb 2025 09:45:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Feature matching between images is a process where correspondences between key points in disparate images is established. It enables one to understand how the view point containing an object in one image needs to be translated and rotated to achieve the view point in another image containing the same object but from a different perspective. It is a fundamental step in advance computer vision processes like <a href="https://thoughtsre.substack.com/p/codewalkthrough-image-stitching-with?r=2idqpb">image stitching</a> and <a href="https://thoughtsre.substack.com/p/experiment-structure-from-motion?r=2idqpb">Structure-from-Motion</a> (SfM). </p><p>In general, the greater number of matches that can be discovered between key points of two images, the better we can calculate the transformation matrices, such as the <a href="https://en.wikipedia.org/wiki/Homography_(computer_vision)">homography matrix</a> or the <a href="https://en.wikipedia.org/wiki/Fundamental_matrix_(computer_vision)">fundamental matrix</a>, that are needed to understand the perspective changes between the images.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This is why I was excited to try out <a href="https://huggingface.co/docs/transformers/main/en/model_doc/superglue">SuperGlue</a>, an AI model that helps you do feature matching, when I first heard about it. I was curious to find out whether it would do better than the traditional <a href="https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html">OpenCV methods</a> I was using in my <a href="https://thoughtsre.substack.com/p/codewalkthrough-image-stitching-with?r=2idqpb">previous experiments</a>.</p><p>The code for this experiment is <a href="https://gitlab.com/thoughtsre/trying-out-superglue/-/blob/master/demo_code.ipynb?ref_type=heads">here</a>. It&#8217;s not very complicated. Most of the code using SuperGlue is that given on the HuggingFace website.</p><h1>Overview</h1><p>In this experiment, I compare the feature matching results from SuperGlue with that from an ORB feature detector + brute force matcher. As a demonstration, I have three simple test cases which I ran through both approaches.</p><ol><li><p>A scene of an outdoor building. This is the demo picture from HuggingFace.</p></li><li><p>Two overlapping pictures of a mountain. This is taken from the <a href="https://cmsc426.github.io/">University of Maryland CMSC426 course</a>. I chose this as the mountain image contained rough textures that might be hard to match uniquely if you only look at local pixel features.</p></li><li><p>Cropped pictures of a pagoda at Mount Fuji. The pagoda has rich features from a human perspective and contains certain background features that might be useful to a human in matching the images. So I was wondering whether a machine can do the same.</p></li></ol><p>For the OpenCV approach, I used sensible parameters that I found effective in my <a href="https://thoughtsre.substack.com/p/codewalkthrough-image-stitching-with?r=2idqpb">previous image stitching experiment</a>.</p><h1>Results</h1><p>The matching results for the demo scene from Hugging Face is shown below. Not only were the number of matches found by SuperGlue is much greater than that by OpenCV, it also seems like the results by SuperGlue were more accurate. For example, the tip of the dome in the image on right was matched to the base of the dome in the image on the left by OpenCV.  </p><p>For the SuperGlue results, green, yellow and red lines indicate a good, OK and not-so-good match, respectively.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vmv1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vmv1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 424w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 848w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 1272w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Vmv1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png" width="1081" height="924" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:924,&quot;width&quot;:1081,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1151008,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vmv1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 424w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 848w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 1272w, https://substackcdn.com/image/fetch/$s_!Vmv1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9df517cd-d007-4ee1-816a-d00299d12eb9_1081x924.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The results for the mountain were similar. I was surprised by the number of matches found by SuperGlue. For a human it might be easy because we take in the entire picture and match points based on its relative position to key contextual features such as the peak of the mountain. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dgK6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dgK6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 424w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 848w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 1272w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dgK6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png" width="830" height="1203" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1203,&quot;width&quot;:830,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1259008,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dgK6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 424w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 848w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 1272w, https://substackcdn.com/image/fetch/$s_!dgK6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f72943f-eb7b-4dba-b3df-ce65b3224d5d_830x1203.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Again, the same for the mount Fuji test case. One thing that I found quite interesting as well is that SuperGlue managed to match the corners of the pagoda roof (indicated by the red squares). ORB was not able to match those features.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oZ4W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oZ4W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 424w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 848w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 1272w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oZ4W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png" width="800" height="1249" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc009e2e-f183-4541-88f0-1b661362344b_800x1249.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1249,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1696752,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oZ4W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 424w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 848w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 1272w, https://substackcdn.com/image/fetch/$s_!oZ4W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc009e2e-f183-4541-88f0-1b661362344b_800x1249.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Reason behind SuperGlue&#8217;s effectiveness</h1><p>Traditional methods such as ORB uses local features as key point descriptors. One can think of it as embeddings in today&#8217;s LLM parlance. These descriptors are often a simple function of the raw pixel values. For example, ORB uses the <a href="https://medium.com/@deepanshut041/introduction-to-brief-binary-robust-independent-elementary-features-436f4a31a0e6">BRIEF descriptors</a>, albeit with some additional machinery to account for feature rotation. Nonetheless, it is very much a description of the <em>local</em> environment of the keypoint in the image. And the matching of key points are based off these local descriptors.</p><p>SuperGlue on the other hand, uses an attention graph neural network. This graph neural network not only connects the key points within each image but also key points from the other image. This allows contextual information from the rest of the image to be taken into account. And I believe this makes all the difference in performance. This is why the corners of the pagoda tower can be detected as it is able to understand how that key point is related to all the other key points in the images.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lywW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lywW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 424w, https://substackcdn.com/image/fetch/$s_!lywW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 848w, https://substackcdn.com/image/fetch/$s_!lywW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 1272w, https://substackcdn.com/image/fetch/$s_!lywW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lywW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png" width="827" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:216,&quot;width&quot;:827,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56205,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lywW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 424w, https://substackcdn.com/image/fetch/$s_!lywW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 848w, https://substackcdn.com/image/fetch/$s_!lywW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 1272w, https://substackcdn.com/image/fetch/$s_!lywW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77e5278e-add0-421a-a82c-1176d3f3c744_827x216.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">SuperGlue Architecture. Taken from Sarlin, P. et al., 2020, SuperGlue: Learning Feature Matching with Graph Neural Networks. https://arxiv.org/abs/1911.11763</figcaption></figure></div><h1>Final remarks</h1><p>I was pleasantly surprised by the performance of SuperGlue and I can see how it can help boost the quality of the results of Computer Vision processes that depend on feature matching. I recall having issues calculating the homography matrix accurately due to a lack of matched keypoints, which then resulted in wildly distorted stitched images.</p><p>It is clear that using AI models in this way or maybe even multiple AI models in concert to replace previous methods of math/optimization-based calculations can potentially speed up many computationally heavy processes and can be very beneficial to many everyday processes.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Experiment: Structure-from-Motion using COLMAP]]></title><description><![CDATA[Generating 3D structures from 2D images]]></description><link>https://blog.thoughtsre.com/p/experiment-structure-from-motion</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/experiment-structure-from-motion</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Mon, 13 Jan 2025 02:21:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://en.wikipedia.org/wiki/Structure_from_motion">Structure-from-Motion</a> (SfM) is the process of generating 3D structures via information garnered from a collection of 2D images. In this experiment, I will walk through the use of the <a href="https://colmap.github.io/">COLMAP</a> application to do SfM as well as describe my learnings along the way.</p><p>This post is part of my journey to learn about SfM. See related posts <a href="https://thoughtsre.substack.com/t/structure-from-motion">here</a>.</p><h1>Overall process</h1><p>COLMAP uses a technique that is probably state-of-the-art around 6-7 years ago. I will be experimenting with new techniques such as <a href="https://huggingface.co/blog/gaussian-splatting">3D Gaussian Splatting</a> that have come up in the last couple of years in later posts.</p><p>The purpose of going through the use of COLMAP is to how the field came to be and also the fundamental concepts involved such as <a href="https://en.wikipedia.org/wiki/Epipolar_geometry">epipolar geometry</a>. </p><p>The first step is to identify good image features that would be easily identifiable from different perspectives. This is done usually via the <a href="https://en.wikipedia.org/wiki/Scale-invariant_feature_transform">SIFT</a> or <a href="https://en.wikipedia.org/wiki/Oriented_FAST_and_rotated_BRIEF">ORB</a> algorithms. For an example of how this is used, see my <a href="https://thoughtsre.substack.com/p/codewalkthrough-image-stitching-with?r=2idqpb">previous post</a> on image stitching.</p><p>The features are then matched across images. The OpenCV website has a great series of <a href="https://docs.opencv.org/4.x/db/d27/tutorial_py_table_of_contents_feature2d.html">tutorials</a> on this. The difficulty here is finding good image pairs to match each other&#8217;s features with. A series of techniques/optimizations such as using robust image descriptors are applied here. This is because, the better the matching the better the results in the later steps.</p><p>The next step is to find out the camera <a href="https://en.wikipedia.org/wiki/Pose_(computer_vision)">poses</a> of each image. Basically, what rotation and position relative to the object (or other cameras) is the camera at in the picture. This is done by solving for the <a href="https://en.wikipedia.org/wiki/Fundamental_matrix_(computer_vision)">fundamental matrix</a>, which is kind of like the <a href="https://en.wikipedia.org/wiki/Homography_(computer_vision)">homography matrix</a>, using the matched features in the previous step and epipolar geometry. In essence, you figure out what transformations are needed to move the features in one image to the matched feature positions in another. This then gives access to fundamental quantities such as the rotation and translation matrices as well as estimated intrinsic camera parameters (such as focal length).</p><p>The depth of the matched features are then found via triangulation. This gives a <a href="https://github.com/andreihar/3d-reconstruction">sparse representation</a> of the 3D structure. Why &#8220;sparse&#8221;? This is because only the matched feature depths are estimated and not for every pixel in the image. The difficulty here is that the depth of the features have to be <em>globally</em> consistent. All the images must agree.</p><p>The last step is dense reconstruction. In COLMAP, an algorithm called <a href="https://en.wikipedia.org/wiki/PatchMatch">PatchMatch</a> to find correspondences between different image patches. Then armed with the rotation and translation matrices and the camera intinsic parameters, the depth <em>and</em> normals of each patch is then figured out. For an explanation of the process see <a href="https://link.springer.com/chapter/10.1007/978-3-319-46487-9_31">this paper</a>. The normals are important as you can&#8217;t just know where the surface is if you want to construct a 3D mesh, you also need to know which direction it is <em>facing</em>. Then with both depth and normal a 3D mesh can be constructed using a process like <a href="https://hhoppe.com/poissonrecon.pdf">Poisson Surface Reconstruction</a>.</p><p>For detailed explanations, I have found <a href="https://www.research-collection.ethz.ch/handle/20.500.11850/295763">Schonberger&#8217;s thesis</a> to be a great resource. I dare not expound in details the inner workings of each step as I am not an expert. My understanding here may not be fully accurate / complete. The description above is simply an attempt to outline the overall process and break a big black box into a series of smaller black boxes according to my understanding. </p><p>There are also many variations to the above-mentioned process. Computer vision is a vast field.</p><p>As an aside, I had naively wanted to code SfM from scratch initially. That though was quickly vanquished once I saw how complicated the SfM pipeline was and realised that it would months of work before I saw any result. Something I might do if I were doing a postgraduate in Computer Vision (CV) but not as a CV hobbyist.</p><h1>Experiment Steps</h1><ol><li><p><a href="https://colmap.github.io/install.html">Install COLMAP</a></p><ol><li><p>I built COLMAP from source so that it can utilise my GPU</p></li></ol></li><li><p>Download <a href="https://demuc.de/colmap/datasets/">test images</a></p><ol><li><p>In this experiment, I used the &#8220;South Building&#8221; images</p></li></ol></li><li><p>Follow instructions in &#8220;<a href="https://colmap.github.io/tutorial.html#quickstart">Quickstart</a>&#8221;</p><ol><li><p>The auto-reconstruction mode is pretty straightforward. Simply point COLMAP to your project and image folders</p></li><li><p>There are more features that can be explored if you want fine-grained control or only want to use certain steps. I didn&#8217;t explore too much as I want to move on quickly to Gaussian Splatting.</p></li></ol></li><li><p>Visualise the results using <a href="https://www.meshlab.net/">MeshLab</a></p></li><li><p>[Optional] Visualise using <a href="https://www.blender.org/">Blender</a></p><ol><li><p>The .ply files can be readily imported into Blender.</p></li></ol></li></ol><h1>Results</h1><h2>Processing Time</h2><p>I didn&#8217;t keep exact time but the total processing time for the 128 images in the &#8220;South building&#8221; dataset took many hours. Most of the time was spent in the dense reconstruction stage. This is one of the reasons why I want to move on quickly to Gaussian Splatting, which only requires the results from the sparse reconstruction and promises quick rendering.</p><h2>Final output</h2><p>The point cloud from the sparse reconstruction is shown below. It&#8217;s really just a collection of points and doesn&#8217;t cover every single part of the building.</p><p>The red squares are the recovered camera orientations for each of the input images.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iQ7l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iQ7l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 424w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 848w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 1272w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iQ7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png" width="1262" height="782" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7c03712-be52-4660-a9dd-4251ff340475_1262x782.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1262,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:552572,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iQ7l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 424w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 848w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 1272w, https://substackcdn.com/image/fetch/$s_!iQ7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7c03712-be52-4660-a9dd-4251ff340475_1262x782.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After dense reconstruction and Poisson mesh generation, a 3D surface model is generated. This starts to look like something that you would see on Google Maps or a video game in the 2000&#8217;s. Note that not all parts of teh building is generated. Parts of the roof, where there isn&#8217;t enough information is missing from the generated mesh.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Jle!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Jle!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 424w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 848w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 1272w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Jle!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif" width="1262" height="782" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06382559-4cf5-467c-a881-945064f2c325_1262x782.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1262,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2044962,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Jle!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 424w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 848w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 1272w, https://substackcdn.com/image/fetch/$s_!2Jle!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06382559-4cf5-467c-a881-945064f2c325_1262x782.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Image requirements</h2><p>I had initially tried to do the reconstruction with my own images but the generation failed miserably. This is because the object of reconstruction, at least using COLMAP&#8217;s SfM pipeline, should be dominant in the images. Just like how the South Building takes up most of the pixel real estate in the images. Otherwise, most of the information will be coming from peripheral objects such as the table the object is on.</p><p>One way to circumvent this might be to segment the object out from the background before performing SfM.</p><h1>Final thoughts</h1><p>I honestly had no idea what I was getting into when I said I wanted to attempt SfM. If I had, I probably would have picked another topic to explore. Nonetheless, it was refreshing to learn about topics like epipolar geometry and how that could be used to solve for real-world parameters. </p><p>In digging deeper about SfM, I&#8217;ve also realised that traditional methods, like those demonstrated in OpenCV tutorials, have very high information requirements to begin. For example, the depth estimation tutorial would require that you know the camera&#8217;s intrinsic parameters which have to be obtained using a separate calibration procedure.</p><p>While the geometric approach can allow for very precise real-world results, I sometimes wonder whether the use of foundation vision models, such as <a href="https://github.com/DepthAnything/Depth-Anything-V2">DepthAnything</a> or <a href="https://github.com/facebookresearch/sam2">SegmentAnything</a> can be used in a pipeline to generate something that is useful and quick. That might be something I&#8217;ll explore after I&#8217;m done with Gaussian Splatting.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[CodeWalkthrough: Image Stitching with OpenCV]]></title><description><![CDATA[A pit-stop on my journey to Structure-from-Motion]]></description><link>https://blog.thoughtsre.com/p/codewalkthrough-image-stitching-with</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/codewalkthrough-image-stitching-with</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 31 Dec 2024 07:21:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89aaa682-fe97-4c01-bd83-b108f5202858_640x480.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Image stitching is a pit-stop on my journey to learn how to do 3D reconstruction from 2D images, also known as Structure-from-Motion (SfM). I&#8217;m hoping this would familiarise myself to the common concepts and libraries before I attempt SfM which would undoubtedly be harder. </p><blockquote><p><em><strong>Image stitching</strong> or <strong>photo stitching</strong> is the process of combining multiple <a href="https://en.wikipedia.org/wiki/Photograph">photographic</a> <a href="https://en.wikipedia.org/wiki/Image">images</a> with overlapping fields of view to produce a segmented <a href="https://en.wikipedia.org/wiki/Panorama">panorama</a> or high-resolution image. </em></p><p><em><a href="https://en.wikipedia.org/wiki/Image_stitching">Wikipedia</a></em></p></blockquote><p>In this post, I&#8217;ll be going through my attempt at image stitching using Python and <a href="https://opencv.org/">OpenCV</a>. I also used the <a href="https://cmsc426.github.io/">University of Maryland CMSC246 Computer Vision</a> course and various tutorials (most notably <a href="https://www.opencvhelp.org/tutorials/advanced/image-stitching/">here</a> and <a href="https://www.geeksforgeeks.org/image-stitching-with-opencv/">here</a>) to guide my learning.</p><p>Since my aim is to learn the concepts, I did not use <a href="https://docs.opencv.org/4.x/d8/d19/tutorial_stitcher.html">ready-made functions</a> in OpenCV for this work. Neither did I go crazy and started to code everything using Numpy. Instead, I tried to strike a middle ground by using helper functions in OpenCV that would calculate key quantities like key features and homography matrices that would be used to build towards a stitched image.</p><p>As usual, I encourage the reader to look through the <a href="https://gitlab.com/thoughtsre/journey-to-sfm/-/tree/main/journey-to-sfm/02-image_stitching?ref_type=heads">code repo</a> to follow along. For the impatient reader, the key Jupyter notebook that we&#8217;ll be looking through is <a href="https://gitlab.com/thoughtsre/journey-to-sfm/-/blob/main/journey-to-sfm/02-image_stitching/03%20-%20Image%20Stitching%20-%20Fuji.ipynb?ref_type=heads">here</a>.</p><h1>Image stitching process</h1><p>The image stitching process involves at least 2 images. For every pair of images, I1 and I2, the following steps are performed to create a single image.</p><h2>Feature Detection</h2><p>The first step is to detect notable features in both images. A <a href="https://en.wikipedia.org/wiki/Feature_(computer_vision)">feature</a> in an image are things like corners and edges. The usefulness of the image feature would also depend on its surrounding pixels. Like the red dots shown in the picture below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sjqd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sjqd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 424w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 848w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 1272w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sjqd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png" width="515" height="372" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:372,&quot;width&quot;:515,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:251566,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sjqd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 424w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 848w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 1272w, https://substackcdn.com/image/fetch/$s_!Sjqd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe016e2fd-7b5a-43f9-a9e2-a0f0a8beb208_515x372.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>These are points which would be informative when determining whether an object exists in pictures taken from different perspectives.</p><p>In this work, I used the <a href="https://docs.opencv.org/4.x/d1/d89/tutorial_py_orb.html">ORB feature detector</a> in the <code>match_features</code> function. The ORB feature detector calculates the coordinates of the key features as well as <em>descriptors </em>for each feature. Descriptors are arrays of numbers that encode key characteristics of the feature, much like embeddings in the LLM-space.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FQqn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FQqn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 424w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 848w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 1272w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FQqn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png" width="488" height="104" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:104,&quot;width&quot;:488,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17784,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FQqn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 424w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 848w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 1272w, https://substackcdn.com/image/fetch/$s_!FQqn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583aa328-d159-4403-9be6-02e59d4f0b6e_488x104.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Feature Matching</h2><p>The next step is to match the features in both images to each other. That is to say for example, the corner of the table in picture 1 is the same as that in picture 2.</p><p>Take the picture of Lenna below. On the left, I have a cropped <em>and</em> rotated image. On the right I have the full image. The red dots are the features detected individually in the previous step. The green lines are the features that are matched to each other. As can be seen, the eyes are matched to each other and there&#8217;s a single point on the nose that is matched in both images.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U2VK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U2VK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 424w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 848w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 1272w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U2VK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png" width="515" height="399" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:399,&quot;width&quot;:515,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:273957,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U2VK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 424w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 848w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 1272w, https://substackcdn.com/image/fetch/$s_!U2VK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F487804eb-af36-48b2-9d2b-9a6e8d9640d0_515x399.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I used the <a href="https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html">FLANN feature matcher</a> in this work. There is also a small filtering step to filter away weak matches.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JWJS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JWJS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 424w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 848w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 1272w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JWJS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png" width="842" height="263" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:263,&quot;width&quot;:842,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40668,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JWJS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 424w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 848w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 1272w, https://substackcdn.com/image/fetch/$s_!JWJS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F392cc7be-abd6-474f-9596-b2d84aa11b80_842x263.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Homography Matrix Calculation</h2><p>Now we reach the crucial step, figuring how to transform one image to match the other by calculating the <a href="https://docs.opencv.org/4.x/d9/dab/tutorial_homography.html">homography matrix</a>, which describes how an image transforms from one camera&#8217;s perspective to another. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y8UL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y8UL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 424w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 848w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 1272w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y8UL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png" width="903" height="253" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:253,&quot;width&quot;:903,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131181,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y8UL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 424w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 848w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 1272w, https://substackcdn.com/image/fetch/$s_!Y8UL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73d26275-741d-4650-b199-1e9db9abe8f6_903x253.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Taken from <a href="https://docs.opencv.org/4.x/d9/dab/tutorial_homography.html">OpenCV Tutorials</a></figcaption></figure></div><p>Seen below, the homography matrix <strong>H</strong> describes how a feature coordinate in I1, <em>(x, y)</em>, transforms to the coordinates in the I2, <em>(x&#8217;, y&#8217;)</em>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VI4X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VI4X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 424w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 848w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 1272w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VI4X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png" width="510" height="146.2" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:86,&quot;width&quot;:300,&quot;resizeWidth&quot;:510,&quot;bytes&quot;:5036,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VI4X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 424w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 848w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 1272w, https://substackcdn.com/image/fetch/$s_!VI4X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb15dcd6-31eb-4650-b4aa-b028862b323f_300x86.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Using the <code>findHomography</code> function in OpenCV, we find a homography matrix that best transforms all the matched features in I2 to the same coordinates in I1.</p><p>Say we have both images shown below, with the features matched. The mountain images are taken from the CMSC246 course.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tZJ1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tZJ1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 424w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 848w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 1272w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tZJ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png" width="1213" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1213,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1165725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tZJ1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 424w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 848w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 1272w, https://substackcdn.com/image/fetch/$s_!tZJ1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927cf85d-3bb8-4ff3-8a6b-13c79421460a_1213x824.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The homography matrix will help transform I2 to I1 as shown below. The steps to do so are in the next section.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gubq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gubq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 424w, https://substackcdn.com/image/fetch/$s_!gubq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 848w, https://substackcdn.com/image/fetch/$s_!gubq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 1272w, https://substackcdn.com/image/fetch/$s_!gubq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gubq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png" width="1085" height="921" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:921,&quot;width&quot;:1085,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:870274,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gubq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 424w, https://substackcdn.com/image/fetch/$s_!gubq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 848w, https://substackcdn.com/image/fetch/$s_!gubq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 1272w, https://substackcdn.com/image/fetch/$s_!gubq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb4a2641-6d93-4170-bf11-22903b7f07e7_1085x921.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Something to note here is that the matched key features should be well distributed across the entire image in order for the homography matrix to be well calculated. If the matched key features are concentrated in a small area in the image, then only the area where the features are found will be well transformed and the rest of the image might not.</p><h2>Image Merge</h2><p>Final step. Merge I1 and I2 into one image.</p><p>First, transform the corners of I2 using the homography matrix with the <code>perspectiveTransform</code> function. This would help figure out the final dimensions the merged image. As can be seen in the example above, the final image will be bigger than the individual images and I2 can end up higher/lower or to the left/right of I1, depending on the matched features. By transforming the corners of I2, the corners will end up with a coordinate relative to I1. See <code>merge_images</code> function.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lC69!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lC69!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 424w, https://substackcdn.com/image/fetch/$s_!lC69!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 848w, https://substackcdn.com/image/fetch/$s_!lC69!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 1272w, https://substackcdn.com/image/fetch/$s_!lC69!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lC69!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png" width="726" height="149" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41f3f370-5872-4599-9989-37abfc87b779_726x149.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:149,&quot;width&quot;:726,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29149,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lC69!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 424w, https://substackcdn.com/image/fetch/$s_!lC69!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 848w, https://substackcdn.com/image/fetch/$s_!lC69!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 1272w, https://substackcdn.com/image/fetch/$s_!lC69!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41f3f370-5872-4599-9989-37abfc87b779_726x149.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We then figure out the size of the final image. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jbCF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jbCF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 424w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 848w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 1272w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jbCF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png" width="726" height="326" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:326,&quot;width&quot;:726,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jbCF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 424w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 848w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 1272w, https://substackcdn.com/image/fetch/$s_!jbCF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675d73e5-916a-456a-a82d-a7959a354dc2_726x326.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The logic to get the final width, w and height, h, is shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JU_Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JU_Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 424w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 848w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 1272w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JU_Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png" width="578" height="660.8363636363637" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1069,&quot;width&quot;:935,&quot;resizeWidth&quot;:578,&quot;bytes&quot;:85202,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JU_Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 424w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 848w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 1272w, https://substackcdn.com/image/fetch/$s_!JU_Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc2eb5de-e3ab-46b0-b4be-52ea59472fb5_935x1069.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I2 is then transformed using the <code>warpPerspective</code> function, where M is the homography matrix.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ETbN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ETbN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 424w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 848w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 1272w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ETbN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png" width="726" height="45" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:45,&quot;width&quot;:726,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8225,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ETbN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 424w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 848w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 1272w, https://substackcdn.com/image/fetch/$s_!ETbN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b034890-f9a3-4ca5-b3f0-99939690b5ba_726x45.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Note here that there is another matrix Mt that is multiplied to M. This is a translation homography matrix that shifts I2 without rotation to account for the fact that the top left corner of I2 is no longer <em>(0, 0)</em>. To see the effect of Mt, look at this <a href="https://gitlab.com/thoughtsre/journey-to-sfm/-/blob/main/journey-to-sfm/02-image_stitching/04%20-%20What%20does%20a%20translation%20homography%20matrix%20do.ipynb?ref_type=heads">notebook</a>.</p><p>The 2 images are then added together using a mask matrix.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1qUs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1qUs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 424w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 848w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 1272w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1qUs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png" width="677" height="45" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:45,&quot;width&quot;:677,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8308,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1qUs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 424w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 848w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 1272w, https://substackcdn.com/image/fetch/$s_!1qUs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8794009-c51a-4e18-b6b6-10052b8b34dc_677x45.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><h1>Overall algorithm</h1><p>The overall process to merge a set of images is as follows:</p><ol><li><p>Given a set of images, set one image as I1 and another as I2.</p></li><li><p>Merge I1 and I2 using the above process to result in I3.</p></li><li><p>Set I3 as I1. </p></li><li><p>Select another image as I2 and repeat steps 1 to 3 till all the images are merged</p></li></ol><h1>Final output</h1><p>I made two examples. One of a mountain panorama (taken from the CMSC246 course).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-9gy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-9gy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 424w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 848w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 1272w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-9gy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png" width="1108" height="902" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:902,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1125331,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-9gy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 424w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 848w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 1272w, https://substackcdn.com/image/fetch/$s_!-9gy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcca959b-720a-45bd-be43-81f5e5aad998_1108x902.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And another one of a picture of Mount Fuji by <a href="https://unsplash.com/photos/a-pagoda-with-a-mountain-in-the-background-gegr9bc2nbU">Danill K from Unsplash</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cuSI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cuSI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 424w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 848w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cuSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif" width="640" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:177692,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cuSI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 424w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 848w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!cuSI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89d26866-24cd-4ade-a64e-92b246d45755_640x480.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B5zj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B5zj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 424w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 848w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 1272w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B5zj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png" width="1119" height="893" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:893,&quot;width&quot;:1119,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1119976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B5zj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 424w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 848w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 1272w, https://substackcdn.com/image/fetch/$s_!B5zj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F481674ae-d26b-4eae-8460-1c77f0252ded_1119x893.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h1>Final remarks</h1><p>I was stumped for a while first by the step involving the calculation of the final dimensions of the merged image and the need to include Mt in the homography matrix before transforming I2. Took a while to visualise the operations and figure out why all my images were turning out gibberish.</p><p>The final outputs were also not perfect. You can see the edges of the constituents images and some edges are not as well aligned. The exposure across images is also not consistent. However to get all the details perfect would take a lot more. See the example of the openCV pipeline <a href="https://raw.githubusercontent.com/opencv/opencv/4.x/samples/python/stitching_detailed.py">here</a>.</p><p>OpenCV provides a <a href="https://docs.opencv.org/4.x/d8/d19/tutorial_stitcher.html">convenient class </a>that can perform image stitching easily. The output can be seen below. The output is much better.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KTG2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KTG2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 424w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 848w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 1272w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KTG2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png" width="305" height="418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:418,&quot;width&quot;:305,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185793,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KTG2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 424w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 848w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 1272w, https://substackcdn.com/image/fetch/$s_!KTG2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1f2cd6c-b906-4efe-97fb-8f0413e7a977_305x418.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Nonetheless, this exercise has taught me about the key concepts behind image stitching. That&#8217;s enough for now. </p><p>Time to move on to structure-from-motion!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Experiment: Privacy Preserving Use of LLMs via LMStudio]]></title><description><![CDATA[Open-source LLMs with a slick chat interface. Most importantly, no sending of data to big tech.]]></description><link>https://blog.thoughtsre.com/p/experiment-privacy-preserving-use</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/experiment-privacy-preserving-use</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Fri, 22 Nov 2024 16:00:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Disclaimer: I am not paid by LMStudio or its affiliated companies. In fact I am not paid to write at all (I know, kind of sad&#8230;). This post records my experience of using LMStudio but may not be a rigorous test/review of all its functionalities/caabilities. </em></p><p>Large Language Models (LLMs) are changing the way people interact with information in an unprecedented fashion. The most famous example being <a href="https://chatgpt.com/">ChatGPT</a> by OpenAI. The ability to interact with information using natural language, to simply ask and get answers without having to learn an additional interface (keyboard, phone screen, website search bar) is almost magical at first contact.</p><p>This is also why big Tech like Google and Microsoft have both tried to capitalise on this by releasing their own LLMs. Countless startups have also sprung up in this Cambrian explosion of LLM-related innovation. Although the technology still has its problems such as hallucination and being extremely power hungry, innovation in AI and engineering is continually coming up with solutions to those challenges. It is quite clear, at least to me, that LLMs, or Generative AI in general, will have a huge impact in the way we live and work in the long run.</p><p>However, one thing still bothers me. And that is a lot of the mainstream LLMs (ChatGPT or <a href="https://ai.google/get-started/our-models/">Gemini</a>) are developed and hosted by big tech on <em>their</em> servers. Which means that everytime we interact with one of the big tech LLMs, we are sending data to the big tech servers. Regardless of what the companies say about protecting user data, I still feel that we should not just leave it to the big tech companies to protect our privacy. Especially as LLMs evolve to become part of a digital personal assistant that is with us 24/7 and knows every thing about our daily lives. Just think about how Google knows the whereabouts of all the people using Google Maps. Think about the power it gives Google.</p><p>The open-source community has risen up to this challenge and has released a plethora of very capable LLMs that are free to use. An example is <a href="https://www.llama.com/">Llama</a>, almost ironically released by Meta. And recently, I came across <a href="https://lmstudio.ai/">LMStudio</a>, an application that supports Windows (yuck), Mac and Linux (wooohoo!) and allows people to easily download open source LLMs for use <em>locally</em>, without sending data out of their computers. </p><p>So I decided to give it a go and share my experience here. </p><p>Sorry for the long intro. Let&#8217;s get on with the show.</p><h1>First impressions</h1><p>The user interace is very slick. And very familiar to any one who has used ChatGPT before. For those using Ubuntu, the app binary is in <code>.AppImage</code> format. If you find that you can&#8217;t run it by double-clicking on the binary, follow the <a href="https://linuxconfig.org/ubuntu-24-04-how-to-install-appimage">instructions here</a> to install fuse.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eVyp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eVyp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eVyp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:105657,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eVyp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!eVyp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18fbccb8-1731-4f93-a0ef-ad109917cf8a_1920x1080.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Model download</h1><p>This application even makes model download easy. Simply click on the magnifying glass on the left sidebar and you can select from a range of open-source models from HuggingFace.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P75e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P75e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!P75e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!P75e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!P75e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P75e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:281524,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P75e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!P75e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!P75e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!P75e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7295a22-ff06-4c2b-a0cd-7b4e3943b980_1920x1080.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For this experiment, I chose the <a href="https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF">Llama 3.2 1B model</a> that is 8-bit quantized. LMStudio uses <code>llama.cpp</code> as its underlying runtime which means it can run models in the <code>gguf </code>format and it does it really fast. Also, as the LLM community pushes performance limits, smaller models have actually become very capable for many tasks.</p><p>I will be running most of the examples below (save for the one where I run it on my 4 year old laptop) using my workstation that comes with an Nvidia RTX A4500 GPU. It automatically detected my GPU and used it. No fuss.</p><h1>Test 1: Simple conversation</h1><p>In this test, I simply conversed with the model by asking it a hypothetical question about Romeo and Juliet. In this case, the model is relying on its built-in knowledge. Note here that I am <em>not </em>evaluating the model performance, just the interface.</p><p>See a short recording of the interaction below. The speed you see is the actual speed of generation. I did not speed things up.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;ab407d58-fe5f-4392-a1ab-6c66842aad34&quot;,&quot;duration&quot;:null}"></div><p>Here&#8217;s an excerpt of the answer given to the question, &#8220;<em>Do you think Romeo and Juliet would end up together if their families weren't at war with each other?</em>&#8221;</p><blockquote><p><em>While the story as we know it is set in a time of conflict between the Montagues and Capulets, I think it's possible to imagine a scenario where Romeo and Juliet might still end up together even without the feud between their families.</em></p><p><em>In fact, there are many adaptations and reinterpretations of Romeo and Juliet that explore alternative scenarios, such as:</em></p><ol><li><p><em><strong>Romeo and Juliet 2.0</strong>: In this version, the families are not at war with each other, but they're more in harmony and respect each other's differences. The story explores how their families would adjust to living together peacefully.</em></p></li><li><p><em><strong>A utopian Romeo and Juliet</strong>: This interpretation takes place in a futuristic or alternate universe where the feud between the families is non-existent. In this world, Romeo and Juliet could still fall in love without any obstacles.</em></p></li><li><p><em><strong>Romeo and Juliet: A historical reimagining</strong>: This version might take place in ancient Rome, where the social norms are different from our own. Perhaps the families would have more flexible views on relationships and love, allowing for a peaceful coexistence between them.</em></p></li></ol><p><em>Even in these alternate scenarios, I think it's possible to imagine that Romeo and Juliet could still end up together if they chose to be in a relationship rather than being forced into one by societal pressures or family expectations. After all, their love is strong and true, and they're willing to overcome obstacles to be with each other.</em></p><p><em>&#8230;</em></p><p><em>Ultimately, the story of Romeo and Juliet is timeless, and its themes of love, passion, and loyalty continue to resonate with audiences today. While it's impossible to know for certain what would happen if their families weren't at war, I think it's possible that their love could thrive even without the obstacles they faced in the original version.</em></p></blockquote><h1>Test 2: Ask for up-to-date facts</h1><p>In this test, I asked for the the tourist attractions in Osaka as well as their websites. The prompt is as follows:</p><div class="pullquote"><p>What are the famous tourist attractions in Osaka? Give me the links to their websites as well.</p></div><p>Here&#8217;s the response:</p><blockquote><p><em>Osaka is a vibrant city with a rich cultural heritage, offering many exciting tourist attractions for visitors. Here are some of the most famous ones:</em></p><p><em><strong>Traditional Attractions:</strong></em></p><ol><li><p><em><strong>Osaka Castle</strong>&nbsp;(http://www.osakacastle.jp/)</em></p><ul><li><p><em>A 150-year-old castle that played a significant role in Japan's unification and is now a UNESCO World Heritage Site.</em></p></li></ul></li><li><p><em><strong>Dotonbori</strong>&nbsp;(http://dotonbori.info)</em></p><ul><li><p><em>A famous entertainment district known for its neon lights, street food, and unique attractions like the Glico Man sign.</em></p></li></ul></li><li><p><em><strong>Umeda Sky Building</strong>&nbsp;(http://www.umedaskybuilding.jp/en/index.php)</em></p><ul><li><p><em>The tallest building in Osaka, offering great views of the city from its observation deck.</em></p></li></ul></li></ol><p><em><strong>Food and Drink:</strong></em></p><ol><li><p><em><strong>Tsutenkaku Tower</strong>&nbsp;(http://www.tsutenkakutower.jp/)</em></p><ul><li><p><em>A 150-foot-tall tower with a large statue of the sun goddess Amaterasu, offering panoramic views of Osaka.</em></p></li></ul></li><li><p><em><strong>Dotonbori Riverwalk</strong>&nbsp;(http://www.dotenboririverwalk.com/)</em></p><ul><li><p><em>A scenic walking path along the Dotonbori River, featuring street food stalls and shops selling local specialties like takoyaki and okonomiyaki.</em></p></li></ul></li></ol><p><em><strong>Museums:</strong></em></p><ol><li><p><em><strong>Osaka Museum of History</strong>&nbsp;(http://museum.osaka.jp/eng/index.php)</em></p><ul><li><p><em>A museum showcasing Osaka's history, from prehistoric times to the present day.</em></p></li></ul></li><li><p><em><strong>National Museum of Art, Osaka</strong>&nbsp;(http://www.mhokuboku.go.jp/eng/)</em></p><ul><li><p><em>A world-class art museum featuring over 6,000 works of art.</em></p></li></ul></li></ol><p><em><strong>Other Attractions:</strong></em></p><ol><li><p><em><strong>Universal Studios Japan</strong>&nbsp;(http://usj-osaka.en.universal-studios.com/en/)</em></p><ul><li><p><em>An amusement park based on popular movies and TV shows like Harry Potter, Jurassic Park, and The Fast and the Furious.</em></p></li></ul></li><li><p><em><strong>Osaka Aquarium Kaiyukan</strong>&nbsp;(http://www.osakaoakari.kaiyukan.jp/en/index.php)</em></p><ul><li><p><em>A large aquarium featuring a variety of marine life from around the world.</em></p></li></ul></li></ol><p><em>Please note that some attractions may have restricted access due to maintenance or special events, so it's always best to check their websites before visiting. Additionally, Osaka is a bustling city with many other interesting attractions and experiences to explore beyond these famous ones!</em></p></blockquote><p>This is where the lack of an agentic approach where the LLM can invoke functions to crawl up-to-date information prevents an accurate response. For example, the <a href="https://osaka-info.jp/en/spot/national-museum-art-osaka/">National Museum of Art, Osaka</a>, has more than 8000 art pieces, not 6000. Also you would find that the URLs are hallucinations.</p><p>For thos not familiar with LLMs, all this is part and parcel of using LLMs and techniques like proper prompt engineering and retrieval augmented generation (RAG) have come a long way in overcoming such drawbacks. </p><p>Also, I&#8217;m using a very small model (by today&#8217;s standards), a larger model might work better but would not be a fair comparison when I try to run LMStudio on my old laptop later.</p><h1>Test 3: Read a document</h1><p>In this test, I uploaded my CV in PDF (yes, LMStudio supports document uploads) and asked the LLM the following question,</p><div class="pullquote"><p>My CV is attached. Looking at it? How would you describe my career? What do you think are my strengths? And what roles do you think I am suited for?</p></div><p>If the document can fit into the context window, LMStudio will parse it and place it in the LLM context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_n7y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_n7y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 424w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 848w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 1272w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_n7y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png" width="834" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:834,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:100101,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_n7y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 424w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 848w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 1272w, https://substackcdn.com/image/fetch/$s_!_n7y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bc5d057-508d-420d-928b-9ad9795fd6f7_834x673.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Pretty cool&#8230;</p><h1>Test 4: RAG. Yes, RAG.</h1><p>LMStudio also supports <a href="https://huggingface.co/docs/transformers/en/model_doc/rag">Retrieval-augmented Generation (RAG)</a>. This means that you can upload large amounts of text that don&#8217;t fit into an LLM&#8217;s context window and have the LLM look up and summarise information from all the text data you upload.</p><p>To test this, I uploaded 13 of Shakespeare&#8217;s plays that I downloaded from the <a href="https://www.gutenberg.org/">Gutenberg Project</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QhgL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QhgL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 424w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 848w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 1272w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QhgL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png" width="834" height="673" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b766165-6ae5-4230-8905-9b55df48c563_834x673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:673,&quot;width&quot;:834,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QhgL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 424w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 848w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 1272w, https://substackcdn.com/image/fetch/$s_!QhgL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b766165-6ae5-4230-8905-9b55df48c563_834x673.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The thing is, LMStudio only supports uploading 5 documents at a time. So I had to batch my uploads. But once I&#8217;m done, I can ask questions across the stories I&#8217;ve uploaded.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bGG5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bGG5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 424w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 848w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 1272w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bGG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png" width="834" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:464,&quot;width&quot;:834,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82003,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bGG5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 424w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 848w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 1272w, https://substackcdn.com/image/fetch/$s_!bGG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb68bd34d-d14e-48d9-849f-6c7722df6ede_834x464.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It even gives me the citations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XYx4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XYx4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 424w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 848w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 1272w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XYx4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png" width="883" height="456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:456,&quot;width&quot;:883,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91190,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XYx4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 424w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 848w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 1272w, https://substackcdn.com/image/fetch/$s_!XYx4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80db6d79-b9c7-4ddd-ad30-785075bbf8c1_883x456.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Final test: Running it on my 4 year old laptop</h1><p>So now I&#8217;m running it on my 4 year old laptop (Lenovo X13 Gen1, Ryzen 7, 16GB RAM). This is to test out how it feels like on a developer laptop rather than a workstation.</p><p>As you can see in the screen recording below, the generation is a lot slower. 8.24 tokens/sec versus 200+ tokens/sec on my workstation. But the over feel is still OK, given the light-weight Llama3 1b model. Not too bad for a 4 year old laptop.</p><p>Of course, it might struggle with larger models. But with lighter models getting more capable each day, I&#8217;m staying hopeful.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;f9c0f52f-91c6-4d4d-a46d-289ffff34595&quot;,&quot;duration&quot;:null}"></div><h1>Other functionalities</h1><p>An interesting thing I found is that LMStudio also supports serving out LLMs via an OpenAI-like REST API. I didn&#8217;t fully explore this. But it&#8217;s interesting.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z-TF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z-TF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 424w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 848w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 1272w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z-TF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png" width="1323" height="508" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:1323,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69932,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z-TF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 424w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 848w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 1272w, https://substackcdn.com/image/fetch/$s_!Z-TF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2957dad9-b99a-42b3-bf32-d460f566b4ab_1323x508.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Final thoughts</h1><p>The entire experience of using LMStudio was extremely pleasant. The user interface was slick. Model download was seamless. Runtime was fast thanks to llama.cpp. </p><p>The setup was also painfree. I didn&#8217;t have to compile anything. I didn&#8217;t have to setup llama.cpp on my own, host a REST API and then code up a half-assed interface using Gradio (not that Gradio itself is half-assed, just my UI skills are not great). Everything just worked.</p><p><em>More importantly, none of the data left my workstation.</em></p><p>True, the fact that I can&#8217;t upload hundreds of documents can be a pain. And the lack of agentic approach can limit the use cases to summarisation, re-writing or ideation. But that&#8217;s OK. I mostly use LLMs at this moment for those use cases anyway. Maybe LLM agents that can query websites can be made available on LMStudio. Most people use LLMs in the chat format anyway.</p><p>On top of that, I have the whole open-source ecosystem to draw on. So for example, I can use a coding specific LLM to answer my coding questions. Maybe there will be a medical LLM to which I can direct my healthcare questions or ask to summarise my health check reports.</p><p>All-in-all, I&#8217;m very impressed with LMStudio and will continue to incorporate it into my daily work.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[CodeWalkthrough: Bracket Pattern in Python]]></title><description><![CDATA[Safely deal with resources]]></description><link>https://blog.thoughtsre.com/p/codewalkthrough-bracket-pattern-in</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/codewalkthrough-bracket-pattern-in</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Fri, 15 Nov 2024 01:00:57 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/20d6d6e0-4cb4-4456-b249-6fff1c8b6176_999x999.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The <a href="https://typelevel.org/cats-effect/docs/2.x/typeclasses/bracket">bracket pattern</a> is something I first learnt from Cats Effect. It is used in the <a href="https://typelevel.org/cats-effect/docs/std/resource">Resource </a>typeclass so that you can ensure finalizing actions can be performed regardless of what happens. For example, a file needs to be closed after it has been read. The problem is, if an error occurs during the program, the file handle that was acquired may not be closed. The same can happen with database connections. The bracket pattern ensures that finalizer actions such as releasing the file handle will always be performed.</p><p>Such an effect can be achieved in Python using the <a href="https://docs.python.org/3/library/contextlib.html">context manager</a>. In fact, many objects in Python already implement this. An example is the file handle that you get when you open a file. This is why you can write <code>with open(&#8216;filename.txt&#8217;, &#8216;r&#8217;) as f</code>, and be sure that the file will be closed when you exit the <code>with</code> context.</p><p>More importantly, there will be times when you want to ensure certain custom finalizer action that is pertinent to your application is run for a custom resource or object. At that time, the bracket patterns comes in very handy.</p><p>In this article, I will go through an example of implementing the bracket pattern using the context manager. The code is <a href="https://gitlab.com/thoughtsre/bracket-pattern-in-python">here</a>.</p><h1>A custom resource object</h1><p>Suppose, you have a <a href="https://gitlab.com/thoughtsre/bracket-pattern-in-python/-/blob/main/bracket_pattern.py?ref_type=heads#L17">custom resource object</a> as follows:</p><pre><code>class Resource:
    def __init__(self, with_error = False):

        if with_error:

            raise ResourceAcquisitionError()
        
        return

    def doSomething(self):

        logging.info("Resource doing something!")
        logging.info("Performing some actions...")

    def doSomethingWithError(self):

        logging.error("Resource doing something that will throw an errors!")

        raise ResourceActionError()
    
    def release(self):

        logging.info("Releasing resource now.")
</code></pre><p>Here the resource object is a little contrived for demo purposes. During its initiation, if <code>with_error</code> is <code>True</code>, then an exception is raised to simulate a failure in acquiring the resource, such as <code>FileNotFoundException</code>. </p><p>There are then two methods: <code>doSomething</code> and <code>doSomethingWithError</code>, to simulate interacting with the resource and in the case of the latter, something goes wrong.</p><p>Lastly, there is a method to release the resource.</p><h1>The bracket pattern code</h1><p>The implementation of the bracket pattern uses the yield keyword. As such, control of the main program is passed from one function to another. To make it easier to follow, I&#8217;ll break the code down into parts.</p><p>First is the <a href="https://gitlab.com/thoughtsre/bracket-pattern-in-python/-/blob/main/bracket_pattern.py?ref_type=heads#L70">main program</a> interacting with the resource shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6P74!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6P74!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 424w, https://substackcdn.com/image/fetch/$s_!6P74!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 848w, https://substackcdn.com/image/fetch/$s_!6P74!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 1272w, https://substackcdn.com/image/fetch/$s_!6P74!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6P74!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png" width="1175" height="851" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:851,&quot;width&quot;:1175,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:333303,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6P74!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 424w, https://substackcdn.com/image/fetch/$s_!6P74!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 848w, https://substackcdn.com/image/fetch/$s_!6P74!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 1272w, https://substackcdn.com/image/fetch/$s_!6P74!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67388d5f-5a98-48b6-b230-c8f79d3c56cf_1175x851.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are three main parts labelled &#8220;1&#8221; for the acquisition of the resource, &#8220;2&#8221; for actions to perform when the acquisition fails and &#8220;3&#8221; for actually using the resource.</p><p>Next is the <a href="https://gitlab.com/thoughtsre/bracket-pattern-in-python/-/blob/main/bracket_pattern.py?ref_type=heads#L42">resource manager</a> which is decorated with the context manager from <code>contextlib</code>. This implements the bracket pattern. There are again 3 main parts labelled &#8220;4&#8221; to return the resource, &#8220;5&#8221; to handle any errors that arise from the use of the resource and &#8220;6&#8221; for performing the finalizing actions when everything is done.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UZUR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UZUR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 424w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 848w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 1272w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UZUR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png" width="542" height="522.3005893909627" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:981,&quot;width&quot;:1018,&quot;resizeWidth&quot;:542,&quot;bytes&quot;:400440,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UZUR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 424w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 848w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 1272w, https://substackcdn.com/image/fetch/$s_!UZUR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2e400d9-da6e-4db2-81c8-0713e24524e0_1018x981.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Demonstrating the bracket pattern</h1><p>The <a href="https://gitlab.com/thoughtsre/bracket-pattern-in-python/-/blob/main/bracket_pattern.py?ref_type=heads#L108">main code</a> in the repo demonstrates the use of the bracket pattern three scenarios: the &#8220;happy path&#8221;, the case where something goes wrong when interacting with the resource and failure in resource acquistion. </p><p>In the &#8220;happy path&#8221; where everything goes smoothly. The resource gets acquired in &#8220;1&#8221;. Then using the resource manager the resource is yielded in &#8220;4&#8221;. This yields control from the <code>resourceManager</code> back to <code>program</code>. Then we interact with the resource in &#8220;3&#8221; before we exit the <code>resourceManager</code> context which then performs the finalizer actions in &#8220;6&#8221;.</p><p>So &#8220;1&#8221; &#8594; &#8220;4&#8221; &#8594; &#8220;3&#8221; &#8594; &#8220;6&#8221;. </p><p>The output is seen below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oPyL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oPyL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 424w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 848w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 1272w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oPyL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png" width="487" height="187" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:187,&quot;width&quot;:487,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11289,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oPyL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 424w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 848w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 1272w, https://substackcdn.com/image/fetch/$s_!oPyL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff51aabf0-31ba-49cd-bb97-9e7ad27c1281_487x187.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>In the case where something goes wrong while interacting with the resource, we again acquire the resource in &#8220;1&#8221;, enter the <code>resourceManager</code> which yields the resource in &#8220;4&#8221; but this time, an exception is raised in &#8220;3&#8221;. At this point, control goes back to the <code>resourceManager</code> and the error is handled in &#8220;5&#8221;. This is where you as the developer can perform custom actions like logging detailed information of the error. And as usual, we arrive at the finalizer actions in &#8220;6&#8221;.</p><p>The process is &#8220;1&#8221; &#8594; &#8220;4&#8221; &#8594; &#8220;3&#8221; &#8594; &#8220;5&#8221; &#8594; &#8220;6&#8221;.</p><p>The output is shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kcrS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kcrS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 424w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 848w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 1272w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kcrS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png" width="650" height="233" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:233,&quot;width&quot;:650,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29524,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kcrS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 424w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 848w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 1272w, https://substackcdn.com/image/fetch/$s_!kcrS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feda7dc6a-5a01-497a-9954-185c3a02491b_650x233.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Lastly, in the case where the resource fails to be acquired at all, we go from &#8220;1&#8221; to &#8220;2&#8221; (where we can implement other behaviours like retries or logging) and finally ending the program.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lXXB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lXXB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 424w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 848w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 1272w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lXXB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png" width="862" height="118" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:118,&quot;width&quot;:862,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17202,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lXXB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 424w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 848w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 1272w, https://substackcdn.com/image/fetch/$s_!lXXB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3c46956-de15-425d-aa8a-bf79a0dfc046_862x118.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h1>Conclusion</h1><p>I hope that in this short post, you can see how with the use of try&#8230; except and context managers, we can ensure that resources and the exceptions raised in interacting with them can be safely handled. </p><p>This safety may not seem much in prototypes but will definitely be worth more than its weight in gold in production systems.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[CodeWalkthrough: Deep Learning in Scala using Storch and fs2]]></title><description><![CDATA[Functional deep learning on GPUs in Scala]]></description><link>https://blog.thoughtsre.com/p/codewalkthrough-deep-learning-in</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/codewalkthrough-deep-learning-in</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Fri, 08 Nov 2024 01:01:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iFL_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I fell in love with the expressiveness, type-safety and support for functional programming of Scala but the Scala ecosystem when it comes to Machine Learning (ML) is quite a bit behind that of Python (to be fair, it&#8217;s hard to find an ML ecosystem that can rival Python&#8217;s). So I was pleasantly surprised when I stumbled <a href="https://storch.dev">Storch</a>, a Scala deep learning library based on LibTorch with the ever-familiar PyTorch API. This allows me to potentially do my deep learning experiments in Scala.</p><p>Separately, I was inspired by <a href="https://www.youtube.com/watch?v=YSN__0VEsaw">a talk by Fabio Labella</a> that talked about using <a href="https://fs2.io/#/">fs2</a>, a library for coding &#8220;<em>functional, effectful, concurrent streams in Scala</em>&#8221;. This gave me an idea to try and apply an asynchronous, stream-based way of coding to the AI model training process. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>In this article, I&#8217;ll walk you through my experiment on using fs2 and <a href="https://typelevel.org/cats-effect/">Cats Effect</a> to string together the various steps in a basic AI model training. I encourage you to <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch">clone the repo</a> to follow along.</p><h1>A little on fs2 and Cats Effect</h1><p>Cats effect and fs2 are two key libraries in the <a href="https://typelevel.org/">typelevel ecosystem</a> that enables functional programming in Scala. </p><p>Cats effect implements an asynchronous runtime in Scala via the <code>IO</code> object as well as typeclasses such as <a href="https://typelevel.org/cats-effect/docs/std/ref">Ref</a>, which we will be using later to maintain shared states.</p><p>fs2 is a stream library that is widely used to manage complicated data streams. It relies on a key benefits that often comes out of functional programming: compositionality, which allows a process to be broken down into parts and seamlessly combined. It also implements primitives such as <a href="https://fs2.io/#/concurrency-primitives">topics, queues and channels</a> for more complex ways of combining data streams.</p><p><em>Side note: Personally, fs2 is one of the most amazing libraries I&#8217;ve come across. It&#8217;s super powerful and expressive.  On top of that, the pull-based mechanism for streaming data allows constant-time processing, keeping compute resource usage manageable.</em></p><h1>Setup</h1><p>My aim here is to demonstrate the use of functional streams for deep learning with GPU on Scala. So I&#8217;ll be using an oldie-but-goodie, the MNIST digits classification, as a use case.</p><p>In this way, I can focus a little bit more on Storch and fs2.</p><p>Storch also conveniently provides an API to the MNIST dataset (like how PyTorch has an API for commonly used ML datasets) shown below (click <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch/-/blob/master/src/main/scala/main.scala?ref_type=heads#L147">here</a> for link to code).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UdUJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UdUJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 424w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 848w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 1272w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UdUJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png" width="600" height="93" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:93,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UdUJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 424w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 848w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 1272w, https://substackcdn.com/image/fetch/$s_!UdUJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd4752d1-6dae-424c-8ab0-3b0e2cbb8060_600x93.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>As mentioned, Storch implements the PyTorch API. The neural network used is a <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch/-/blob/master/src/main/scala/main.scala?ref_type=heads#L10">simple multilayer perceptron</a> seen below. The syntax should be familiar to any PyTorch user. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!11QB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!11QB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 424w, https://substackcdn.com/image/fetch/$s_!11QB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 848w, https://substackcdn.com/image/fetch/$s_!11QB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 1272w, https://substackcdn.com/image/fetch/$s_!11QB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!11QB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png" width="655" height="297" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:297,&quot;width&quot;:655,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26076,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!11QB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 424w, https://substackcdn.com/image/fetch/$s_!11QB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 848w, https://substackcdn.com/image/fetch/$s_!11QB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 1272w, https://substackcdn.com/image/fetch/$s_!11QB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f4e6e02-f379-4723-a261-66e355e5faec_655x297.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Training step</h1><p>fs2 works on the concept of <a href="https://fs2.io/#/guide?id=building-streams">Streams</a>. For those who are not familiar, think of it as a river of data (or &#8220;1&#8221;s and &#8220;0&#8221;s) which keeps following until the program ends or the data is exhausted. Importantly, fs2 streams are <em>pull-based</em>. This means that data isn&#8217;t continuously fed from the input end but <em>pulled in</em> from the exit end of the stream.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iFL_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iFL_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 424w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 848w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 1272w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iFL_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png" width="538" height="488.7645376549094" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:953,&quot;width&quot;:1049,&quot;resizeWidth&quot;:538,&quot;bytes&quot;:156187,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iFL_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 424w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 848w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 1272w, https://substackcdn.com/image/fetch/$s_!iFL_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff27b10d6-3e92-4469-b413-61bd7b01a1cb_1049x953.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The nice thing about pull-based streams is that data is only delivered and computed just in time. Only when the consumer end requires the data will the data be delivered. Compare this to a push-based stream where data is delivered until either some queue is full or the consumer can no longer handle more computation, a pull-based stream  means that the resource usage can be kept manageable.</p><p>The training step of updating the model parameters by showing the model various batches of data can be seen below. The training data is delivered in batches via the training data stream on the left. Each batch is then put through the model update step which updates the model weights followed by the storing of the training statistics such as training cost. The AI model here is initiated as a shared object so that each batch of data updates the same model. The same goes for the training statistics. This proceeds until all the data in the training data stream is exhausted.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7-yg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7-yg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 424w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 848w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 1272w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7-yg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png" width="1331" height="751" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:751,&quot;width&quot;:1331,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:71465,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7-yg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 424w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 848w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 1272w, https://substackcdn.com/image/fetch/$s_!7-yg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23aa50f6-6aba-4100-ae4b-93e5fb51029c_1331x751.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The code for the above is <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch/-/blob/master/src/main/scala/main.scala?ref_type=heads#L83">here</a>. Note that the return type of <code>Pipe[IO, Sample, Unit]</code>. An fs2 pipe is a function that takes a stream of one type and transforms it into a stream of another type. In our case, <code>Pipe[IO, Sample, Unit] </code>is the same as <code>Stream[IO, Sample] =&gt; Stream[IO, Unit]</code>. It&#8217;s job is to perform the key model training steps for each batch of data. I won&#8217;t spend too much time dwelling on the syntax of fs2. Interested readers should read the documentation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y3Pi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 424w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 848w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 1272w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png" width="1430" height="698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:698,&quot;width&quot;:1430,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:395751,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 424w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 848w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 1272w, https://substackcdn.com/image/fetch/$s_!Y3Pi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F051b6c45-d7ad-42ea-9f57-3659cb2a7047_1430x698.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Taking a closer look, <code>trainPipe</code> takes in a <code>sampleStream</code> as input, this is the training data stream. The data is batched using <code>sliding</code> and <code>stackXY</code> reorganises each chunk of data into a tuple of inputs and targets. The key operations are in the <code>evalMap</code> section, where each batch of data is then mapped to the asynchronous function that updates the model and training statistics.</p><p>Note that the model (<code>model</code>) and training statistics (<code>stats</code>) such as are passed to <code>trainPipe</code> as <code>Ref</code>. Ref is a typeclass in Cats Effect that allows for shared states. This allows for the training and validation streams to refer to the same data objects.</p><h1>Validation step</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_GP0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_GP0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 424w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 848w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 1272w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_GP0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png" width="1321" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1321,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73163,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_GP0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 424w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 848w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 1272w, https://substackcdn.com/image/fetch/$s_!_GP0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7639e59-37fa-43a0-925e-aa56446a0a88_1321x756.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The validation step is coded similarly to the training step where there is a validation data stream fed to a pipe. See below (code <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch/-/blob/master/src/main/scala/main.scala?ref_type=heads#L106">here</a>).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9Nuw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9Nuw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 424w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 848w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 1272w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9Nuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png" width="1438" height="695" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:695,&quot;width&quot;:1438,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:403169,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9Nuw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 424w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 848w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 1272w, https://substackcdn.com/image/fetch/$s_!9Nuw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40e415c2-6e36-432c-90f3-6a372ad8d032_1438x695.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The key thing to note here is the fold method which accumulates all the results from the validation batches into a final validation cost and accuracy of the entire validation data stream. Since fs2 is built on Cats and Cats Effect, typeclasses like <a href="https://typelevel.org/cats/typeclasses/applicative.html">Applicative</a>, <a href="https://typelevel.org/cats/typeclasses/monad.html">Monad</a> and <a href="https://typelevel.org/cats/typeclasses/foldable.html">Foldable</a> are available. This enables semantics like <code>fold</code> and <code>flatMap</code> which opens up the expressiveness of the language.</p><h1>Putting it all together</h1><p>Now that we have each step of the training process in place, we need to put them together as follows.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4qLk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4qLk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 424w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 848w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 1272w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4qLk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png" width="1456" height="657" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62682,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4qLk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 424w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 848w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 1272w, https://substackcdn.com/image/fetch/$s_!4qLk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23e4afa6-69c7-437c-b392-bcf31b9684a4_1488x671.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After each training and validation epoch, we need to repeat the process for a number of times. Now this is where fs2 with its functional programming roots really shine. One of the key benefits of functional programming is <a href="https://en.wikipedia.org/wiki/Function_composition_(computer_science)#First-class_composition">compositionality</a>. I.e. we can create complex patterns by composing functions. This can be seen in the main run function <a href="https://gitlab.com/thoughtsre/deep-learning-in-storch/-/blob/master/src/main/scala/main.scala?ref_type=heads#L175">here</a>.</p><p>Taking a closer look, the loop is implemented simply as an append and repeat.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U2sD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U2sD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 424w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 848w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 1272w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U2sD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png" width="1456" height="273" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:273,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:259798,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U2sD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 424w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 848w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 1272w, https://substackcdn.com/image/fetch/$s_!U2sD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c6dec1d-1922-497d-bbe9-d9504f21e169_2308x433.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I can safely do this because of the semantics made available by fs2. I didn&#8217;t have to do any boilerplate to manipulate types or additional control flow.</p><h1>One last thing&#8230; IOs&#8230;</h1><p>You would have notices that the data type <code>IO</code> throughout this demo. <code>IO</code>s are a basic higher order data type from Cats Effect that allows for asynchronous computing. In a nutshell, what I&#8217;m doing when I&#8217;m instantiating the functions in my program is I&#8217;m collecting a series of instructions to be run at a later time. Kind of how Apache Spark collects the data processing instructions into a Directed Acyclic Graph (DAG) to be lazily evaluated later.</p><p>For those coming from an interpreted language like Python, myself included, this can take a little getting used to.</p><h1>Final Thoughts</h1><h2>Out-of-memory datasets</h2><p>In this demo, I had the MNIST dataset loaded in memory since it is small enough. But you can read data from <a href="https://fs2.io/#/io?id=files">files</a> or <a href="https://typelevel.org/doobie/index.html">databases</a> in a stream-like fashion, keeping memory usage capped. </p><h2>Complex training patterns with fs2 primitives</h2><p>More complex model training patterns can also be enabled by fs2&#8217;s primitives. For example, training a model from various disparate data sources can be enabled with <a href="https://fs2.io/#/concurrency-primitives?id=queue">Queue</a>. Or training multiple models concurrently from a single data source can be done via publish-subscribe with <a href="https://fs2.io/#/concurrency-primitives?id=topic">Topics</a>.</p><h2>Scala ML Ecosystem </h2><p>While Scala has found widespread use on the data engineering scene (think <a href="https://spark.apache.org/">Apache Spark</a> and <a href="https://flink.apache.org/">Apache Flink</a>), libraries for numerical computing is generally lacking compared to Python. For example, Scala&#8217;s numpy equivalents like Breeze are not as actively maintained. Availability of graphing libraries like matplotlib for Python is also not very good. If I needed to do machine learning, Apache Spark is likely my best bet, but I don&#8217;t quite want to run a Spark clusters (even a local one) just to train a simple model.</p><p>Storch is a step in the right direction by introducing a Scala wrapper over LibTorch. It&#8217;s still not as fully featured as PyTorch but the support for tensors and autogradients offer an opportunity for numerical computation on both CPUs and GPUs.</p><p>Other nice things in the Scala ecosystem include:</p><ul><li><p>Support of <a href="https://github.com/vericast/spylon-kernel">Scala kernel</a> in Jupyter notebooks</p></li><li><p><a href="https://polynote.org/latest/">Polynote</a>, a Scala notebook which allows sharing data objects to Python for plotting</p></li></ul><p>One thing that I thought would be a good idea is a Scala API for the <a href="https://pola.rs/">Polars</a> dataframe library.</p><p></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Experiment: Trying out Metabase]]></title><description><![CDATA[Playing around with an alternative to PowerBI and Superset]]></description><link>https://blog.thoughtsre.com/p/experiment-trying-out-metabase</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/experiment-trying-out-metabase</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sat, 02 Nov 2024 07:49:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Disclaimer: I am not paid by metabase or any of its competitors. This is also not a full-blown rigorous review of metabase. I&#8217;m just playing around with it out of curiosity.</em></p><p>Metabase is what I would describe as a Business Intelligence (BI) and dashboard building tool that seeks to democratise the use and exploration of data. It is an alternative to well-known tools like <a href="https://www.microsoft.com/en-us/power-platform/products/power-bi">PowerBI</a>, <a href="https://www.tableau.com/">Tableau</a> and <a href="https://superset.apache.org/">Superset</a>. I came across it sometime back and decided to take it out for a quick spin, just because&#8230;</p><h1>Getting things up and running</h1><p>Metabase was pretty simple to get up and running in an local setup using Docker compose. This is provided in their <a href="https://www.metabase.com/docs/latest/installation-and-operation/running-metabase-on-docker#example-docker-compose-yaml-file">setup instructions</a>. Just use the Docker compose YAML below and you&#8217;ll be up and running in less than a minute (depending on your network speed of course). <em>Please note that this is not suited for production/enterprise setup.</em></p><pre><code>version: "3.9"
services:
metabase:
image: metabase/metabase:latest
container_name: metabase
hostname: metabase
volumes:
- /dev/urandom:/dev/random:ro
ports:
- 3000:3000
environment:
MB_DB_TYPE: postgres
MB_DB_DBNAME: metabaseappdb
MB_DB_PORT: 5432
MB_DB_USER: metabase
MB_DB_PASS: mysecretpassword
MB_DB_HOST: postgres
networks:
- metanet1
healthcheck:
test: curl --fail -I http://localhost:3000/api/health || exit 1
interval: 15s
timeout: 5s
retries: 5
postgres:
image: postgres:latest
container_name: postgres
hostname: postgres
environment:
POSTGRES_USER: metabase
POSTGRES_DB: metabaseappdb
POSTGRES_PASSWORD: mysecretpassword
networks:
- metanet1
networks:
metanet1:
driver: bridge </code></pre><p>The first thing once you navigate to <code>localhost:3000</code>, you&#8217;ll be greeted with a user creation screen. Just enter the details as you see fit, hit &#8220;Take me to Metabase&#8221; and you&#8217;re in!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jPBe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jPBe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 424w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 848w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 1272w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jPBe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png" width="336" height="386.17921146953404" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:962,&quot;width&quot;:837,&quot;resizeWidth&quot;:336,&quot;bytes&quot;:42835,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jPBe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 424w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 848w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 1272w, https://substackcdn.com/image/fetch/$s_!jPBe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff28eb12a-143a-4d72-a0b5-4c4173f3e05f_837x962.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Connecting a database</h1><p>I decided I should try and connect a BiqQuery public dataset as my test data to test out how easy it is to connect to remote databases. For those of you who don&#8217;t know, Google BiqQuery hosts a series of <a href="https://cloud.google.com/bigquery/public-data">public datasets</a> for people to use. </p><p>In this experiment, I chose to use the Chicago Crime dataset. I started by first copying the dataset into my own Google Cloud project. This is pretty simple. Navigate to the dataset and click &#8220;Copy&#8221; on the top right and enter the required details (you can even set the dataset to expire after a certain number days so that you don&#8217;t keep it in your project indefinitely).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L5nv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L5nv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 424w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 848w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 1272w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L5nv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png" width="1456" height="722" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:722,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:234301,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L5nv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 424w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 848w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 1272w, https://substackcdn.com/image/fetch/$s_!L5nv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b822e85-9dc3-46f0-bef9-7e09b748f634_1919x951.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Next to allow metabase to reach the dataset in my Google Cloud project, I created a service account with the requisite privileges. See instructions <a href="https://www.metabase.com/docs/latest/databases/connections/bigquery">here</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nndp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nndp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 424w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 848w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nndp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png" width="1456" height="434" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:434,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:79899,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nndp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 424w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 848w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Nndp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4350442f-a267-415c-af3f-d1eadd3a6c38_1901x567.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After which, I downloaded the JSON key for the service account and added it to Metabase in the &#8220;Add Database&#8221; page. As can be seen below, many other database connectors are available.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vmBQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vmBQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 424w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 848w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 1272w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vmBQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png" width="797" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:797,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39048,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vmBQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 424w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 848w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 1272w, https://substackcdn.com/image/fetch/$s_!vmBQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78f740cd-d568-432c-a62d-5bee6b5cc707_797x756.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Once that was done, I was able to preview the data immediately. Tada!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J4Ly!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J4Ly!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 424w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 848w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 1272w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J4Ly!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png" width="1456" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:192497,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J4Ly!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 424w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 848w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 1272w, https://substackcdn.com/image/fetch/$s_!J4Ly!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4101b76-a4d7-4eaf-aad8-fa7c8cbb22b6_1918x659.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Using Metabase</h1><p>The overall experience was pretty much similar to that of Superset. Once you have your dataset ready and connected, you are able to filter and summarise via a drag-and-drop UI (seen below) or via SQL statements.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oMjx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oMjx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 424w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 848w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 1272w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oMjx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png" width="1297" height="908" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:908,&quot;width&quot;:1297,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50268,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!oMjx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 424w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 848w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 1272w, https://substackcdn.com/image/fetch/$s_!oMjx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29814022-a2dd-4dfb-9c54-bbe057d2f333_1297x908.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The thing I had to get used to was the idea of <a href="https://www.metabase.com/learn/metabase-basics/getting-started/ask-a-question">Questions</a> and <a href="https://www.metabase.com/learn/metabase-basics/getting-started/models">Models</a>. Questions are basically your SQL statements. Each question is coupled with a visualisation, of which there are many to choose from. Models are like your database views. They too can have a visualisation coupled with it. The key difference between a question and a model, as far as I can tell, is that models allow other questions to be built on top of it, which makes sense.</p><p>There&#8217;s also the idea of <a href="https://www.metabase.com/docs/latest/exploration-and-organization/collections">Collections</a> to group your questions and models. I guess, it&#8217;s to support different use cases on the same datasets.</p><p>Once you have all the visualisations read, you can put them all in a dashboard like below.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sdng!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sdng!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 424w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 848w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 1272w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sdng!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png" width="1456" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:374847,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sdng!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 424w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 848w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 1272w, https://substackcdn.com/image/fetch/$s_!Sdng!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95baac1d-70d1-429d-86ad-6035215354f0_1915x994.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>There were also other features like updating the data sync frequency and allowing people to subscribe to dashboard changes that I didn&#8217;t try.</p><h1>Final thoughts</h1><p>On the whole, I thought the user experience was rather pleasant and somewhat similar to what you would get in Superset or PowerBI. There was not much to complain about. The data onboarding was uneventful. The data refresh was pretty snappy (of course, it depends on the network and infra conditions you have. BigQuery&#8217;s pretty quick for most things). The organisation of the entire app was rather logical once you understand its concepts of questions and models. </p><p>I can also see Metabase&#8217;s utility to non-technical users with some knowledge of data modelling and manipulation. You would still need to understand group by and all that but you don&#8217;t need to code. </p><p>Although like any other BI tool out there, once the team gets big and you start to have many datasets, effort needs to be put in to maintain organisation and ensure that performance does not degrade. Even in the simple experiment I did here, BigQuery was queried several times with each question formulation. I can only imagine what that would do to on-premise databases with multiple teams trying to get their questions answered.</p><p>As to how such a tool will fit into the &#8220;ChatGPT-everything&#8221; age, I still feel that having drag-and-drop filters help and can be more efficient than simply using Large Language Models (LLMs) to go from question to answer. This is because it can be hard to express things in our heads to an LLM, whereas having the right buttons to click allows us to quickly narrow down the search field. That being said, LLMs can perhaps play a role in correcting our SQL statements or helping us discover other datasets within the organisation.</p><p>Lastly, as I was using Metabase to generate the visualisations, I can&#8217;t help but feel a little frustrated at the visualisation controls. I felt that I could have done things a lot faster in Python. Perhaps this is due to my training as a data scientist. Perhaps I&#8217;m not the intended audience for Metabase. But I thought it&#8217;d be great if there&#8217;s a way for me to run Python-based queries and create graphs using matplotlib or plotly. In this way, I can do things quickly and yet use the other functions of Metabase like dashboards or notifications.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[CodeWalkthrough: Sudoku in Scala]]></title><description><![CDATA[Coding my own Sudoku Generator and Solver]]></description><link>https://blog.thoughtsre.com/p/codewalkthrough-sudoku-in-scala</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/codewalkthrough-sudoku-in-scala</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Mon, 09 Sep 2024 15:30:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I love playing Sudoku. I must have played at least 1000 games over the years. Hence, it&#8217;s quite a wonder how I&#8217;ve thought of coding my own Sudoku generator and solver only recently. More than that, it is only when I started to think about how to code my own Sudoku that I realized that the game that I&#8217;ve almost taken for granted and have used to simply pass time, is <a href="https://arxiv.org/abs/1208.0370">much more complicated that it seems</a>.</p><p>So, here&#8217;s my attempt at coding my own Sudoku generator and solver. </p><p>The code repository is <a href="https://gitlab.com/bzai-public/sudoku-in-scala">here</a>. As usual, I recommend that the reader download the code and follow along. Links to the exact lines of code will be provided in the post.</p><h1>Nomenclature</h1><p>A Sudoku game is a 9x9 <em>grid</em>. Each <em>grid</em> is a 3x3 grid of <em>blocks</em>, which are themselves 3x3 grids of <em>cells</em>. Each cell can be empty or filled with the numbers &#8220;1&#8221; to &#8220;9&#8221;. In the code you will see later, a cell with a value of &#8220;0&#8221; means it is empty. The aim of the game is to fill the entire grid with the numbers &#8220;1&#8221; to &#8220;9&#8221; while ensuring that the numbers do not repeat in each row, column and block simultaneously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YU3V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YU3V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 424w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 848w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 1272w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YU3V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png" width="418" height="470.36093418259026" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1060,&quot;width&quot;:942,&quot;resizeWidth&quot;:418,&quot;bytes&quot;:38507,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YU3V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 424w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 848w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 1272w, https://substackcdn.com/image/fetch/$s_!YU3V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1bf4d89-3f08-4d04-b13c-97b71898beac_942x1060.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A Sudoku grid with empty cells waiting to be filled in is a Sudoku puzzle. The pre-filled numbers in a Sudoku puzzle are called clues.</p><h1>Some design choices in the code</h1><p>Let me first highlight some design choices I made so that it might be easier to understand why I coded the Sudoku solver and generator the way I did. The first design choice I made was to use <code>Vector </code>as the data structure to store the Sudoku grid. In Scala, <code>Vector</code> is an immutable data structure. This means that you can&#8217;t change it, you can only make a copy of it with an updated/new element. This is because, as you will see later, I will be storing the grid at different stages of the solution. Using an immutable data structure, gives me the piece of mind that I won&#8217;t have any weird behaviours that affect earlier states while editing later states.</p><p>Secondly, I approached the problem with Functional Programming. As a result, you will see that I will often return a new instance of an object after some action taken, instead of returning a modified object. This is also related to the immutability of Scala vectors where each update results in a new copy.</p><h1>Grid Trajectories</h1><p>Another thing that you will see in the code is what I call <em>grid</em> <em>trajectories.</em> (This is my own definition, not a conventionally known term.) Basically, the act of playing Sudoku is to move the grid from one state to another via the action of filling up a cell.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-YW-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-YW-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 424w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 848w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 1272w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-YW-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png" width="1456" height="354" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101920,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-YW-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 424w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 848w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 1272w, https://substackcdn.com/image/fetch/$s_!-YW-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735f9384-409b-4dbe-9ef4-eb1b011022a8_2026x493.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>A grid trajectory is the history of all the states that the Sudoku grid has been through together with the actions that resulted in that state. The definition in code is <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L10">here</a>, as well as below. A <code>CellAction</code>, contains the row and the column of the cell being edited as well as the number that was used to fill the cell.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g5Mn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g5Mn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 424w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 848w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 1272w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g5Mn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png" width="447" height="135" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/271b279b-a099-4096-88fb-f9bec385bda3_447x135.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:135,&quot;width&quot;:447,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13401,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g5Mn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 424w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 848w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 1272w, https://substackcdn.com/image/fetch/$s_!g5Mn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F271b279b-a099-4096-88fb-f9bec385bda3_447x135.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This will become useful to implement <a href="https://en.wikipedia.org/wiki/Memoization">memoization</a> in the <a href="https://www.geeksforgeeks.org/backtracking-algorithms/">backtracking algorithm</a> which I will describe presently.</p><h1>Backtracking Algorithm</h1><p>The backtracking algorithm is a well known algorithm that is used when exploring a solution space where there are many valid solutions in a systematic way. Basically, you keep trying taking actions that are valid, transitioning the Sudoku grid from one state to another. If you reach an invalid state, you undo your previous action, revert the Sudoku grid into its previous state and select from the remaining valid actions from the previous state. You keep doing this until at least one solution is found.</p><p>To do this in code, usually you would use a recursive function that tracks the history of the states. An example can be seen in my implementation of the solver function. Note that the <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L274">main recursive function</a> takes a <code>GridTrajectory</code> that enables us to revert to previous states. See also the <code>backTrack</code><a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L137"> function</a> that takes in a <code>GridTrajectory</code> and returns a <code>GridTrajectory</code>.</p><p>Taking a closer look at how the backtracking algorithm progresses below, we start from &#8220;State 2&#8221; and say we select a cell action that progresses to &#8220;State 3&#8221;. After &#8220;State 3&#8221;, let&#8217;s say there is only one state left to which we can go. However, that state is an invalid state. In Sudoku, an invalid state is one in which there is at least one cell where there are no valid options. As a result, we can only move back to &#8220;State 3&#8221; (see &#8220;Backtrack 1&#8221;).</p><p>Since there are no more further states from &#8220;State 3&#8221;, we can only backtrack further to &#8220;State 2&#8221; (see &#8220;Backtrack 2&#8221;).  The only state left from &#8220;State 2&#8221; is also invalid, which means we end up in &#8220;State 2&#8221; again. Having no more options, we again backtrack, this time to &#8220;State 1&#8221; (see &#8220;Backtrack 5&#8221;).</p><p>We repeat this process until we are able to find a solution or we exhaust all possible actions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k9Iu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k9Iu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 424w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 848w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 1272w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k9Iu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif" width="1296" height="771" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:771,&quot;width&quot;:1296,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1857260,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k9Iu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 424w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 848w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 1272w, https://substackcdn.com/image/fetch/$s_!k9Iu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb1ad2cb2-cc05-40dd-8625-72eaca410f55_1296x771.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Sudoku Data Objects</h1><p>There are two main data objects in the code that will enable the interactions with the Sudoku grid.</p><h2>Cell</h2><p>The first data object is the <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L137">Cell</a>. It contains information about the row and column of the cell (0-indexed), the number it contains as well as the options that are available.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hizj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hizj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 424w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 848w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 1272w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hizj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png" width="749" height="30" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:30,&quot;width&quot;:749,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7005,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hizj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 424w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 848w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 1272w, https://substackcdn.com/image/fetch/$s_!Hizj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb3b0d1-ef26-4722-b687-bc8c841a163c_749x30.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The available options are derived based on the constraints that for the number &#8220;1&#8221; to &#8220;9&#8221; there should no number occurring more than once in each row, column and block. This is found using the <code>isValidCellValue</code> <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L72">function</a>.</p><pre><code>def isValidCellValue(row: Int, col: Int)(trialValue: Int): Boolean = {

            (getRow(row).filter(c =&gt; c.col != col).forall(c =&gt; c.number != trialValue)) &amp;&amp;
                (getCol(col).filter(c =&gt; c.row != row).forall(c =&gt; c.number != trialValue)) &amp;&amp;
                (getBlock(row, col).filter(c =&gt; (c.row != row) &amp;&amp; (c.col != col)).forall(c =&gt; c.number != trialValue))

        }
</code></pre><h2>Grid</h2><p>The <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L25">grid data object </a>contains all the methods used to interact with the Sudoku grid, such as getting/setting cells or checking whether a number is valid for a cell.</p><h1>Solver</h1><p>The <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L267">Solver</a> implements the backtracking algorithm using a recursive function. The main logic is <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L303">here</a> and shown below as well. As described before, if there are cells with no options (i.e. an invalid state) the Sudoku grid will be <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L137">backtracked</a> to the previous state otherwise, a <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L154">new number is added</a> to one of the empty cells.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uevk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uevk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 424w, https://substackcdn.com/image/fetch/$s_!uevk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 848w, https://substackcdn.com/image/fetch/$s_!uevk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 1272w, https://substackcdn.com/image/fetch/$s_!uevk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uevk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png" width="603" height="309" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:309,&quot;width&quot;:603,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31997,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uevk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 424w, https://substackcdn.com/image/fetch/$s_!uevk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 848w, https://substackcdn.com/image/fetch/$s_!uevk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 1272w, https://substackcdn.com/image/fetch/$s_!uevk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32f42115-6b1b-4023-b07a-fdc7ef55665e_603x309.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A note on the backtracking logic. The first state in <code>GridTrajectory</code> is the starting state of the Sudoku puzzle and the corresponding cell action is <code>None</code>. If so, an empty <code>GridTrajectory</code> is returned. Importantly, if there is a state to backtrack to, then not only is the number of the cell that is previously changed reverted back to zero, the number that was set is also removed from the options. For example, if setting a cell to the number &#8220;3&#8221; leads to an invalid state, then backtracking means that that cell is set back to &#8220;0&#8221; and importantly &#8220;3&#8221; is removed from the options available to the cell.</p><p>This is ensure that the option that leads to an invalid state will not be selected later.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!urAH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!urAH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 424w, https://substackcdn.com/image/fetch/$s_!urAH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 848w, https://substackcdn.com/image/fetch/$s_!urAH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 1272w, https://substackcdn.com/image/fetch/$s_!urAH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!urAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png" width="746" height="327" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:327,&quot;width&quot;:746,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53375,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!urAH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 424w, https://substackcdn.com/image/fetch/$s_!urAH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 848w, https://substackcdn.com/image/fetch/$s_!urAH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 1272w, https://substackcdn.com/image/fetch/$s_!urAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9b36b26-d12d-4990-82d8-6cf872853bbd_746x327.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Lastly, the solver has 3 <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L261">solver modes</a>. </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tNEv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tNEv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 424w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 848w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 1272w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tNEv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png" width="218" height="103" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:103,&quot;width&quot;:218,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5286,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tNEv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 424w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 848w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 1272w, https://substackcdn.com/image/fetch/$s_!tNEv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e6e66f8-ec7f-4ccd-950c-d3a3087cea91_218x103.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><code>SolveOne</code> is the mode that seeks to find the first solution available. <code>SolveAtLeastOne </code>is the mode that will continue to find more solution until more than one solution is found or the maximum iterations allowed is reached. This useful as can be seen later in the Generator section for testing whether there is a unique solution to a particular Sudoku puzzle. <code>SolveAll</code> as the name suggests will keep looking for as many solutions as maxiumum iterations will allow or when all possible moves are exhausted. The return logic for each of these states are shown <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L282">here</a>.</p><h1>Generator</h1><h2>Generating a full grid</h2><p>To create a Sudoku puzzle, we first start with a full grid where all the numbers are filled. This is generated using the <code>createFullGrid</code> function <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L174">here</a>. It fills the cells one by one in sequential order (instead of randomly selecting a cell each time). The backtracking algorithm is also used here to allow the generation to continue should an invalid state is reached.</p><p>The cells are filled in sequentially in row-major order because it allows the cell to be filled to take advantage of the information from numbers in the same row/column/block. If a random cell is selected each time, then the selected cell might be one where there are no other filled cells in the same row/column/block and no information to inform the choice of number to fill.</p><h2>Generating a Sudoku puzzle</h2><p>An important thing about a Sudoku puzzle is that it should have a unique solution. However this is not guaranteed if you simply remove a bunch cells from a full grid at random.</p><p>To ensure that the puzzle has a unique solution, the <code>createPuzzle</code> <a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L212">function</a> removes cells one-by-one. With each removal, the solver is used to solve the puzzle in <code>SolveAtLeastOne</code> mode. This ensures that the puzzle still has a unique solution with the removal of that cell (at least up to the maximum iterations limit). You keep repeating this remove-and-solve step until you reach the minimum number of clues desired (see the <code>minClues</code><a href="https://gitlab.com/bzai-public/sudoku-in-scala/-/blob/master/src/main/scala/Main.scala?ref_type=heads#L212"> parameter</a>) or if there is no other way to remove more cells.</p><p>Note that it is not guaranteed that the final puzzle generated will only have <code>minClues</code> clues. Also it is not always that the generation will succeed there is a certain level of randomness involved as the way the cells are removed may not result in a uniquely solvable puzzle. For example, although I had set <code>minClues</code> to 17, the final puzzle that resulted was one with 25 clues.</p><h1>Results</h1><p>The results of a sample generation of the Sudoku puzzle and its solution is shown below.</p><pre><code>Solving puzzle with 25 clues...
Correct answer found.
Puzzle
============
000 | 305 | 700
000 | 000 | 058
080 | 060 | 003
---------------
060 | 053 | 000
500 | 410 | 000
004 | 800 | 000
---------------
100 | 000 | 279
006 | 000 | 100
800 | 009 | 006
---------------
Solution
============
491 | 385 | 762
637 | 921 | 458
285 | 764 | 913
---------------
769 | 253 | 841
528 | 416 | 397
314 | 897 | 625
---------------
153 | 648 | 279
976 | 532 | 184
842 | 179 | 536
---------------</code></pre><h1>Conclusion</h1><p>It was fun learning about and implementing the backtracking algorithm as well as satisfying to finally peek behind the scenes of the game I&#8217;ve played so often. I also enjoyed being able to practice Functional Programming.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Experiment: Enabling WebLLM on Ubuntu 22.04]]></title><description><![CDATA[An interesting way of harnessing the power of LLMs]]></description><link>https://blog.thoughtsre.com/p/experiment-enabling-webllm-on-ubuntu</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/experiment-enabling-webllm-on-ubuntu</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Sun, 04 Aug 2024 14:11:23 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/86689c06-791b-46a2-8340-cd71d089d9e9_1000x1000.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Large Language Models (LLMs) promise to entirely change the way we interact with technology. It opens up a new interface between man and machine that is natural and intuitive. There are of course a lot of rough edges that still need to be sanded off for this technology to work seamlessly in our daily lives, much like any revolutionary technology in the past.</p><p>One of those &#8220;rough edges&#8221; is that LLMs being <em>large</em>, as its name suggests, are mostly being run by big tech companies such as OpenAI and Google on their massive server farms and exposed to the public via APIs. As a result, you would have to continuously send your queries/data to these companies in order to interact with their LLMs. This means that 1) there is a potential loss of data privacy and 2) a lot of network costs are incurred.</p><p>To be fair, the LLMs (or rather multimodal foundation models) are large for a reason and the big tech models often push the frontiers of what is possible. But what if you had a smaller use case? What if you wanted to incorporate it into a web portal? What if you were desperately concerned about your data privacy?</p><p>Well, two things work in your favour. Firstly, LLMs are getting increasing capable with new training techniques as well as over-training. As a result, a 2 billion parameter model like <a href="https://huggingface.co/google/gemma-2-2b">Gemma2 2-b</a> is pretty capable already. Secondly, there is a  framework called <a href="https://webllm.mlc.ai/">WebLLM</a> that uses <a href="https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API">WebGPU</a> to tap on your local GPU&#8217;s power to run LLMs <em><strong>locally</strong></em>!</p><p>In this post, I&#8217;ll be trying to enable it for myself on my desktop. Note that you do need a machine with a GPU. Tried it on my 4-5 year old laptop and didn&#8217;t work. I believe it should work on Macs with ARM-based chips, but I can&#8217;t be sure. If you&#8217;re still interested, read on&#8230;</p><h2>Pre-requisites</h2><p>You will need <a href="https://support.google.com/chrome/answer/95346?hl=en&amp;co=GENIE.Platform%3DDesktop&amp;oco=0">Chrome</a> for this if you are on Ubuntu. Firefox is not supported yet. See list of supported browsers <a href="https://caniuse.com/webgpu">here</a>.</p><h2>Step 1: Enable webgpu and vulkan support</h2><ol><li><p>Go to <code>chrome://flags</code></p></li><li><p>Search for <code>WebGPU</code> and enable both features </p></li><li><p>Search for <code>Vulkan</code> and just enable everything</p></li></ol><p>Took me a bit of searching to figure these steps out. Somehow people just assume that WebGPU should work out of the box, which didn&#8217;t for me. Kept running into errors of the browser not being able to acquire access to the GPU even though my machine is WebGPU-capable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wpts!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wpts!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 424w, https://substackcdn.com/image/fetch/$s_!wpts!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 848w, https://substackcdn.com/image/fetch/$s_!wpts!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 1272w, https://substackcdn.com/image/fetch/$s_!wpts!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wpts!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png" width="812" height="590" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:812,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58226,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wpts!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 424w, https://substackcdn.com/image/fetch/$s_!wpts!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 848w, https://substackcdn.com/image/fetch/$s_!wpts!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 1272w, https://substackcdn.com/image/fetch/$s_!wpts!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4528f484-1432-4cfe-87df-94f4d354db4c_812x590.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l-z_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l-z_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 424w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 848w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 1272w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l-z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png" width="790" height="609" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:609,&quot;width&quot;:790,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:49227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l-z_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 424w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 848w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 1272w, https://substackcdn.com/image/fetch/$s_!l-z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe57dca2d-cf27-4b4f-831e-21a8bf92bdd9_790x609.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Step 2: Clone WebLLM demos</h2><p>Clone WebLLM git repo so that we can quickly test using the examples.</p><pre><code><code>git clone https://github.com/mlc-ai/web-llm.git</code></code></pre><h2>Step 3: Run a simple chat example</h2><p>Here we run the simplest chat example in <code>web-llm/examples/simple-chat-js</code> using Python.</p><pre><code>cd &lt;path_to_webllm_repo&gt;/examples/simple-chat-js &amp;&amp; python3 -m http.server 8080</code></pre><p>Navigate to <code>localhost:8080</code> in your Chrome browser.</p><h2>Step 4: Start playing!</h2><p>You will see a screen that looks like the one below. The first thing you do is you choose a model you want to load (there are many) and click &#8220;Download&#8221;.</p><p>This will download the model and load it onto your GPU. The first run takes a while to get started. Subsequent runs will use the cached model parameters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I6hj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I6hj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 424w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 848w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 1272w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I6hj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png" width="1060" height="743" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:743,&quot;width&quot;:1060,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:177608,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I6hj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 424w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 848w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 1272w, https://substackcdn.com/image/fetch/$s_!I6hj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4add368a-c713-46ce-bcdf-50eb9dfa73ab_1060x743.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As you can see, the model is indeed loaded into my GPU.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XU-m!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XU-m!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 424w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 848w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 1272w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XU-m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png" width="1351" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1351,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:656586,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XU-m!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 424w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 848w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 1272w, https://substackcdn.com/image/fetch/$s_!XU-m!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0473c7b2-eee2-4a3b-a736-464c8cfd3640_1351x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Final Thoughts</h2><p>I think it is a step in the right direction to enable running LLMs on browser for privacy preservation purposes. It also makes web application development much more light-weight as we don&#8217;t have to keep querying OpenAI/Google&#8217;s APIs incessantly. Instead we only use the big-tech models when we need some heavy multimodal firepower.</p><p>It is also nice that the WebLLM library can be CDN delivered. You just need to do <code>import * as webllm from "https://esm.run/@mlc-ai/web-llm";</code>, as mentioned on their <a href="https://github.com/mlc-ai/web-llm?tab=readme-ov-file#cdn-delivery">GitHub README</a>.</p><p>At the point of writing, the WebLLM library supports mainly chat completions, which might be useful for chat or summarisation use-cases. I would have love that embeddings are supported so that we can also think about RAG-based use-cases, which was why I was looking at this in the first place. I was thinking that it&#8217;d be great if I could have a simple backend with my document embeddings that can then be matched up with the query embeddings directly coming from the frontend. Apparently, <a href="https://github.com/mlc-ai/web-llm/issues/438">this feature is in the works</a>. So fingers crossed!</p><p>WebLLM is built on the very capable <a href="https://llm.mlc.ai/">MLC-LLM</a> framework which is a cross-platform LLM inference engine. I&#8217;m hoping to dig a little deeper into MLC-LLM.</p><p>Lastly, WebLLM still assumes that you have a GPU on your machine somehow. While this might become increasingly true in the near future (even our mobile phones have GPUs now), it is not completely true. Just like how my laptop would not have been able to utilise WebLLM. This means that developers will still have to find a way to cope with users without a GPU (maybe falling back to using OpenAI) or there might come a day where the AI paradigm shifts to enable really lightweight and capable models that can run fast enough of CPUs. </p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[MathWalkthrough: Bellman's Principle of Optimality]]></title><description><![CDATA[If you know how to make good decisions, you need only to consider your next step.]]></description><link>https://blog.thoughtsre.com/p/mathwalkthrough-bellmans-principle</link><guid isPermaLink="false">https://blog.thoughtsre.com/p/mathwalkthrough-bellmans-principle</guid><dc:creator><![CDATA[Binghao Ng]]></dc:creator><pubDate>Tue, 30 Jul 2024 13:27:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Introduction</h1><p>The Bellman&#8217;s Principle of Optimality is an important idea in the study of sequential decision making processes. It has found its way into many areas from software algorithms to inventory management in the form of <a href="https://en.wikipedia.org/wiki/Dynamic_programming">dynamic programming</a> and <a href="https://en.wikipedia.org/wiki/Reinforcement_learning">reinforcement learning</a>.</p><p>In this post, I would like to describe the general idea behind the Bellman&#8217;s Principle of Optimality. I won&#8217;t go deeply into specific application of the principle but really try to take a look at its mathematical form and see what it means. I believe it will be easier to apply the principle once we can appreciate the general idea.</p><h1>Problem Setup</h1><p>Consider a decision problem over T time steps. At any given time t, our environment is described by a state x<sub>t</sub>. This &#8220;state&#8221; is simply the collection of &#8220;things&#8221; that tells us about the environment we are in. It could be temperature in the case of an oven. Or it could be the position, velocity and acceleration in the case of a car.</p><p>At each time step t, we have to make a decision d<sub>t</sub> given the state x<sub>t</sub> that would lead to an action a<sub>t</sub>.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;a_t = d_t(x_t)&quot;,&quot;id&quot;:&quot;MCMCIXLJMJ&quot;}" data-component-name="LatexBlockToDOM"></div><p>We can think of d<sub>t</sub> like a decision rule that tells us what to do when given situation described by x<sub>t</sub>. For example, when the traffic light turns red, hit the brakes.</p><p>And as a consequence of a<sub>t</sub>, our environment changes from x<sub>t</sub> to x<sub>t+1</sub>. Mathematically we can describe such a change using a transition function, &#934;, which describes how the state changes given an action.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;x_{t+1} = \\Phi(x_t, a_t)&quot;,&quot;id&quot;:&quot;PWTEIASJHI&quot;}" data-component-name="LatexBlockToDOM"></div><p>With our action, there  also comes a cost, c<sub>t</sub>(x<sub>t</sub>, a<sub>t</sub>).</p><p>As an example, when we step on the brakes while driving (action), the car slows down (state change) and we use more fuel during our journey (cost).</p><p>The overall process is shown below.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UGUi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UGUi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 424w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 848w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 1272w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UGUi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png" width="1456" height="329" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/acd653fa-2810-4951-b830-9424b6f234f6_2102x475.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:329,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96324,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UGUi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 424w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 848w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 1272w, https://substackcdn.com/image/fetch/$s_!UGUi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd653fa-2810-4951-b830-9424b6f234f6_2102x475.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Thus we can express the total cost of all our actions over T time steps as a sum of all the costs incurred at each time step, C, as follows,</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZpHO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZpHO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 424w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 848w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 1272w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZpHO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png" width="410" height="122.49313186813187" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:435,&quot;width&quot;:1456,&quot;resizeWidth&quot;:410,&quot;bytes&quot;:114499,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZpHO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 424w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 848w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 1272w, https://substackcdn.com/image/fetch/$s_!ZpHO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e742dcb-2759-43cc-bca8-45521d96c147_1829x546.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Taking a closer look at the total cost expression, we can write it in two ways. First, in the forward time direction, where it is the sum of the cost at time 0 plus the total cost from t=1 to t=T. This is how we experience things, starting at the start time and progressing forwards. </p><p>The second way is backwards in time, where we have the total cost up to the second last time step plus the cost at the last time. This is more like the planning way of thinking where we start with the end in mind and work our way back.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u3Xk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u3Xk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 424w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 848w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u3Xk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png" width="1456" height="376" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f91eceea-48de-4140-b990-913b76bbdac1_1965x508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:376,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:86532,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u3Xk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 424w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 848w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Xk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff91eceea-48de-4140-b990-913b76bbdac1_1965x508.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In both cases, notice the term where the next state is dependent on the previous state. This introduces the dependency of the costs incurred from the actions taken over the course of the time horizon.</p><p>For example, taking a closer look at the second term for the forward direction, the cost from t=1 to t=T is a result of x<sub>1</sub> which in turn is the result of the action taken at t=0, a<sub>0</sub> at state x<sub>0</sub>.</p><p>Lastly, to introduce one last piece of math machinery, the collection of all the decision rules is called the policy, &#960;.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\pi = \\{d_0, d_1, \\ldots, d_T\\}&quot;,&quot;id&quot;:&quot;WWXPBONDIY&quot;}" data-component-name="LatexBlockToDOM"></div><p>Some call it a <em>strategy. </em>It is a way of thinking that informs us what decisions to take at each point in time under any given situation.</p><p>Now we are ready to start looking at the Bellman&#8217;s Principle of Optimality.</p><h1>Bellman&#8217;s Principle of Optimality</h1><p>Let me first quote the Bellman&#8217;s Principle of Optimality so that we can come back to it later.</p><blockquote><p><em>Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.</em> [Bellman1957]</p></blockquote><p>Our overall objective is to minimise the total cost over our time horizon T and is expressed below</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lQKf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lQKf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 424w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 848w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 1272w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lQKf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png" width="506" height="189.05494505494505" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9936c637-2b91-4849-b518-39619264f679_1636x611.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:544,&quot;width&quot;:1456,&quot;resizeWidth&quot;:506,&quot;bytes&quot;:113012,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lQKf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 424w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 848w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 1272w, https://substackcdn.com/image/fetch/$s_!lQKf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9936c637-2b91-4849-b518-39619264f679_1636x611.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Now suppose we are able to arrive at an optimal policy, &#960;*, which gives us the best decisions rules that we can have such that the total cost is at its minimum. In other words, </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nK9P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nK9P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 424w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 848w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 1272w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nK9P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png" width="378" height="118.90384615384616" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:458,&quot;width&quot;:1456,&quot;resizeWidth&quot;:378,&quot;bytes&quot;:134745,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nK9P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 424w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 848w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 1272w, https://substackcdn.com/image/fetch/$s_!nK9P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28823de-6b9a-4e95-8034-0cc49e2a6191_1782x560.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zhh3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zhh3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 424w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 848w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 1272w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zhh3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png" width="464" height="98.47252747252747" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:309,&quot;width&quot;:1456,&quot;resizeWidth&quot;:464,&quot;bytes&quot;:130262,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zhh3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 424w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 848w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 1272w, https://substackcdn.com/image/fetch/$s_!zhh3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32b3a720-0b50-4e35-84bb-63428702e3ca_2169x460.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Expanding the total cost expression, we have</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N4WR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N4WR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 424w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 848w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 1272w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N4WR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png" width="606" height="115.28983516483517" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:277,&quot;width&quot;:1456,&quot;resizeWidth&quot;:606,&quot;bytes&quot;:120569,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N4WR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 424w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 848w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 1272w, https://substackcdn.com/image/fetch/$s_!N4WR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4dafedb-1fd0-449d-8025-560318bbbc9b_2290x436.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The above expression means that at t=0, we take an action that is optimal and leads us to state x<sub>1</sub> (first term) and follow the optimal policy thereafter (second term), we will arrive at a total cost that is the smallest it can be.</p><p>Now consider if we didn&#8217;t take the first action, a<sub>0</sub> according to the optimal policy, we would have</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C9Ps!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C9Ps!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 424w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 848w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 1272w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C9Ps!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png" width="580" height="112.73351648351648" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:283,&quot;width&quot;:1456,&quot;resizeWidth&quot;:580,&quot;bytes&quot;:124976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C9Ps!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 424w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 848w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 1272w, https://substackcdn.com/image/fetch/$s_!C9Ps!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020c10ad-5391-40e1-aa88-c0a7cc34d8b0_2266x441.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Of course, the total cost in this case would be greater or equal to the case where all actions were taken according to the optimal policy. But Bellman&#8217;s Principle of Optimality states that, by following the optimal policy after t=0, we would still arrive at the minimum cost <em>given the initial state and action</em>.</p><p>Two things to note here. Firstly, a policy is <em>not a sequence of states</em>. We do not specify the best path for the system to traverse through. Instead it is a sequence of decision rules (using our current nomenclature). That is why even if we didn&#8217;t make the optimal decision at t=0, our remaining policy decision rules will still be able to guide us to the best possible outcome <em>given that we made a &#8220;mistake&#8221; initially by not following the optimal policy</em>. Again, this does not mean that there is no cost to our mistake. The actual outcome would likely be worse than the optimal one just that it is the best that could have been achieved <em>given the circumstances</em>.</p><p>Secondly, the Bellman&#8217;s Principle of Optimality highlights the importance of knowing how to make good decisions and if we know how to make good decisions, then we only need to worry about the next step. This is because make the first good decision would set us on the path to get to a good state where we can make the next good decision to get us to the next good state and so on and so on.</p><p>The keen-eyed reader would notice at this point that while the Principle of Optimality is all well and good, it is not clear as to <em>how</em> one would arrive at an optimal policy. For that, I can only refer the reader to full length treatments by Sutton [Sutton2018] and Bertsekas [Bertsekas2020].</p><h1>Problem Variations and Additional Considerations</h1><p>In this article, I have used a simplified context to present the Principle of Optimality. There are variations and additional considerations that makes the Principle widely applicable in many situations. Here are some of them.</p><h2>Rewards instead of Costs</h2><p>In some cases, rewards are considered instead of costs. This is a simple modification to our treatment of the Principle here. Rewards are simply negative costs or vice versa. So in dealing with rewards we maximise instead of minimise. In books where rewards are used, you would see the total cost function being called the <em>value function</em>. See [Sutton2018].</p><h2>Continuous Actions</h2><p>If instead of discrete actions like start or stop, you have actions that can take continuous values such as how much to steer, then you would also approximate the decision rules as functions. In fact, you can approximate the entire policy as a policy function. See [Sutton2018].</p><h2>Continuous Time</h2><p>We have dealt with discrete time steps in this article. The time steps could be 1 second each, or 1 hour, 1 day, 1 month, doesn&#8217;t matter. But the shorter the time interval represented by each step the closer we come to considering things in continuous time where we would have to use the mathematical apparatus of <a href="https://en.wikipedia.org/wiki/Stochastic_differential_equation">stochastic differential equations</a> (SDEs). Interested readers might consider look at [Hassler2015] and [Oksendal2014].</p><h2>Stochasticity</h2><p>In this article, I have assumed that the transition from one state to another is deterministic. In other words, if a particular action is taken at a particular state, it will always lead to a known next state. This is not true in general. There is unpredictability present and the next state will be one of many possible states each with its own probability of occurrence. </p><p>To that end, the objective function stated above will have to be augmented with an expectatoin over all possible states. In other words, we will have to weight the costs of each possible next state according to their likelihood of occurrence.</p><p>This means that the optimal policy would give you the best outcome <em>on average</em>. But not that on any particular &#8220;run&#8221; or &#8220;episode&#8221; you are not guaranteed to always achieve the lowest cost or highest rewards.</p><p>I won&#8217;t dwell too much on this. Interested readers can refer to [Sutton2018] and [Bertsekas2020].</p><p>Books by R. Howard [Howard2007a, Howard2007b] also offer very good insights on how to treat non-deterministic state transitions in Markovian systems.</p><h1>Conclusion</h1><p>Bellman&#8217;s Principle of Optimality is an interesting concept that I picked up when I was into reinforcement learning and dynamical systems. I have found it applicable not just in the technical arena but also in life when thinking about decision making.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.thoughtsre.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading thoughtsre! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h1>References</h1><p>[Sutton2018] R.Sutton, A. Barto, 2018, <em>Reinforcement Learning: An Introduction</em></p><p>[Bertsekas2020] D. Bersekas, 2020, <em>Dynamic Programming and Optimal Control: Volume 1</em>, 4th edition</p><p>[Bellman1957] R. Bellman, 1957, <em>Dynamic Programming</em></p><p>[Hassler2015] U. Hassler 2015, <em>Stochastic Processes and Calculus: An elementary introduction with applications</em></p><p>[Oksendal2014] B. Oksendal, 2014, <em>Stochastic Differential Equations</em></p><p>[Howard2007a] R. Howard, 2007, <em>Dynamic Probabilistic Systems: Markov Models (Volume 1)</em></p><p>[Howard2007b] R. Howard, 2007, <em>Dynamic Probabilistic Systems: Semi-Markov and Decision Processes (Volume 2)</em></p><p></p><p></p>]]></content:encoded></item></channel></rss>