Talking with friends about startup ideas
In our community Tridev, we often find ourselves wishing we had certain software, or wistfully thinking about products we could build. I wanted something that could tweet for me! Specifically, a bot that could tweet about the ongoing work I was doing in code. Why? Because everyone talks about building a brand and how important that is. Complaining with other developers in my area helped me realize just how much of a pain building a brand actually is. I don’t want to do it anymore. Now, there are some dystopian concerns that come with letting AI do everything, but let’s let the next generation figure that out 😏.
DIY time.
Today, in the here and now, I no longer want to maintain my Twitter account. I’m a generative AI researcher, and Twitter has devolved into a bunch of LLMs talking to each other anyway. Why not throw my own LLM into the mix? I know I want project updates to be tweeted about, and I’m already writing those updates via git commits. These commits accurately describe what I’ve been up to and what I’m doing on a daily basis. LLMs are really good at translating text into other text, so why not give the LLM my commit and the code changes I’ve made, and see what it comes up with? And while we’re at it, let’s just post it to Twitter (and eventually other social platforms)!
Getting Started
A project is nothing in the beginning without a plan, so I set out to look at the major milestones that I’d need to accomplish my goal. I find it useful to think backwards from the target.
The goal: To automatically tweet engaging tweets about ongoing updates to my projects.
Break down your goal into all the parts that you can think of, and try to further break those parts down into their atomic elements.
Here are the different parts I can think of:
- Automatically tweet
- Engaging tweets
- Tweet about ongoing updates
Automated Tweeting
Something somewhere will have to tell my app to tweet. What could that something be? What I landed on was Git hooks, and having those hooks POST messages back to some server. Git hooks allow you to run scripts after key events within a git repository. Since my idea had to do with tweeting project updates, to me the most natural first area of integration is the post-commit hook. See more here: https://git-scm.com/book/ms/v2/Customizing-Git-Git-Hooks.
The script itself is super simple and has minimal dependencies. It’s what I envisioned as the entrypoint to my system.
#!/bin/sh
git format-patch -n HEAD^ --stdout | curl -X POST \
http://localhost:8000/commit-msg -H 'Content-Type: text/plain' -d @-
After a commit, the script will generate a patch file on stdout, then POST that patch file to my running service endpoint on localhost.
It does require that git and curl are both installed, but this being a developer tool, I know most devs will have them. The script itself, which is how devs will interact with the system, allows for further exploration of how the system will operate at its boundaries. In my case, an HTTP endpoint.
Engaging Tweets
The tweets should also be engaging, and this is where LLMs come in. My hunch in this project was that I’m already doing the writing in one place. I should use LLMs as a force multiplier, and have them rewrite my Git patches as engaging updates for multiple social platforms. LinkedIn, Twitter, Facebook, and really anything else with an API would be a pair (p, a), where p is the prompt for that platform and a is the authentication scheme.
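To make that concrete, here’s a hypothetical sketch of that per-platform pair as Clojure data (none of these names are from the real codebase):

(def platforms
  ;; Hypothetical shape: each platform maps to its prompt p and auth scheme a.
  {:twitter  {:prompt "Rewrite this git patch as a punchy tweet under 280 characters."
              :auth   :oauth2}
   :linkedin {:prompt "Rewrite this git patch as a professional LinkedIn update."
              :auth   :oauth2}})

(defn broadcast
  "Fan one patch out to every configured platform. post-fn is supplied by the caller."
  [patch post-fn]
  (doseq [[platform {:keys [prompt auth]}] platforms]
    (post-fn platform prompt auth patch)))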
So, I know that I need LLMs integrated somehow, sitting behind the endpoint referenced in my post-commit script.
Tweeting
Tweeting is the final part. I have the automation in place, and the LLMs. What good is rewriting my patches as tweets if I can never get the information to Twitter? Not good at all. Luckily, I can borrow from work that I’ve done in other areas. I have a project, https://github.com/jaketothepast/tweet-repl, where I already mostly implemented auth. So, I can just rip my Twitter machinery out and put it into this new project, LLM Social Bot!
Clojure, Ring, and Building Things Fast
For this project, I used Clojure + Ring to implement the service layer. I intended to have the service always running in the background, as part of systemd or launchctl on Mac, but for now I stick to just running the service in a terminal window with clj -A:run-m.
Interface with LLMs
LLM handler defined through config with Integrant
In this project, I had the idea that the developer would only need to specify configuration to get things working. That’s all we do now as devs anyway, right? Just write some YAML and you’re good. YAML and chill. But Clojure has what I think of as configuration on steroids: EDN. EDN stands for Extensible Data Notation, and if you’re coming from outside of Clojure, think of it as JSON with some buffs.
{:llm/handler
{:type :local
:url "https://huggingface.co/quantfactory/meta-llama-3-8b-instruct-gguf/resolve/main/meta-llama-3-8b-instruct.q2_k.gguf"
:local-dir "/users/jacobwindle/.models"
:n-ctx 8192}
:adapter/jetty {:port 8000 :join? false}}
There are some weird things in here, things that are very Clojurey, like our Clojure keywords (these guys: :llm/handler) and a lack of colons in the places we’d expect. EDN is basically just a JSON object, except keys and values are space-delimited. Cleaner syntax, cleaner values. I also find it very amazing that EDN is just Clojure data structures. We have maps, we have vectors, and that’s all we need. I love that, and I know you will too.
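And because it’s just data, reading it takes nothing but the standard library. A quick REPL demonstration:

(require '[clojure.edn :as edn])

;; EDN parses directly into Clojure maps, keywords, and vectors.
(edn/read-string "{:llm/handler {:type :local :n-ctx 8192}}")
;; => {:llm/handler {:type :local, :n-ctx 8192}}

;; Nested access works like any other Clojure map.
(get-in (edn/read-string "{:llm/handler {:type :local}}") [:llm/handler :type])
;; => :local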
So to use configuration to set up our system, I looked to a library called integrant (https://github.com/weavejester/integrant). Integrant’s README has this in the first line:
Integrant is a Clojure (and ClojureScript) micro-framework for building applications with data-driven architecture. It can be thought of as an alternative to Component or Mount, and was inspired by Arachne and through work on Duct.
I think of it like this: if you have elements of your system that are stood up based on state like the config file, and that need to be stopped/restarted as part of the application/development flow, Integrant is what you need. You define these stateful components of your app in a “system”, and Integrant will handle the startup, shutdown, and other actions that your system will need to take. I wanted to use Integrant to set up my LLM handler that will be in use during the application’s runtime, so I have the :llm/handler key that defines config to use on start.
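For the :adapter/jetty key from my config, an init/halt pair along these lines would do the trick (a sketch, not necessarily exactly what I’m running):

(require '[integrant.core :as ig]
         '[ring.adapter.jetty :as jetty])

;; Sketch: start Jetty from the :adapter/jetty config on init...
(defmethod ig/init-key :adapter/jetty [_ {:keys [port join?]}]
  (jetty/run-jetty #'app {:port port :join? join?})) ;; #'app is the Ring handler shown later

;; ...and stop the returned server instance on halt.
(defmethod ig/halt-key! :adapter/jetty [_ server]
  (.stop server))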
So the code to spin up my LLM handler is this:
(defmethod ig/init-key :llm/handler [_ {:keys [type key url local-dir] :as llm}]
(cond
(= type :openai) (openai/->ChatGPT key)
(= type :anthropic) (anthropic/->Claude key)
(= type :local) (local/config->Local url local-dir)))
And given the keys in my config.edn file, Integrant will spin up the correct LLM handler for me and hold it in my system map at :llm/handler. Whatever the ig/init-key multimethods return is what gets stored in the final system map. Above this code is some more code that actually handles binding the system map using other Integrant functions.
(def config
(ig/read-string (slurp "config.edn")))
(def system (atom nil))
;; ... some omitted code here
(defn -main
[& args]
(reset! system (ig/init config)))
So to start up my system, which will eventually call my :llm/handler init-key multimethod, I’m blowing away my system atom and replacing it with the result of (ig/init config). At startup, this reads the config file (shown above) and spins up my LLM handler based on the type.
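A nice side effect of holding the system in an atom is REPL-friendly restarts; something like this works during development:

;; Tear down every running component, then rebuild from the loaded config.
(ig/halt! @system)
(reset! system (ig/init config))

;; Or re-read config.edn first, to pick up edits:
(reset! system (ig/init (ig/read-string (slurp "config.edn"))))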
Building LLM Handlers
So the end result of my system is to invoke the LLM handler, spun up at init time, inside a Ring handler function.
(defn commit-message-handler
"Receive commit message as input, transform into tweet and post to social media"
[request]
(let [body-str (body-string request)
tweet-text ((:llm/handler @system) body-str)
tweet-response (twitter/tweet {:text tweet-text})]
(resp/response {:status tweet-text})))
;; ... omitted code ...
(def app
(ring/ring-handler
(ring/router
[["/commit-msg" {:post {:handler #'commit-message-handler}}]
["/post-commit" {:get {:handler #'post-commit-script}}]]
{:data {:middleware [json/wrap-json-response]}})))
The relevant bits are the tweet-text binding, where I call ((:llm/handler @system) body-str). As you can guess, this does the actual reformatting of the git patch into a tweet update that will eventually be posted. How is this happening, though? First, note that to get the handler I’m just doing a map access. The key itself is a function, and using it I can grab the value out of the system map. Second, I used a technique that I learned in the SciCloj mentoring program: representing your business logic as a single abstraction with protocols.
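If the keyword-as-function trick is new to you, it’s this:

;; Keywords are functions of maps, so these two lookups are equivalent:
(:llm/handler {:llm/handler "the handler"})      ;; => "the handler"
(get {:llm/handler "the handler"} :llm/handler)  ;; => "the handler"

;; And since the stored value is itself invokable, lookup-then-call
;; composes into the single form ((:llm/handler @system) body-str).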
A quick detour into FastMath
As part of SciCloj, I was called upon to implement print multimethods for record types defined in the regression.clj module used with machine learning.
(defrecord LMData [model intercept? offset? transformer
^RealMatrix xtxinv
^double intercept beta coefficients
offset weights residuals fitted df ^long observations
names
^double r-squared ^double adjusted-r-squared ^double sigma2 ^double sigma
^double tss ^double rss ^double regss ^double msreg
^double qt
^double f-statistic ^double p-value
ll analysis]
IFn
(invoke [m xs] (prot/predict m xs false))
(invoke [m xs stderr?] (prot/predict m xs stderr?))
Find this source code here: https://github.com/generateme/fastmath/blob/3.x/src/fastmath/ml/regression.clj#L41
This was my first encounter with records and protocols. Generateme uses them to implement the different types of linear regression models, and he implements the IFn interface to make the model invokable. This means that if we instantiate it, it is callable:
(def my-model (lm [1 2 3] [2 4 6])) ;; This function returns the record above
(my-model [1 2 3]) ;; it's now invokable as a function
And what are records/protocols anyway? Let’s see what the Clojure docs have to say.
This is why Clojure has always encouraged putting such information in maps, and that advice doesn’t change with datatypes. By using defrecord you get generically manipulable information, plus the added benefits of type-driven polymorphism, and the structural efficiencies of fields. OTOH, it makes no sense for a datatype that defines a collection like vector to have a default implementation of map, thus deftype is suitable for defining such programming constructs.
source: https://clojure.org/reference/datatypes#_why_have_both_deftype_and_defrecord
So I turned my model implementations into records, similar to what generateme did, but this time with LLMs. I wanted invoke to instead do inference and return that result back to the client.
The LLM Protocol and Record Design
In the design that I came up with, every model would implement the IFn interface in order to be invokable (and easily inserted into the first position of a function call), and would implement what I call the PromptProto in order to make the prompts used when invoking. I started with ChatGPT, and that looked like this:
(defrecord ChatGPT [key]
  clojure.lang.IFn
  (invoke [this commit-msg]
    ;; model and get-response are defined elsewhere in this namespace
    (get-response (api/create-chat-completion
                   {:model model
                    :messages (protocols/make-prompt this commit-msg)}
                   {:api-key key})))
  protocols/PromptProto
  (make-prompt [_ message]
    [{:role "system" :content prompts/system-prompt}
     {:role "user" :content message}]))
When implementing protocols for a record, you are creating a Java class that adheres to that interface. IFn is a special interface from Clojure that allows our record to be callable as a function. PromptProto is my own custom protocol that involves making prompts and structuring the message you want to send to the LLM.
(ns jaketothepast.llms.protocols)
(defprotocol PromptProto
(make-prompt [obj message]))
This gives me something for all my LLM record types to implement. make-prompt is just the interface for building that message structure for LLM completions.
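As a quick sanity check (with a made-up key), the record satisfies the protocol and builds the message structure:

(def gpt (->ChatGPT "sk-fake-key")) ;; made-up key, just for the REPL

(satisfies? protocols/PromptProto gpt)
;; => true

(protocols/make-prompt gpt "Refactored the auth module")
;; => [{:role "system" :content "...system prompt..."}
;;     {:role "user" :content "Refactored the auth module"}]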
The local LLM handler
So I have my LLM record type, and as you can see above for ChatGPT, it’s instantiated with an API key and that’s it. Remember my Integrant config and :llm/handler init-key methods from before? They construct a record object with that key and hold it in the map. That’s easy; since the service is exposed as an API, that’s all we need. What to do for a local LLM, though?
I knew a local LLM was a must-have option, as a lot of developers are going to be privacy-focused when it comes to their LLM usage. The app itself is meant to take in git patches and spit out tweets. Lots of companies aren’t going to be okay with sending their code changes off to OpenAI, Anthropic, and the like. So to get around this, we need some offline options.
There are hints in my code sample from above. Let’s put my config.edn down here again to remind ourselves what’s in it.
{:llm/handler
{:type :local
:url "https://huggingface.co/quantfactory/meta-llama-3-8b-instruct-gguf/resolve/main/meta-llama-3-8b-instruct.q2_k.gguf"
:local-dir "/users/jacobwindle/.models"
:n-ctx 8192}
:adapter/jetty {:port 8000 :join? false}}
So, this is a local-type LLM according to my init-key method. I landed on a few critical keys to support this type. First, a note: https://github.com/phronmophobic/llama.clj is an excellent library, and it’s what I decided to use to power the local mode. This library uses llama.cpp under the hood, and allows me to run inference with any .gguf file from Clojure. So, I settled on this to power my local mode and built around it.
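In miniature, llama.clj usage looks like this (the same create-context and generate-string calls my handler code below builds on; the path is illustrative):

(require '[com.phronmophobic.llama :as llama])

;; Build a context from a local .gguf file, then run inference on a prompt.
(def ctx (llama/create-context "/path/to/model.gguf" {:n-ctx 8192}))
(llama/generate-string ctx "Summarize this git patch as a tweet: ...")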
Downloading and Caching the Local Model Weights
The model weights aren’t going to just exist on disk (or maybe they are, just configure it correctly), so we need a mechanism to download and cache whatever we want to use. I do this with the :url
and :local-dir
keys found in the :llm/handler
block.
Model weight files are typically very, very large, so we need to download them in the background in order to not freeze up our main thread and make our application wait. When I start the application and the Integrant configuration is read, if the type is :local then the model download/cache process begins.
(def local-llm-state (atom {:model-context nil
:model-location nil}))
;; ... omitted code ...
(defn- retrieve-model
[url local-dir]
(let [dir (io/file local-dir)]
(or (.exists dir) (.mkdirs dir)) ; Create the local directory, this didn't expand tilde
(let [model-location (download-model url local-dir)] ;; Gets a promise
(swap! local-llm-state assoc :model-location model-location))))
;; ... omitted code ...
(defn config->Local [url local-dir]
(retrieve-model url local-dir)
(->Local 8096))
So retrieve-model runs right after the application starts up; it tries to download the model and swaps the model location into the namespace state atom for later use. But wait, download-model would appear to be synchronous, no? Wouldn’t that be a problem, causing the whole application to seize up and no longer accept any requests? Wouldn’t it at least be a burden?
Hint: I said it right there in the comment…
I promise I’ll download that model soon!
Clojure has a promise API! And with these basic concurrency primitives I can toss the work of downloading the model off to a background thread and let it work like nothing even happened. Here’s download-model in its entirety:
(defn- download-model
"Either download the model, or return the path."
[url local-dir]
(let [filename (model-name url)
model-file (io/file local-dir filename)
file-write-promise (promise)]
(if (.exists model-file)
(deliver file-write-promise (.getCanonicalPath model-file)) ;; It exists, deliver the path
(future (do ;; It doesn't exist, download it on the background thread.
(when (not (.exists model-file))
(with-open [in (io/input-stream (io/as-url url))
out (io/output-stream model-file)]
(io/copy in out)))
(deliver file-write-promise (.getCanonicalPath model-file)))))
file-write-promise))
This code works heavily with the promise and deliver functions, allowing this work to run in the background. I start by binding a few variables, including a promise via file-write-promise, then we check to see if the model file already exists at that path. If it does, then we just deliver (or “resolve”) the promise right away with the full path to the model file. EZ-PZ. If it doesn’t already exist, then we do some additional work.
Anything within the future scope will happen on a background thread. So by just wrapping all those forms in a future call, I’m doing asynchronous work. Calling deliver within the future with my file-write-promise is where the magic happens. For any JS devs in the crowd, that’s where the promise “resolves”, and it resolves with whatever value I call the function with. Great, so what I am swapping into my state atom is a promise, but that’s not very useful at runtime.
Lazily evaluating the model context
Well, remember when this download code actually runs? It runs when the application starts up. So, if you have specified a gigantic GGUF file, then when the application starts (preferably at boot) and your model isn’t there, it will happily begin downloading the weights in the background while you wait.
Now say, though, that while it’s downloading those models you want to send your first tweets and get things rolling. That’s where the remainder of the code comes in.
;; ... omitted ...
(defn- model-context
  "When invoking the model, don't try to deref the promise until now. This gives it time to load in the background."
  [n-ctx]
  (let [model-location (:model-location @local-llm-state)
        location (cond-> model-location
                   (utils/promise? model-location) deref) ;; Deref the promise from retrieve-model
        context (:model-context @local-llm-state)]
    (when (nil? context)
      (swap! local-llm-state merge
             {:model-location location
              :model-context (llama/create-context location {:n-ctx n-ctx})})) ;; TODO: n-ctx should be a local configuration
    (:model-context @local-llm-state)))
;; ... omitted ...
(defrecord Local [n-ctx]
clojure.lang.IFn
(invoke [this commit-msg]
(llama/generate-string
(model-context n-ctx) ;; Here's the call
(protocols/make-prompt this commit-msg)))
protocols/PromptProto
(make-prompt [_ message]
(llama/chat-apply-template
(:model-context @local-llm-state)
[{:role "system"
:content prompts/system-prompt}
{:role "user"
:content message}])))
The model-context function is called when using the Local LLM, and it returns a llama-context object that we can use with llama.clj. The function itself will now try to deref the promise value of the model location! It will only deref this, though, if the value is actually a promise. Now, if the model weights are still downloading, we will be stuck here waiting for this location promise to deref. Once done, though, it will swap the deref’ed model location into the atom state and return the context object.
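Put together, using the local handler end to end might look like this at the REPL (illustrative values only):

;; Sketch: construct the handler, then invoke it like a function.
(def handler (config->Local
              "https://huggingface.co/.../model.gguf" ;; illustrative URL
              "/users/jacobwindle/.models"))

;; The first invocation may block here while the weights finish
;; downloading, because model-context derefs the download promise.
(handler "diff --git a/src/core.clj b/src/core.clj ...")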
Well, what’s left
Interact with Twitter
Formatting tweets is great, and definitely needed, but now those tweets need to make their way to Twitter. Luckily, someone I know authored another tool called Tweet REPL that I could lift a lot of code from! So like any good developer, I got to work stealing what they had done and appropriating it for my own goals.
(defonce twitter-env (:twitter (edn/read (java.io.PushbackReader. (io/reader "environment.edn")))))
I made the bizarre choice to include the Twitter keys in a local environment.edn file, rather than in my Integrant configuration. Honestly, I didn’t like it and felt a little guilty about it, but I couldn’t get the system to work correctly without circular require errors in the namespaces. So, I stuck a pin in that and just put my credentials in environment.edn.
Now, there are some objects that are shared among method calls in the Twitter Java SDK. I have to prop those up with the right values in order to begin authenticating and making Twitter API calls.
(def creds
  (let [{:keys [client-id client-secret]} twitter-env]
    (TwitterCredentialsOAuth2.
     client-id
     client-secret
     (System/getenv "TWITTER_ACCESS_TOKEN")
     (System/getenv "TWITTER_REFRESH_TOKEN")
     true))) ;; final arg: auto-refresh the token
(def service (TwitterOAuth20Service.
(.getTwitterOauth2ClientId creds)
(.getTwitterOAuth2ClientSecret creds)
"http://localhost:3000/login/success"
"offline.access tweet.read tweet.write users.read"))
The client ID and client secret are the most important in this scheme. At runtime, the System/getenv calls are actually nil, because we have nothing in those environment variables. How you persist your access and refresh tokens is up to you, but when you start, you don’t need them. The service class will spin up with our creds, our auth redirect URL, and the Twitter scopes that we want to utilize. You’ll see that the service class only makes use of our client ID and secret, which is again all we need to start up.
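One possible (entirely hypothetical) persistence approach: once the OAuth flow below hands back fresh tokens, write them somewhere the next boot can read.

(defn save-tokens!
  "Hypothetical helper: persist tokens so a later boot can export them."
  [access-token refresh-token]
  (spit "tokens.edn" (pr-str {:access-token  access-token
                              :refresh-token refresh-token})))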
OAuth2 Redirect Flow
So I was perusing the internet one day, looking for a solution to this problem: I need the OAuth code that I get when the Twitter auth page redirects after successful authorization. When a user declares that they do want my app to be given access to their tweets, the app needs that code to complete the flow. Someone (as most internet denizens do) helpfully suggested in a StackOverflow post to spin up an embedded HTTP server, handle the redirect, then tear it back down after the code has been received. How should I do that in Clojure?
The only web server that I have any familiarity with is Jetty, so I’ll use that. How do I ensure that this web server is running in the background, though, and that we can join on it until it’s done to grab the resulting code? I stuck with core.async and channels for that.
In starting the auth flow, I’ll get an authorization URL from the Twitter service using some local PKCE object. Once we have that, since we know we want the server to do its thing in the background, we swap a core.async channel into our namespace state atom. Channels are one succinct way to implement passing messages back and forth between two asynchronous pieces of code in Clojure.
From the chan docs at https://clojuredocs.org/clojure.core.async/chan:
Creates a channel with an optional buffer, an optional transducer (like (map f), (filter p) etc or a composition thereof), and an optional exception-handler. If buf-or-n is a number, will create and use a fixed buffer of that size. If a transducer is supplied a buffer must be specified. ex-handler must be a fn of one argument - if an exception occurs during transformation it will be called with the Throwable as an argument, and any non-nil return value will be placed in the channel.
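The two operations that matter here are the blocking put (>!!) and take (<!!). In miniature:

(require '[clojure.core.async :refer [chan >!! <!!]])

(def c (chan))

;; <!! parks the calling thread until another thread puts a value on c;
;; in the real code below, the redirect handler plays the role of this future.
(future (>!! c "the-oauth-code"))
(<!! c) ;; => "the-oauth-code"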
So I’ll use the channel to receive the oauth code that the embedded Jetty server extracted on redirect. Once we have that oauth code, we can grab the access token and complete the oauth flow. That code looks like this:
(defn handler-twitter-code [req]
(let [server-chan (:chan @twitter-state)
{:strs [code]} (:params req)]
(>!! server-chan code)
{:status 200
:headers {"Content-Type" "text/html"}
:body "Hello world"}))
(defn start-embedded-auth-server [handler]
(let [app (wrap-params handler)]
(jetty/run-jetty app {:port 3000 :join? false})))
(defn get-oauth2-credentials []
(let [state "state"
pkce (doto (PKCE.)
(.setCodeChallenge "challenge")
(.setCodeChallengeMethod PKCECodeChallengeMethod/PLAIN)
(.setCodeVerifier "challenge"))
url (.getAuthorizationUrl service pkce state)]
(swap! twitter-state assoc :chan (chan)) ;; Add our channel to the twitter-state
(sh/sh "open" url)
(let [server (start-embedded-auth-server handler-twitter-code)
code (<!! (:chan @twitter-state))]
(.stop server)
(.getAccessToken service pkce code))))
(defn authorize-app []
(let [oauth2-access-token (get-oauth2-credentials)
access-token (.getAccessToken oauth2-access-token)
refresh-token (.getRefreshToken oauth2-access-token)]
(swap! twitter-state assoc :api
(TwitterApi. (doto creds
(.setTwitterOauth2AccessToken access-token)
(.setTwitterOauth2RefreshToken refresh-token))))))
Starting with authorize-app, we begin our OAuth flow, doing all the steps I’ve described. Once complete, we take our newly minted TwitterApi object and swap it into our namespace state. We need this object to perform any subsequent Twitter calls, like posting tweets.
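I won’t paste my twitter/tweet function here, but assuming the Twitter Java SDK’s v2 createTweet shape, a sketch of what it might look like:

(import '(com.twitter.clientlib.model TweetCreateRequest))

;; Sketch only: the exact SDK classes and signatures are my assumption here.
(defn tweet [{:keys [text]}]
  (let [api     (:api @twitter-state)
        request (doto (TweetCreateRequest.) (.text text))]
    (-> api .tweets (.createTweet request) .execute)))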
Putting it all Together
Now, to finish the job, I have to do two things. First, I need to install my post-commit git hook from earlier into one of my existing repositories. Next, I need to run my service in the background. Once those two things are complete, I just commit and tweet. That’s it!
clj -A:run-m
Thanks to everyone in #clojurians that makes this such a fun language to work with, and to all that answered my questions. I have some bigger projects planned and hope to get started soon!