Boolean satisfiability is a classic problem in computer science. Given a series of n boolean variables, A B C ... and a formula in 3-conjunctive normal form
CDCL is a complete and sound method, so the canonical solver line is also the number of solvable instances.
"Abstract Nonsense" is a somewhat loving, but somewhat derisive term for methods (typically
Category Theoretic methods) in pure mathematics that are unreasonably convoluted and involve a lot of theoretical machinery.
I myself am awful at Category theory but excellent at abstract nonsense, and I wanted a space to share my thoughts
and projects. I'm well aware that very few people will read this blog, but to me this space is a journal. A respite
from the giants that control the web, and a space to share my thoughts into the void, in a way I can control and moderate.
More concretely, I hope to maintain "Abstract Nonsense" as a dev log of sorts. Not because I think it showcases phenomenal
technical talent, but because it showcases some of the cool things I've been learning on the side.
I'll keep my first entry on this journal quite short. This entry stands well on its own.
Because it does something the category theorist in all of our hearts would love.
It's self-referential.
The content engine that runs Abstract Nonsense is quite brilliant, if I do say so myself.
It is a Python script that takes in a series of HTML files and agglomerates them into a single file.
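The engine's actual source is not shown here, so the following is only a minimal sketch of the idea, assuming each entry lives in its own `.html` file inside one directory. The function name `agglomerate`, the `<article>` wrapper, and the alphabetical ordering are all assumptions made for illustration.

```python
from pathlib import Path


def agglomerate(entries_dir: str, output_file: str) -> int:
    """Concatenate every .html file in entries_dir into one page.

    Returns the number of entries written. (Hypothetical helper,
    not the blog's real engine.)
    """
    out = Path(output_file)
    # Sort by filename for a stable order, and skip the output file
    # itself in case it lives in the same directory as the entries.
    entries = sorted(
        p for p in Path(entries_dir).glob("*.html")
        if p.resolve() != out.resolve()
    )
    parts = ["<html><body>"]
    for entry in entries:
        # Wrap each entry so the pieces stay separate on the page.
        parts.append(f'<article id="{entry.stem}">')
        parts.append(entry.read_text(encoding="utf-8"))
        parts.append("</article>")
    parts.append("</body></html>")
    out.write_text("\n".join(parts), encoding="utf-8")
    return len(entries)
```

Calling `agglomerate("entries", "index.html")` would then rebuild the combined page from whatever entry files are present.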
In addition to the Abstract Nonsense engine, I have two other Python scripts that form the backbone of this (static)
website. The first takes in the plaintext of a quote document I have been personally maintaining for the
past 3 years. It uses regular expressions to parse out the quotes and builds an HTML file containing JavaScript
that makes the page dynamic: the JavaScript program alters the HTML on the page to create a typing effect.
Check it out here!
The final piece of this beautiful infrastructure is a third script that runs both scripts, then commits the whole branch to master.
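The quote document's real layout is not described, so here is a rough sketch of the regular-expression step under one loud assumption: each quote sits on its own line in the form `"quote text" - attribution`. The pattern and the name `parse_quotes` are hypothetical.

```python
import re

# ASSUMED line format: "quote text" - attribution
# The greedy .+ lets a quote contain inner double quotes.
QUOTE_RE = re.compile(
    r'^"(?P<text>.+)"[ \t]*-[ \t]*(?P<author>.+?)[ \t]*$',
    re.MULTILINE,
)


def parse_quotes(plaintext: str) -> list[tuple[str, str]]:
    """Return (quote, author) pairs for every line matching the assumed format."""
    return [
        (m.group("text"), m.group("author"))
        for m in QUOTE_RE.finditer(plaintext)
    ]
```

Lines that do not match the assumed format are simply skipped. The resulting list could then be serialized (for instance with `json.dumps`) into the generated page for the typing-effect script to read.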
As I learned on Twitter/Reddit/The Quote Document:
"Everybody has a testing environment. Some people are lucky enough enough to have a totally separate environment to run production in." - @stahnma
Abstract Nonsense and this website as a whole are both test and prod. Maybe one day, I'll be a good enough engineer
to be able to invest in separate test and prod environments for my website.
If you have a fixed budget, or you want to tweak the numbers to see what would need to change to meet certain financial goals, try out the optimizer. The optimizer uses the bisection method to find an input that meets a given goal. For example, say you have $100,000 and you want to figure out how much you can spend on a house: the optimizer will help you budget.
Right now this product is in a tech-demo stage. In the short term, there are two features that we plan to build out.
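As an illustration of the bisection idea, here is a small sketch. The cost model (`upfront_cost`, with a 20% down payment and 3% closing costs) is invented purely for the example and is not the product's real model; the only thing bisection needs is that the cost grows as the input grows.

```python
def bisect(f, lo, hi, target, tol=1e-6, max_iter=200):
    """Find x in [lo, hi] with f(x) close to target, assuming f is increasing."""
    if not (f(lo) <= target <= f(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid  # mid is too cheap: the answer lies above it
        else:
            hi = mid  # mid meets or exceeds the target: the answer lies at or below it
        if hi - lo < tol:
            break
    return (lo + hi) / 2


def upfront_cost(price):
    """HYPOTHETICAL model: 20% down payment plus 3% closing costs."""
    return 0.20 * price + 0.03 * price


# Largest house price whose upfront cost fits a $100,000 budget.
max_price = bisect(upfront_cost, 0, 10_000_000, 100_000)
```

Each iteration halves the search interval, so even a very wide starting range converges in a few dozen evaluations of the cost function.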
This is a weird blog entry. The end goal of this project is to eventually run a 100 mile Backyard Ultramarathon using only free software. This rule is to be interpreted as reasonably as possible and should only apply to tech worn or carried through the race. It does not apply to any crew, and it is to be followed in spirit. For example, unavoidable small bits of nonfree CPU microcode or modem firmware are acceptable, but care will be taken to isolate such components. A consequence of this is that any music listened to during the race will be DRM-free. The patents on MP3 have recently expired, so the format is free. I was a pretty rubbish runner for most of my childhood, so this project is both technically and physically grueling.
Distance | P95 | P99 | PR |
---|---|---|---|
Mile | 6:00 | 5:20 | 5:47* |
1.5 mi | 9:10 | 8:22 | 8:58* |
5k | 21:12 | 17:38 | 19:47* |
10k | 41:17 | 34:24 | 44:42 |
Half Marathon | 1:33:04 | 1:18:07 | 1:38:13 |
Marathon | 3:08:42 | 2:44:18 | 3:58:05 |
50k | ??? | ??? | 4:56:35 |
24 hr Backyard ultra ruleset run distance | 50 mi | 100 mi | 34.4 mi |
This "blog" is called "Abstract Nonsense" because of this project. Most language models try to build interesting
output, but end up spouting abstract nonsense (with or without some semantic correctness). Well,
I thought to myself, I have a corpus that itself is really just abstract nonsense; maybe I could train an NLP
transformer model on this corpus, and oddities of syntax would actually be a feature!
Because the robot is confused, it will also be named Abstract Nonsense, to maximize perplexity with respect to the
identically named blog hosted on this site.
I present to you GPT 9001! It is really just a fine-tuned version of GPT-2, tuned for text generation on
the Quote Doc. In this project I learned that
hand-rolled models that I can quickly train are trash. For example, the first implementation of GPT 9001 was called GPT0,
and was just some LSTM model I spun up and trained on the quote doc. The LSTM model could either predict random
words or overfit the training set. It couldn't do anything of interest :(.
Anyway, without further ado here s/he is:
This is the official page for tracking my (Rohan Jhunjhunwala's) progress and the funds raised during my Backyard Ultramarathon for Bidya.
Bidya is an organization which raises money to create scholarships for underprivileged female students in Nepal.
With as little as $1,500-$2,500 in capital, we can create a scholarship fund which can cover all annual K-12 educational
expenses for one female student in need using the interest returns alone.
Because of the comparatively low costs in Nepal, and our relationships with certain high-schools,
we can ensure that every dollar Bidya fundraises will efficiently go to a child in need.
Donations to Bidya are 100% tax deductible under 501(c)(3). Rohan and Rakchhya are matching all
donations made to this Ultramarathon fundraiser. For every dollar donated, $3 will go to funding
critical scholarships for underprivileged female Nepali students. To make a per-mile pledge, reach out to
rjhunjhunwala80@berkeley.edu or feel free to donate directly.
This is a race without a finish line. Competitors run 4.17 miles every hour on the hour until there is just one athlete left standing able to complete the loop.
Any pledge, big or small, makes a big difference both to our scholars and to me when I'm exhausted out
on the course hoping to be the last man standing. As a thank you to our patrons, I will update this page live during the race on Saturday, June 1.
The JavaScript graph below will track the total funds raised so far.
The race also has an official Facebook Leaderboard.
If you'd like to drop some cheers in the comments there, we'd be really appreciative. Special thanks to all supporters and my team
that is coming out to this event.
This update is a quick one.
I learned that this floofer needed some head pats, and I had to help!
This is an important cause, so feel free to compile and run the following Java script (not JavaScript, fortunately) to help out the floofer.
I came across an exciting problem in a Stand-up Maths video.
Using a simple brute-force graph-theory argument, a viewer had gotten the runtime of Matt's solution down from 32 days to 15 minutes. By throwing the book at the problem, I thought we could do better.
With good fundamental knowledge of the English language and machine structures, other authors
have implemented some variant of exhaustive search, with fundamentally simple code, that achieves a substantially better runtime of 100 milliseconds.
However, using mixed integer programming (a simple formulation around set covering), we can get Matt's 32 days down to 10 seconds, (360 * 24 * 32) = 276,480 times faster than Matt's code.
An alternate approach using pySMT (satisfiability modulo theories) has an estimated runtime of around 11 hours. My formulation of this as an SMT problem finds one of the 11 assignments of words for this problem
in 1 hour.
These are overkill solutions, but they have a certain mathematical elegance that makes me really happy.
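The puzzle itself is not restated in this entry. Assuming it is the one about picking several words that share no letters between them, the "machine structures" trick behind the fast exhaustive searches can be sketched roughly as follows: encode each word as a 26-bit mask so that a letter clash becomes a single bitwise AND. This is a plain backtracking sketch, not the MIP or SMT formulation and not any other author's code; the names `letter_mask` and `disjoint_groups` are made up for illustration.

```python
def letter_mask(word: str) -> int:
    """Encode a lowercase a-z word as a 26-bit set of its letters."""
    mask = 0
    for ch in word:
        mask |= 1 << (ord(ch) - ord("a"))
    return mask


def disjoint_groups(words, k):
    """Yield k-tuples of words (in input order) that share no letters."""
    # Words with a repeated letter can never be part of a solution.
    candidates = [(letter_mask(w), w) for w in words if len(set(w)) == len(w)]

    def extend(start, used, chosen):
        if len(chosen) == k:
            yield tuple(chosen)
            return
        for i in range(start, len(candidates)):
            mask, word = candidates[i]
            if mask & used == 0:  # no letter in common with the words chosen so far
                chosen.append(word)
                yield from extend(i + 1, used | mask, chosen)
                chosen.pop()

    yield from extend(0, 0, [])
```

On a real word list, `list(disjoint_groups(five_letter_words, 5))` would enumerate every qualifying group, where `five_letter_words` is a hypothetical list loaded from a dictionary file. Serious implementations add pruning and smarter ordering on top of this basic idea.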