This "blog" is called "Abstract Nonsense" because of this project. Most language models try to build interesting
output, but end up spouting abstract nonsense (with or without some semantic correctness). Well,
I thought to myself, I have a corpus that itself is really just abstract nonsense. Maybe I could train an NLP
transformer model on this corpus, and oddities of syntax would actually be a feature!
Because the robot is confused, it will also be named Abstract Nonsense, to maximize perplexity with respect to the
identically named blog hosted on this site.
I present to you GPT 9001, which is really just a fine-tuned version of GPT-2, tuned for text generation on
the Quote Doc. In this project I learned that
hand-rolled models that I can quickly train are trash. For example, the first implementation of GPT 9001 was called GPT0,
and was just an LSTM model I spun up and trained on the Quote Doc. That LSTM could either predict random
words or overfit the training set. It couldn't do anything of interest :(.
Anyway, without further ado, here s/he is:
If you have a fixed budget, or you want to tweak the numbers to see what would need to change to meet certain financial goals, try out the optimizer. The optimizer uses the bisection method to find an input which meets a certain goal. For example, say you have $100,000 and you want to figure out how much you can spend on a house: the optimizer will help you budget.
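To make the bisection idea concrete, here is a minimal sketch in Python. It is not the optimizer's actual code: the cost model (`upfront_cost`, with a 20% down payment and 3% closing costs) and the search bounds are made-up assumptions for illustration. The only real idea is halving an interval until the goal is met.

```python
def bisect_root(f, lo, hi, tol=1e-6, max_iter=200):
    """Find x in [lo, hi] where f(x) is (nearly) zero.

    Assumes f is continuous and f(lo), f(hi) have opposite signs.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if (f_lo > 0) == (f_hi > 0):
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        # Stop once the interval is tiny or we hit the goal exactly.
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half of the interval that still brackets the root.
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def upfront_cost(price, down_payment_rate=0.20, closing_cost_rate=0.03):
    """Hypothetical cost model: cash needed up front to buy a house."""
    return price * (down_payment_rate + closing_cost_rate)


budget = 100_000
# Find the price at which the up-front cost exactly uses up the budget.
max_price = bisect_root(lambda price: upfront_cost(price) - budget, 0, 10_000_000)

if __name__ == "__main__":
    print(f"Largest affordable price under these assumptions: ${max_price:,.2f}")
```

Bisection only needs the goal function to cross zero once between the bounds, which is why it suits "tweak one number until the goal is met" questions like this one.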
Right now this product is in a tech-demo stage. There are two features that we plan to build out in the short term.
A quine is a computer program that
produces its own source code as an output. This is trivial to do in HTML by just not using any tags (or newlines),
which demonstrates the uselessness (er, brilliance) of the modern web stack.
This is a little less trivial if you use JavaScript and do not allow any reflective features
(interacting with the DOM to access the source) or any file IO. It is even more annoying if you
want to add an article describing the techniques being used in your quine, have your quine explain itself
before outputting itself, and surround the quine in the tags needed to format it. That being said,
if you could not tell, that is what this page is, so let us describe the techniques we use.
A high-level description of this quine is as follows. First, declare a string called code. This
string holds JavaScript code which, if run, would first print
this article describing the quine and how it is made, then construct a string that contains
the content of the source code (calling JSON.stringify to escape the code variable where appropriate!),
and then use this string to print out its own formatted source. Then we use the
security vulnerability (er, dynamic code execution feature) called eval to run the string called
code as if it were actual JavaScript code and not a string!
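The same trick can be sketched in a few lines of Python, with `repr` playing the role of JSON.stringify and `exec` playing the role of eval. This is a stripped-down analogue, not the source of this page: it skips the article and the formatting tags, and the names `INNER`, `QUINE_SOURCE`, and `run_and_capture` exist only so the two-line quine can be run and compared against itself.

```python
import contextlib
import io

# The inner program: print a line that rebuilds `code`, then print the exec line.
INNER = 'print("code = " + repr(code)); print("exec(code)")'

# The full two-line quine, assembled here only so it can be run and checked.
QUINE_SOURCE = "code = " + repr(INNER) + "\nexec(code)\n"


def run_and_capture(source):
    """Execute `source` in a fresh namespace and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    print(QUINE_SOURCE, end="")
    # True when the program's output is exactly its own source.
    print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE)
```

The string never has to contain an escaped copy of itself, because `repr(code)` regenerates the quoted form on demand. That is the whole reason the code lives in a variable.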
Along the way you learn a lot of fun things.
For example, while none of the characters in a script tag is invalid in JavaScript or needs escaping, if you have such a
tag in a JS string you are just out of luck, and need to split it up and use string concatenation.
And, to make things more fun, you cannot document.write a script tag!
The following formatted block
contains the source used to output this web page. This blog contains other pages, other files contain the CSS,
and there is a script that assembles this blog from a bunch of smaller files, but if you want you can copy
and paste the next code block into a .html file and open it in
a browser to see this exact blog post (minus the pretty CSS formatting)!
This update is a quick one.
I learned that this floofer needed some head pats, and I had to help!
This is an important cause, so feel free to compile and run the following Java script (not JavaScript, fortunately) to help out the floofer.
This is a weird blog entry. The end goal of this project is to eventually run a 100-mile Backyard Ultramarathon using only free software. This rule is to be interpreted as reasonably as possible and should only apply to tech worn or carried through the race. This rule also does not apply to any crew. This rule is to be followed in spirit. For example, unavoidable small bits of nonfree CPU microcode or modem firmware are acceptable, but care will be taken to isolate such components. As a consequence, any music listened to during the race will be DRM-free. The patents on MP3 have recently expired, so the format is free. I was a pretty rubbish runner for most of my childhood, so this project is both technically and physically grueling.
Distance | P95 | P99 | PR |
---|---|---|---|
Mile | 6:00 | 5:20 | 5:47* |
1.5 mi | 9:10 | 8:22 | 8:58* |
5k | 21:12 | 17:38 | 19:47* |
10k | 41:17 | 34:24 | 42:31 |
Half Marathon | 1:33:04 | 1:18:07 | 1:32:58 |
Marathon | 3:08:42 | 2:44:18 | 3:58:05 |
50k | ??? | ??? | 4:56:35 |
24 hr Backyard ultra ruleset run distance | 50 mi | 100 mi | 55.9 mi |
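Boolean satisfiability is a classic problem in computer science. Given a series of n boolean variables, A B C ..., and a formula in 3-conjunctive normal form, the task is to decide whether some assignment of true/false values to the variables makes the formula true.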
CDCL is a complete and sound method, so the canonical solver line is also the number of solvable instances.
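For reference, here is what "complete and sound" means in the smallest possible package. This sketch is plain exhaustive search, not CDCL: it tries all 2^n assignments, so it always gives the right answer but is hopeless beyond a few dozen variables. The clause encoding (signed integers, where `k` means variable k is true and `-k` means it is false) is an assumption borrowed from the common DIMACS convention.

```python
from itertools import product


def solve_by_brute_force(num_vars, clauses):
    """Return a satisfying assignment as a tuple of bools, or None.

    Each clause is a tuple of nonzero ints: k stands for variable k,
    -k stands for its negation (variables are numbered from 1).
    """
    for assignment in product([False, True], repeat=num_vars):
        # A CNF formula holds when every clause has at least one true literal.
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return assignment
    return None


if __name__ == "__main__":
    # (A or B or C) and (not A or not B or not C)
    print(solve_by_brute_force(3, [(1, 2, 3), (-1, -2, -3)]))
    # All eight sign patterns over three variables: no assignment survives.
    every_clause = [(a * 1, b * 2, c * 3) for a, b, c in product([1, -1], repeat=3)]
    print(solve_by_brute_force(3, every_clause))
```

A CDCL solver answers the same yes/no question, it just prunes the search by learning from conflicts instead of visiting every assignment.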
I came across an exciting problem in a Stand-up Maths video.
Using a simple brute-force graph-theory argument, a viewer had gotten the runtime of Matt's solution down from 32 days to 15 minutes. By throwing the book at the problem, I thought we could do better.
With good fundamental knowledge of the English language and machine structures, other authors
have implemented variants of exhaustive search with a substantially better runtime of 100 milliseconds, using fundamentally simple code.
However, using mixed integer programming (a simple formulation around set covering), we can get the runtime down to 10 seconds, which is 360 * 24 * 32 = 276,480 times faster than Matt's code.
An alternate approach using pySMT (satisfiability modulo theories) has an estimated runtime of around 11 hours. My formulation of this as an SMT problem finds one of the 11 assignments of words for this problem
in 1 hour.
These are overkill solutions, but they have a certain mathematical elegance that makes me really happy.
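For contrast with the overkill solutions, here is a minimal sketch of the "fundamentally simple" exhaustive-search flavor. It assumes the problem is the one from the video as I understand it: pick five five-letter words that share no letters, so that 25 distinct letters are used. The bitmask encoding and the tiny word list are my own illustration, not anyone's actual submission.

```python
def letter_mask(word):
    """Encode the set of letters in a lowercase word as a 26-bit integer."""
    mask = 0
    for ch in word:
        mask |= 1 << (ord(ch) - ord("a"))
    return mask


def find_disjoint_sets(words, size=5):
    """Return every group of `size` words that share no letters at all."""
    # Words with a repeated letter can never be part of a solution.
    candidates = [(w, letter_mask(w)) for w in words if len(set(w)) == len(w)]
    results = []

    def extend(start, used, chosen):
        if len(chosen) == size:
            results.append(tuple(chosen))
            return
        for i in range(start, len(candidates)):
            word, mask = candidates[i]
            # Two words are compatible when their bitmasks do not overlap.
            if used & mask == 0:
                chosen.append(word)
                extend(i + 1, used | mask, chosen)
                chosen.pop()

    extend(0, 0, [])
    return results


if __name__ == "__main__":
    toy_words = ["fjord", "gucks", "nymph", "vibex", "waltz", "hello", "world"]
    print(find_disjoint_sets(toy_words))
```

The bitmask is what makes the simple approach fast: checking whether two words collide is a single AND instruction instead of a set comparison.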
How long can you run without going more than 1 kilometer away from your apartment or retracing your steps?
Unfortunately, not only can I answer this, I probably have enough material to write a thesis on this topic.
The ultimate (ongoing) goal is to take top place on this leaderboard.
The leaderboard basically scores a run by dividing its length by its diameter.
For example, if you run a 10k without retracing your steps and you stay within 1k of your home the whole time,
you'd have a score of 10 / (1 * 2) = 5. This puts you near the bottom of the leaderboard.
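The scoring rule is easy to sketch in Python. The leaderboard's exact definition of "diameter" is an assumption here: I take it to be the largest straight-line distance between any two points on the route, with points given as flat (x, y) coordinates in kilometers. The function names are made up for this example.

```python
from itertools import combinations
from math import dist


def path_length(points):
    """Total length of the route, summing consecutive straight segments."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def diameter(points):
    """Largest straight-line distance between any two points on the route."""
    return max(dist(a, b) for a, b in combinations(points, 2))


def score(points):
    """Route length divided by route diameter (assumed leaderboard rule)."""
    return path_length(points) / diameter(points)


def score_from_totals(length_km, diameter_km):
    """Same rule when only the two summary numbers are known."""
    return length_km / diameter_km


if __name__ == "__main__":
    # The 10k that stays within 1 km of home: diameter of at most 1 * 2 km.
    print(score_from_totals(10, 1 * 2))
    # Three sides of a 1 km square: length 3, diameter sqrt(2).
    print(score([(0, 0), (1, 0), (1, 1), (0, 1)]))
```

Under this rule a high score means packing a lot of distance into a small area, which is exactly why the no-retracing constraint makes it hard.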
Turns out getting a good score depends on answering a few different questions.
"Abstract Nonsense" is a somewhat loving, but somewhat derisive term for methods (typically
Category Theoretic methods) in pure mathematics that are unreasonably convoluted and involve a lot of theoretical machinery.
I myself am awful at Category theory but excellent at abstract nonsense, and I wanted a space to share my thoughts
and projects. I'm well aware that very few people will read this blog, but to me this space is a journal. A respite
from the giants that control the web, and a space to share my thoughts into the void, in a way I can control and moderate.
More concretely, I hope to maintain "Abstract Nonsense" as a dev log of sorts. Not because I think it showcases phenomenal
technical talent, but because it showcases some of the cool things I've been learning on the side.
I'll keep my first entry on this journal quite short. This entry stands well on its own,
because it does something the category theorist in all of our hearts would love.
It's self-referential.
The content engine that runs Abstract Nonsense is quite brilliant, if I do say so myself.
It is a Python script that takes in a series of HTML files and agglomerates them into a single file.
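The agglomeration step can be sketched in a few lines. This is not the real engine: the function names, the page skeleton, and the idea that each post is a standalone HTML fragment are all assumptions made for illustration.

```python
from pathlib import Path


def combine_fragments(fragments, title="Abstract Nonsense"):
    """Wrap a list of HTML fragment strings in a single page (hypothetical layout)."""
    body = "\n".join(fragment.strip() for fragment in fragments)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{title}</title></head>\n"
        "<body>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


def build_site(fragment_paths, output_path):
    """Read each fragment file in order and write the combined page."""
    fragments = [Path(p).read_text(encoding="utf-8") for p in fragment_paths]
    page = combine_fragments(fragments)
    Path(output_path).write_text(page, encoding="utf-8")
    return page


if __name__ == "__main__":
    print(combine_fragments(["<p>First post</p>", "<p>Second post</p>"]))
```

Keeping the string-joining separate from the file handling means the interesting part can be checked without touching the disk.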
In addition to the Abstract Nonsense engine, I have two other Python scripts that form the backbone of this (static)
website. One script takes in the plaintext of a quote document I have been personally maintaining for the
past 3 years. It uses regular expressions to parse out the quotes and builds an HTML file containing JavaScript
that alters the HTML on the page to create a typing effect.
Check it out
here! The final piece of this beautiful infrastructure is a third script that runs both scripts, then commits the whole branch to master.
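The quote-parsing step might look something like the sketch below. The real quote document's layout is not described here, so the pattern assumes one quote per line in the form `"text" - author`. Both the pattern and the `parse_quotes` name are guesses for illustration.

```python
import re

# Assumed layout: a line that starts with a quoted passage, then " - author".
QUOTE_PATTERN = re.compile(
    r'^"(?P<text>.+)"\s*-\s*(?P<author>.+?)[ \t]*$',
    re.MULTILINE,
)


def parse_quotes(plaintext):
    """Return (text, author) pairs for every line that looks like a quote."""
    return [
        (match.group("text"), match.group("author"))
        for match in QUOTE_PATTERN.finditer(plaintext)
    ]


if __name__ == "__main__":
    sample = (
        '"A first example quote." - Somebody\n'
        "a line that is not a quote\n"
        '"A second example quote" - Someone Else  \n'
    )
    for text, author in parse_quotes(sample):
        print(f"{author}: {text}")
```

Lines that do not match are silently skipped, which is forgiving of stray notes in the document but also means malformed quotes vanish without warning.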
As I learned on Twitter/Reddit/The Quote Document:
"Everybody has a testing environment. Some people are lucky enough enough to have a totally separate environment to run production in." - @stahnma
Abstract Nonsense and this website as a whole are both test and prod. Maybe one day, I'll be a good enough engineer
to be able to invest in separate test and prod environments for my website.
This is the official page for tracking my (Rohan Jhunjhunwala's) progress and funds raised during my Backyard Ultramarathon for Bidya.
Bidya is an organization which raises money to create scholarships for underprivileged female students in Nepal.
With as little as $1,500-$2,500 in capital, we can create a scholarship fund that covers all annual K-12 educational
expenses for one female student in need using the interest returns alone.
Because of the comparatively low costs in Nepal, and our relationships with certain high schools,
we can ensure that every dollar Bidya fundraises will efficiently go to a child in need.
Donations to Bidya are 100% tax deductible under 501(c)(3). Rohan and Rakchhya are matching all
donations made to this Ultramarathon fundraiser. For every dollar donated, $3 will go to funding
critical scholarships for underprivileged female Nepali students. To make a per-mile pledge, reach out to
rjhunjhunwala80@berkeley.edu or feel free to donate directly.
This is a race without a finish line. Competitors run 4.17 miles every hour on the hour until there is just one athlete left standing able to complete the loop.
Any pledge, big or small, makes a big difference both to our scholars and to me when I'm exhausted out
on the course hoping to be the last man standing. As a thank you to our patrons, I will update this page live during the race, Saturday, June 1.
The JavaScript graph below will track the total funds raised so far.
The race also has an official Facebook Leaderboard.
If you'd like to drop some cheers in the comments there, we'd be really appreciative. Special thanks to all supporters and my team
that is coming out to this event.