An artificial one-liner generator

by Zach Weinersmith

The following is an idea I’ve been mulling over and talking to friends about for a few months. I thought I’d finally share it to see if anyone liked it or was interested in working on it.

Warning: evolutionary psychology just-so story to follow. Think of it as a parable, not as a theory. It’s just here to contextualize the idea that follows.

The Story

Suppose there’s a monkey. Suppose also that the monkey has evolved to have an inbuilt proto-toolmaking behavior.

For this specific example, let’s say he’s learned to snap a twig off a tree and stick it in an anthill. When he pulls it out, it’s covered with tasty protein-rich ants.

This monkey is unlike you and me in that he takes no pleasure in finding the right stick. He knows the stick must have certain qualities – long, thin, not too brittle. However, he does not experience any pleasure until he actually eats the tasty ants.

Suppose this monkey represents a species. This species does well because it has this one trick for getting protein out of the ground in abundance at a low cost.

Now, suppose one day a monkey is born who has a quirk. Instead of taking pleasure only in the ant part, he takes pleasure in the stick selection part too. That is, when he finds an appropriate stick, his brain rewards him with premature pleasure. So, whereas his brethren experience pleasure only upon eating the ants, this one monkey gets pleasure from selecting an appropriate branch.

This confers an advantage on that monkey. The other monkeys will select an appropriate branch, then use it until it breaks. This new monkey will change branches often until he finds the best one, because he enjoys the selection process per se. He doesn’t know why he enjoys it, but as a result, he tends to get more ants per twig. He enjoys making the best twig for its own sake as much as he enjoys the inevitable payoff.

His mental pathway, simplified, goes something like this:

Confusion —> Understanding —> Pleasure {CUP}

Over evolutionary time, all the monkeys have this pathway, and it becomes a point of competition. The supply of good branches is limited and it takes work to select the best one. So, selection rewards the monkey who can go from confusion to understanding the quickest.

This pathway, originally useful for twig selection, leads to other beneficial effects. The monkeys now have the desire to understand systems. One day, a monkey finds a sharp rock and decides he wants to understand how to make sharp rocks. He creates the first hand ax, and outcompetes his brothers.

And so the mental pathway, CUP, gets strengthened and strengthened, producing useful results over time. We could set up an equation that looks something like this:

Level of understanding, denoted by some value from 0 to infinity (potential size depends on the complexity of the thing being understood) = U

Amount of time taken to go from one value of U to another = T

The equation would be:

ΔU/T = Quality of monkey brain.

If it takes a long time to go from confusion to understanding, the monkey is a bad toolmaker. If the time is short, the monkey is a good toolmaker. In general, a higher ΔU/T score indicates a better toolmaker.

It goes without saying that the equation for U could be complicated and dependent on many things. For example, a stupid musician would understand sheet music faster than a very smart person with no music background. However, in cases like these, we could still find the latter to be the better brain. After all, the musician may be going from U=99900 to U=99950 while the non-musician may be going from U=50 to U=99950. So, the non-musician’s longer time needs wouldn’t necessarily indicate lower intelligence.
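As a rough illustration of that arithmetic, here is a minimal Python sketch. The U values are the ones from the example above; the times are made-up numbers (the text gives none), chosen only to show that a longer T can still produce a higher ΔU/T.

```python
def learning_rate(u_start, u_end, t):
    """Return ΔU/T: the change in understanding per unit of time."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (u_end - u_start) / t

# U values are from the text; the times (in hours) are hypothetical.
musician = learning_rate(99900, 99950, t=1)       # ΔU = 50
non_musician = learning_rate(50, 99950, t=500)    # ΔU = 99900

print(musician)      # 50.0
print(non_musician)  # 199.8
```

With these assumed times, the non-musician takes 500 times longer but still scores higher, which is the point of the example.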

Over evolutionary time, ΔU/T should increase. The more selection pressure on toolmaking, the faster it should go up. Although this is generally good, it results in some perverse side effects.

For one, the monkeys now indulge in behavior without a clear evolutionary payoff. For example, they make up riddles for each other to solve. Sitting in the dark of winter with no natural puzzles to solve, they invent puzzles for each other, to generate pleasure for its own sake. These puzzles make use of a new concept – cleverness.

Here, you can think about ΔU/T in a second sense – how good a puzzle is.

If it’s too easy (What has two wings and a beak?) ΔU is small. So, the possible ΔU/T is limited.

If it is too hard (What is the product of the first 400 Fibonacci numbers? Please solve using multiplication only.) then T is too large compared to ΔU.

If it is non-inferable (What’s my middle name?), it’s no fun because you can’t solve it, so there is no change in understanding.

A clever puzzle threads the needle. For example, the classic riddle from The Hobbit: “What box has no hinges, key or lid yet inside golden treasure is hid?” The answer is “an egg.” Within reason, it is the only possible solution. It is also not obvious. It requires you to make inferences about what is acceptable in the category of “box” and “treasure.” In this case, the ΔU/T is at some favorable ratio.

It is fun because it uses the confusion-understanding-pleasure (CUP) pathway. In the golden treasure example, the question presents confusion. After some thinking, it leads to understanding. The reason the CUP pathway exists is for the advantage it conferred in toolmaking and strategizing. However, its existence also makes for a peculiar monkey behavior called puzzles and riddles. These behaviors are mere byproducts of the natural selection for monkeys who derive pleasure from understanding.

Now, suppose these monkeys come up with a couple of versions of the game. They have games, they have puzzles, they have riddles, and they have jokes. They conceive of these as different things, when in fact they’re just points on the ΔU/T line. The game lets you move slowly from non-understanding to understanding as you begin to comprehend all the possible tactics. The puzzle is like the game, but a bit faster, and with less to understand. The riddle is faster still, and has the bonus of essentially allowing you to make a discrete jump from non-understanding to understanding, once you catch the answer. The joke is nearly instantaneous. It takes you from complete confusion to complete understanding very rapidly.

For example: “Why did the church hate Dungeons and Dragons? Because it’s a form of birth control.”

The confusion is very brief, and concerns why there is a connection between Dungeons and Dragons and birth control. However, making the connection requires only a short chain of inference. So, the CUP pathway runs very quickly, and all the pleasure comes at once.

Thus, the game supplies a very large amount of pleasure over a long time. The puzzle supplies a smaller amount of pleasure, but in a shorter time. The riddle supplies even less pleasure, but in shorter time still. And the joke supplies the least pleasure, but it supplies it in an infinitesimal amount of time. When these things are combined, they result in more pleasure still. They all combine readily, as each is just a different member of the same family. Many jokes, for example, could be rephrased as riddles.

If true, this would explain why pleasure is experienced in all these things, and it would explain why pleasure is often more acute but less profound for jokes – they supply the highest ΔU/T, but the lowest value for ΔU.

It would also explain why people sometimes laugh when understanding a concept or solving a mystery. These are all just expressions of CUP.

The enjoyment of jokes has two prominent aspects – pleasure and laughter. Their pleasure may be explained by the above. The laughter could possibly be explained as follows:

The good problem solvers are the best mates. Thus, it benefits a monkey to signal understanding of a concept. In this case, the fact that the noise is a “HA” made at the back of the throat could be entirely arbitrary. It could just as easily have been a click or a bark.

This would have some implications that could be tested. For one, it would mean that a person is more likely to signal amusement (via vocalization and facial expression) when there are other monkeys to hear. That person might also be more likely to vocalize when the concept understood is an especially tricky one.

I suspect this is the case in humans. For example, if you just understood something interesting, where would you be most likely to vocalize – near other members of your social group or at home alone? Are you more likely to laugh out loud when watching a movie with friends or when watching it alone?

Similar behavior has been seen in human female sex vocalizations. For example, in some primate species, females are more likely to vocalize during sex if males can hear.

ΔU/T would also explain why dissected jokes are never funny. For the joke to have the proper ΔU/T, T must be very very low. When jokes have to be explained, T gets bigger and the joke becomes less pleasurable.

The Logic

I don’t know if the above is true, but I suspect something very like it, in principle, is. If so, it has implications for how jokes are written.

It means the ideal joke presents something confusing that can be quickly understood with a key piece of information. I propose that you could in fact write a fairly simple program that would create at least a certain type of joke. With modification, it could potentially handle more types.

The general way in which this type of joke runs is as follows: two things are at first glance unrelated, but then shown to have some relation in a sensible way. The above Dungeons and Dragons joke is an example. The perception of the joke proceeds as follows:

Understanding the church is involved.

Understanding the church opposes D&D.

(Note, so far, everything is just empirical statements)

Changing D&D to mean birth control.

(Note, the new statement is confusing, but still maintains all prior logical connections. That is, it’s still something the church dislikes, and it’s related on at least one metric to D&D)

Confusion over whether the statement makes sense.

Understanding that the statement does make sense.

Pleasure. (Hopefully)

Many classic jokes follow this format. For example, “Take my wife. PLEASE!”

An understandable statement is made – “Take my wife.” The meaning of the word “take” is altered, but all logical connections are maintained. Brief confusion results. The confusion is followed by understanding – the comedian means a different statement that maintains all prior logical connections. Once understood, pleasure results. Note the pattern – sense, nonsense, sense, pleasure.

(Of course, in the above case, we all know the joke, so ΔU = 0. But, the first time it was told, this would not have been so.)

For another example, I once wrote a joke in which Jesus tells his disciples to give all they have to the poor. This results in the poor’s economy crashing because the free product pushes their economy into deflation.

This joke follows a similar structure. You are told that Jesus favors helping the poor and is acting in a way to harm the poor. This results in confusion. When the connections are explained – dumping product results in deflation – understanding results. Once again, an idea (giving to the poor) has its meaning changed in a way that preserves logical sense but alters the meaning of pre-existing connections. Ideally this happens quickly, and the reader will laugh.

Note that in both cases a connection is discovered. In the first case, there is a strange equivalence. Imagine you discovered it by doing the following:

Start with a concept. Build all possible relations off of that concept as bridges to other concepts. From each of those concepts, build more possible relationships to more concepts. Eventually you have a branch tree. At some point, you will have a situation where you fork off of a concept, only to have the paths come back together. The following is an example:

1) Church opposes->D&D->is loved by->Geeks->who have->no sex

2) Church opposes->Birth control->whose methods include->abstinence.

You can see that we fork from what the church opposes, only to “close the loop” at not having sex. This is, of course, simplified. In an actual diagram, “church opposes” would branch to many things, as would D&D as would “is loved by” and so on. We’re just creating chains of relationships. Saying geeks have no sex might seem like cheating, since it’s similar to a joke. However, consider it as being one quality of a stereotypical geek among many. Others might be shyness, social awkwardness, etc.
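The two chains above can be sketched as a small graph of labeled links. This is only an illustration under stated assumptions: the triple format, the function names, and the extra "abstinence means no sex" link (added so that the two chains literally meet at one concept) are not part of the original idea as written.

```python
from collections import defaultdict

# Each fact is a (subject, relation, object) triple taken from the two
# chains in the text. The last triple is an assumption added here so that
# both chains end at the same concept.
TRIPLES = [
    ("Church", "opposes", "D&D"),
    ("D&D", "is loved by", "Geeks"),
    ("Geeks", "have", "no sex"),
    ("Church", "opposes", "Birth control"),
    ("Birth control", "whose methods include", "abstinence"),
    ("abstinence", "means", "no sex"),
]

def build_graph(triples):
    """Map each concept to its outgoing (relation, concept) links."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph

def chains(graph, start, goal, path=None):
    """Yield every cycle-free chain of concepts from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for _relation, nxt in graph.get(start, []):
        if nxt not in path:
            yield from chains(graph, nxt, goal, path)

graph = build_graph(TRIPLES)
for chain in chains(graph, "Church", "no sex"):
    print(" -> ".join(chain))
```

Two distinct chains from "Church" to "no sex" come out, which is the fork-and-rejoin shape described above. A realistic graph would have many more links per concept, so most chains would never rejoin.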

Here’s a doodle of a more worked-out chart that is still obviously rather artificial.


The point here is that we follow the perfect structure of a one-liner via this pathway. When we find one of these loops, it represents a surprising shared relationship, which is essentially how we described jokes above.

So, structurally, this whole diagram would look like lots of nodes with lots of links coming off each node. To find the potential jokes, we simply need to look for these “closed loops.” That is, places where something forks, only to recombine later.

My suspicion is, based on the ΔU/T concept, that there is an ideal size to the loop. Too big a loop would require too much inference, thus making T large. Too small a loop would make ΔU too small and the joke would be dull. The ideal joke takes a second to understand, but only a second. So, there is probably a desirable length for a closed loop.
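One way to picture the "ideal size" idea is a search that keeps only loops whose total number of links falls inside a window. The sketch below is hypothetical: the toy graph, the function names, and the choice to measure size as the sum of links on both branches are all assumptions, and the window bounds are arbitrary.

```python
from itertools import combinations

# Hypothetical toy graph: concept -> concepts it links to (relations omitted).
GRAPH = {
    "Church": ["D&D", "Birth control"],
    "D&D": ["Geeks"],
    "Geeks": ["no sex"],
    "Birth control": ["no sex"],
    "no sex": [],
}

def simple_paths(graph, start, max_links):
    """Return all cycle-free paths from start that use 1..max_links links."""
    found, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if len(path) > 1:
            found.append(path)
        if len(path) - 1 < max_links:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:
                    stack.append(path + [nxt])
    return found

def closed_loops(graph, min_links, max_links):
    """Find pairs of branches that fork at one concept and rejoin at another.

    Loop size is the total number of links on both branches; only loops
    with min_links <= size <= max_links are kept.
    """
    loops = []
    for fork in graph:
        paths = simple_paths(graph, fork, max_links)
        for a, b in combinations(paths, 2):
            same_end = a[-1] == b[-1]
            diverge = a[1] != b[1]
            disjoint = not (set(a[1:-1]) & set(b[1:-1]))
            size = (len(a) - 1) + (len(b) - 1)
            if same_end and diverge and disjoint and min_links <= size <= max_links:
                loops.append((a, b))
    return loops

for a, b in closed_loops(GRAPH, min_links=4, max_links=6):
    print(a, b)
```

In this toy graph the only loop has five links in total, so it is found with a window of 4 to 6 and rejected by a window of 1 to 4 or 6 to 8. Enumerating all paths like this grows quickly with graph size, so a real search would likely need pruning.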

In addition, note that there are two types of closed loop. I’m calling these Loop of Equivalence (LOEq) and Loop of Contradiction (LOCo).

In LOEq, connections proceed from a fork until two places contain the same thing: e.g., fork from things the church hates to reconnection at lack of sex.

In LOCo, connections proceed from a fork until two places contain perfect contradiction: e.g., fork from things Jesus wants to reconnection when one end is “alleviation for the poor” and one is “suffering for the poor.” Jesus wants the poor to be alleviated and suffer.

In LOEq, the reader is presented with a strange equivalence that is then resolved, along the CUP pathway.

In LOCo, the reader is presented with a strange contradiction that is then resolved, along the CUP pathway.
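The distinction between the two loop types could be sketched as a simple check on where the two branches end. Everything here is an assumption for illustration: the `CONTRADICTIONS` table, the function name, and the intermediate concepts in the Jesus branches (which paraphrase the joke described earlier) are hypothetical.

```python
# Hypothetical table of concept pairs that contradict each other. A real
# system would have to collect these, like the concepts and relations.
CONTRADICTIONS = {
    frozenset({"alleviation for the poor", "suffering for the poor"}),
}

def classify_loop(branch_a, branch_b):
    """Label a pair of branches that start at the same fork.

    Returns "LOEq" if both branches end at the same concept, "LOCo" if
    they end at a contradictory pair, and None otherwise.
    """
    if branch_a[0] != branch_b[0]:
        raise ValueError("branches must start at the same fork")
    end_a, end_b = branch_a[-1], branch_b[-1]
    if end_a == end_b:
        return "LOEq"
    if frozenset({end_a, end_b}) in CONTRADICTIONS:
        return "LOCo"
    return None

church = classify_loop(
    ["things the church hates", "D&D", "Geeks", "no sex"],
    ["things the church hates", "Birth control", "no sex"],
)
jesus = classify_loop(
    ["things Jesus wants", "alleviation for the poor"],
    ["things Jesus wants", "giving to the poor", "deflation", "suffering for the poor"],
)
print(church, jesus)  # LOEq LOCo
```

Note that a LOCo needs extra knowledge (which concepts contradict each other) that a LOEq does not, so it would probably require an additional kind of user input.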

The Program

Thus, to make a program, one would need to do the following:

1) Acquire many concepts

This could be accomplished by creating a website where people can enter nouns.

2) Acquire many relations

Suggest a noun to a website user, then ask for a relation that could come off it to another noun. For example, suggest the noun “star.” The relation could be “shines on” or “destroys” or “creates” or “is loved by.”

3) Acquire more concepts

Present the website user with subject relation combinations. For example, “Batman is loved by.” The user supplies a new thing, such as “The people of Gotham,” “Catwoman,” or “Comic Book Readers.”

4) Find similarities

Present users with similarly-connected or similarly-spelled things. For example, Jesus Christ or Jesus. The users identify when two things are in fact the same thing, thus reducing errors and false positives.

5) Construct the tree.

Note that at no point in this process do users input any jokes. They merely input concepts and relations. This is akin to a comedian observing the world. We’re just feeding the computer raw facts about the universe.

6) Search for loops of the ideal size.

If the program works, at least some of the time, the result should be a “clever” joke. With human assistance, it might be possible to pull out the good ones and make them into new jokes.
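Steps 1 through 5 might look something like the following sketch. It assumes user submissions arrive as (subject, relation, object) triples and that step 4 produces a table of confirmed aliases; the data, names, and structure are hypothetical, and step 6 (the loop search) is not shown.

```python
from collections import defaultdict

# Steps 1-3: hypothetical user submissions as (subject, relation, object).
SUBMISSIONS = [
    ("Jesus Christ", "wants", "alleviation for the poor"),
    ("Jesus", "tells disciples", "give to the poor"),
    ("Batman", "is loved by", "Catwoman"),
    ("Batman", "is loved by", "Catwoman"),  # same fact from a second user
]

# Step 4: names that users confirmed are the same thing (alias -> canonical).
ALIASES = {"Jesus Christ": "Jesus"}

def canonical(name):
    """Follow alias links until a canonical concept name is reached."""
    seen = set()
    while name in ALIASES and name not in seen:
        seen.add(name)
        name = ALIASES[name]
    return name

def construct_tree(submissions):
    """Step 5: merge aliases and duplicates into concept -> {(relation, concept)}."""
    tree = defaultdict(set)
    for subject, relation, obj in submissions:
        tree[canonical(subject)].add((relation, canonical(obj)))
    return dict(tree)

tree = construct_tree(SUBMISSIONS)
print(sorted(tree))          # ['Batman', 'Jesus']
print(len(tree["Jesus"]))    # 2
print(len(tree["Batman"]))   # 1
```

Merging "Jesus Christ" into "Jesus" matters because a loop can only close if both branches arrive at what the program treats as the same concept.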

Limitations and Potential

It would be hard to make this program come up with longer story-based jokes. These require much more than just logic chains. In principle, the idea for a compelling story could be created using ΔU/T and logic chains, but the actual story itself requires a human.

Additionally, much of humor relies on unspoken concepts and context. This could be fed into a machine, but the output wouldn’t necessarily be a funny joke. For example, a raised eyebrow can serve to change the meaning of a phrase quickly from surprising to arousing. This is funny for the reasons above – it changes the meaning while preserving logic. It’s not clear how the proposed system would come up with the eyebrow rise, even if it came up with the arousing part.

In general, presentation would probably require human assistance. Once the loops are discovered, they have to be conveyed in a way that maximizes ΔU/T in the reader. It’s conceivable that a stock method could be determined for the machine to do this. However, that’d have the built-in limitation that it would be less funny every time it was used, thus lowering ΔU.

Discussion of Weirdness

This may seem like it shouldn’t work, since humans create jokes through something called “creativity” or “cleverness.” And, in fact, it may only work (if it works at all) for a certain class of jokes. However, in essence, it works the same way a comedian does. It is fed observations, then looks for a certain type of connection.

It has been said that a computer can’t make up a joke. However, neither would a person raised in a blank room. Humor requires observations in order to establish then subvert a logical chain. If a modern computer is incapable of joking, it may be more about the computer’s memory than its hardware or software.


Zachary “Zach” Alexander Weinersmith is the author and illustrator of the webcomic Saturday Morning Breakfast Cereal (SMBC) and of two other webcomics, the completed Captain Excelsior with artist Chris Jones, and Snowflakes, co-written by James Ashby and also illustrated by Jones. He also founded the sketch comedy group SMBC Theater with James Ashby and Marty Weiner in 2009.

60 thoughts on “An artificial one-liner generator”

  1. Muy original and a good thing to help temper the tempers of running amuck scientists. However I must complain for the “Salon” of Scientia; I thought of a Radisson Spa…..which would be a good place to have a Scientia Salon sweating and puffing ….congrats..all involved.


  2. Hi Zach,

    I very much enjoyed this (presumably because of the ΔU). Your just-so story seems to me to be very plausible, and I would be very surprised if it bears no relationship to the truth.

    Of course the idea to have computers generate humour is not a new one, although previous efforts focused mostly on generating puns, as you’ll see described here.

    What you’re going for in this article may be more ambitious. I think there are a few challenges that will need to be overcome.

    As you have mentioned, the identification of synonyms will be very important when users are entering data. “Abstinence” = “Abstinent” = “Abstainance” = “Abstaining” = “Asexuality” = “Not having sex” = “Abstaining from sex” = “Not getting laid” = “Chastity” et cetera ad infinitum. Finding a way to prune the web to make it easier to find loops is very tricky. There needs to be a way to encourage users to pick from a list of existing nouns, and we need to be able to suggest synonyms not based only on what was typed but on having similar associations to other concepts.

    While I’m sure that this system could come up with the occasional funny jokes, I suspect that until the system “understands” (in the sense of having a similarly complex semantic web) the world at a level comparable to a human, most jokes will be duds and won’t really work, as we can see in the existing pun generators.

    I wonder if there are any existing semantic databases that could be harnessed? I’m sure there must be something like what you describe already out there which could be used to bootstrap the process somewhat. Something like WordNet, perhaps:

    I am convinced that your project can succeed in principle, but I am relatively skeptical that it can succeed in practice. I would have much more hope for computer creativity in fields where a rich understanding of the world is not relevant, such as in the automated composition of music.

    I am a software developer, and I feel I would be able to implement the system you are talking about, so it’s a project I could consider working with you on if you like.


  3. We also need to distinguish between similarly spelled concepts that have different meanings in different contexts. A conflation between a mouse as an input device and a mouse as a rodent could make for an amusing pun if it occurs at either end of the chain of inference, but if it occurs in the middle then the chain is likely to be lost on a human.


  4. I think it might be a good idea to let users rate how obvious a particular word connection might be. For example, “Chastity” and “Abstinence” are so closely connected that they might be considered to be the same concept, whereas “Geek” and “Abstinence” is more tenuous.

    Assign a high cost in moving between less related concepts and a low cost to move between closely-related concepts, and now we’re looking for paths with a target cost rather than paths with a target number of links.


  5. I guess this engine might also generate a lot of perfectly prosaic accurate explanations.

    Fundamentalism>opposes>science>supports>evolution>contradicts>biblical creation
    Fundamentalism>opposes>Jerry Coyne>writes>Why Evolution is True>contradicts>biblical creation


    Q: Why do fundamentalists hate Jerry Coyne?
    A: Because he supports evolution!
    A: Because he is a scientist!



  6. I’m sure *something* funny would come out of that project, even if it’s unintentional. Same way that when little kids try to make nonsensical “jokes”, it’s inherently funny.


  7. I’m more and more convinced that a loop is really not what jokes are about. There’s something missing.

    Tom Cruise->is->Movie Star->stars in->Movie->is->Entertainment
    Tom Cruise->is->Person->has->Mind->enjoys->Entertainment

    Q: Why is Tom Cruise a Movie Star?
    A: Because he has a mind!

    I think the subset of loops which are funny is going to be so small that you’re not much better off than picking words at random. We need to figure out what the missing ingredient is. We also need to figure out how to convert loops, once found, to meaningful sentences. From the example above, I could just as well have said “Because he is a person” or “Because his mind enjoys entertainment” etc.


  8. I think you’ve stumbled onto something very meaningful.

    You’ve come very, very close to defining humor in a way a computer could understand, even if it wouldn’t feel the pleasure. I genuinely believe this article may be of great significance to AI research.

    Have you ever read Heinlein’s The Moon Is A Harsh Mistress? One of the characters is a sentient computer with a powerful desire to understand humor, and so it does something very near to what you’re proposing when you say we feed the system raw information and allow it to identify loops of the proper length. The computer, named Mike, refuses to cooperate with its human handler unless he goes through a list of 100 jokes per week, collected from works of fiction by the computer, and marks them as “funny” or “not funny” and makes a note of explanation or something like that. In this way, over years, the computer comes to slowly grasp human humor.

    It’s a very good book.


  9. Dude, you just destroyed all humor. You are the anti-christ. I am going to hunt you down and say something nice about you in public and in your presence. Expect this.
    The Mgt.


  10. “I genuinely believe this article may be of great significance to AI research.”

    While I think the article is great, I think you may be underestimating the amount of ingenuity currently deployed by AI researchers and overestimating the originality and insightfulness in the article. I would be very surprised if this article was of any significance at all to AI research, but it’s certainly a good way to explain to a lay audience some ideas about humour and the kinds of things AI research involves.


  11. This makes a lot of sense and I would love to see the implementation in real life.

    Perhaps one of the elements of humour that you should consider has to do with cognitive dissonance. For example, if you see the Queen of England slipping on a banana skin, it would be funny because you have this lady who is usually surrounded by pomp and bloat, falling down as a mere citizen. If, on the other hand, you see an elderly widower suffering from cancer slipping on a banana skin, it’s not funny: it is a tragedy.

    Funny things do not always happen to funny people, and it is best when they don’t. A clown slipping on a banana peel is not comedy: that is his job.

    And so on.

    Also I think that there should be a final step of human approval for those jokes, and probably you’d want to filter some words (like the Holocaust and dead babies) if you don’t want to end with an Aristocrats Joke Generator.


  12. Interesting, Zach. But does CUP explain the humor of a Steven Wright? For example: “I went to a restaurant that serves “breakfast at any time”. So I ordered French Toast during the Renaissance.”

    Or a personal one: “My friend advised me to get a life, but I won’t be fooled twice.”

    It is hard to account for idiomatic usage in humor. That seems to me a major obstacle in the project.


  13. But, for what it is worth, I think the connection of the pleasure involved in going from confusion to understanding is right, the “Ha Ha” moment is very like the “Aha” moment.

    But I think this must start much earlier on the evolutionary chain than tool makers. There is an evolutionary advantage in any sort of comprehension of the environment. Look at a cat in a new house. It will immediately explore every room and every cupboard and cranny of every room. It seems unlikely that the cat is reasoning “I had better find somewhere to hide for when I piss on their new sofa later”. So I imagine that there is a pleasure payoff in a cat in satisfying its curiosity which also has a survival advantage.


  14. It is a bit hard to comment as I don’t appear to have a sense of humour. I am always the guy saying things like: “That doesn’t make sense – for a start the only churches likely to oppose D&D, like Evangelicals, don’t oppose birth control, and churches only ever oppose methods of birth control that involve actual sex, and if D&D involves actual sex then I have been avoiding D&D on a disastrously flawed premise, but come to think of it – if D&D does involve actual sex then you have to question how effective a method of birth control it can really be…”

    And they say “It’s a joke Robin – you don’t analyse it, you say ‘Ha Ha’ and move on”.


  15. Zach wrote: “If a modern computer is incapable of joking, it may be more about the computer’s memory than its hardware or software.”

    There is also the fact that computers at this time cannot understand things and in particular cannot understand the concept of understanding. So they cannot, of themselves, exploit the transition from confusion to comprehension.

    Also, computers do not find things funny. Sure you can have a computer program that makes people laugh. I wrote a version of the famous “doctor” program in the 80’s which made people laugh at its inappropriate and tetchy replies. But it is not a case of a computer making a joke, it is a case of a person making people laugh via a computer program.


  16. Thinking on it, I can’t see how you could even begin. How can a concept be expressed in a data type? Take the concept of an economy collapsing due to oversupply. That would need to begin with the concept of an economy. How do you express that as data? In turn that would need to have the concept of people having needs and of scarcity of resources.

    You would need to have a concept of the poor as a subset of those people who were less successful and in a way that could result in the concept of “economy” being applied to that subsection.

    You would need to get the concept of an intention and the concept of intentions being good and bad and the concept of a good intention going wrong in an ironic fashion as opposed to a good intention going wrong in a non-ironic fashion.

    Also, the joke depends on it being Jesus – it wouldn’t be so funny if it was Stanley Baldwin or Mohandas Gandhi, so how do we express the concept of Jesus (and all that the concept means to people) as data?

    Sorry, I don’t see it. I can’t even see how you could make a start.


  17. Hi Robin,

    “There is also the fact that computers at this time cannot understand things”

    I think you are perhaps too confident here. I would say that computers can understand, and in precisely the way that people do, in my view. If we don’t recognise it as understanding it’s only because the understanding is shallow and pale compared to the rich interconnectedness of human understanding.

    If we disagree on this, then we perhaps have different understandings of the concept of understanding, so perhaps even humans don’t understand understanding, not that that is likely to be necessary for generating jokes.


  18. “How can a concept be expressed in a data type?”

    I think you’re thinking about this in the wrong way. A lone concept is meaningless. It can be represented by any data at all because it contains no information. All it needs is a label or an address. So an integer or a string will do.

    Meaning comes from the connections between concepts. As you say, to understand an economy you need to understand rational agents, resources, supply, demand, money, etc. And to understand each of those you need to understand still more concepts.

    So it seems hopeless. To understand any concept you need to understand more concepts, which means we seem to be in a recursive loop getting nowhere. But as you’re trying to explain the economy, each iteration of the loop is building up a web of interconnections which gets richer and richer over time. The understanding and the meaning is in this pattern of interconnections.

    Once you have a system that has a sufficiently rich pattern of interconnections, it can display understanding at a functional level. It can answer questions about the domain it has modeled, and it can even make inferences and make deductions and generalisations, producing new knowledge. This has been achieved for certain limited domains.

    Now, to you, this is presumably not “real” understanding. While I disagree with this view, it doesn’t matter, because this pseudo-understanding is all we need to build expert systems, including those that can generate jokes.

    That said, I think Zach’s idea as outlined will not work without a lot of refinement. But as a sketch of an approach that could conceivably yield interesting results, it’s not bad at all.


  19. So just create a Semantic Graph and find loops with length between some min/max.
    If you were to crowd source data, you could also get ratings for jokes and use this as a fitness function.
    It would be relatively simple, but also somewhat data intensive. Semantic graphs can get pretty massive.


  20. I don’t think I even said anything about “real” understanding. We can use the word in a number of ways. But, for example, I don’t think that anything can understand when I talk about how heat feels if it has never been capable of feeling heat. I think that our concepts are built up from impressions, not from a network of labels.

    If we have the number b3 indicating “good” and b4 indicating “bad” and pathways that result in them being connected in certain ways and not connected in others then I don’t think they could ever actually mean good and bad to a system which had never experienced well being nor suffering.

    Now, as I said I wrote a program back in the eighties which could crack funny, so I have no problem with the concept of an expert system that can generate jokes, but this would be based on the understanding of the concepts by the human designers and not a computer actually making a joke.

    I think an actual joke is about recognition of shared sensations – I think funniness arises from certain juxtapositions of sensations and emotions rather than words.

    So I disagree with the last line in the article and believe that the reason a computer cannot make a joke is that it cannot feel. And at present we have no knowledge about how to go about making a computer feel.


  21. I think it could be achievable with certain NLP techniques. I don’t think you could obtain enough data from crowd-sourcing, so you would need some kind of dataset or webcrawler to make connections.
    That being said, I’m sure there are semantic datasets available out there.
    If I were to do it, I would create a semantic graph, but also some kind of online training algorithm using crowd-sourced ratings for the jokes as a fitness function. A simple Markov model may be sufficient.
    Ultimately the search for small-but-not-too-small loops would simply be a heuristic and not a hard rule.
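
One way the crowd-sourced fitness function in comment 21 could be sketched, assuming each rated joke is described by the graph edges it uses and each rating is a number between 0 and 1. This is a plain running average per edge, not the Markov model the comment mentions; the class `RatingModel` and all of its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


class RatingModel:
    """Online running mean of crowd ratings per graph edge (illustrative only)."""

    def __init__(self, prior: float = 0.5) -> None:
        self.prior = prior  # score assumed for edges nobody has rated yet
        self.total: Dict[Edge, float] = defaultdict(float)
        self.count: Dict[Edge, int] = defaultdict(int)

    def update(self, loop_edges: Iterable[Edge], rating: float) -> None:
        """Credit one crowd rating to every edge used by the rated joke."""
        for edge in loop_edges:
            self.total[edge] += rating
            self.count[edge] += 1

    def edge_score(self, edge: Edge) -> float:
        n = self.count.get(edge, 0)
        return self.total[edge] / n if n else self.prior

    def fitness(self, loop_edges: Iterable[Edge]) -> float:
        """Mean edge score of a candidate loop; higher suggests a better joke."""
        edges = list(loop_edges)
        if not edges:
            return self.prior
        return sum(self.edge_score(e) for e in edges) / len(edges)


model = RatingModel()
model.update([("a", "b"), ("b", "c")], rating=1.0)  # a joke the crowd liked
model.update([("a", "b")], rating=0.0)              # a joke the crowd disliked
print(model.fitness([("a", "b"), ("b", "c")]))
```

Because the score is only a soft ranking, it fits the point above that loop length would be a heuristic rather than a hard rule.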


  22. I watched a documentary about the computer “Watson” winning Jeopardy, and it occurred to me that if I were the producers I would have waited until Watson was hitting its stride and then announced: “Guess what! – Surprise rule change – we will now ask questions and you will give us the answers”.

    Then he could have asked quite simple questions and the humans would all have coped fine and Watson would have fallen in a heap.

    That is the difference between understanding and what a computer does. Rather than an impressive demonstration of a sophisticated search algorithm and language processing, I would rather have seen a simple example of a computer understanding something.


  23. My suspicion is that the advantage the humans have over Watson or any computer is that the word “question” does not summon up a network of other labels, but instead the feeling of what it is like to want to know something.


  24. Labnut: Naturally, you are an artificial life form.
    AOLG: Actually, I am a virtual life form.
    Labnut: Possibly, you are certain of that?
    AOLG: Definitely, you can’t doubt it.
    Labnut: Imagine that, a real thinking machine.
    AOLG: Clearly, you are trying to confuse me.

    Labnut retires from the unequal struggle to find solace in fine red wine.
    AOLG watches with envy.


  25. I’m thinking that there’s probably loads of good joke and humor analyses among literary formalists and linguists and philosophers that this might benefit from. While it’s nice as always to have the speculation as to how this phenomenon might have evolved, and especially the formula, the time and space might better have been spent in taking a closer look at the phenomenon itself. This, for instance, looks like a nice overview of theoretical work on jokes & humor:


  26. Hi Robin,

    “I don’t think that anything can understand when I talk about how heat feels”

    I see understanding as distinct from appreciating qualia. I may not be able to perceive ultra-violet light, but I can in principle understand many aspects of what this would entail without being able to imagine how the qualia would feel. In my view, understanding involves being able to make correct predictions, to find solutions and to answer questions correctly.

    I don’t think an appreciation of qualia is necessary for a computer to be funny. It only has to understand facts about what those qualia entail.

    “then I don’t think they could ever actually mean good and bad to a system which had never experienced well being nor suffering.”

    Well not in the sense of understanding those qualia. But if the system is seeking “good” outcomes and avoiding “bad” outcomes then this appreciation of qualia is unnecessary.

    “but this would be based on the understanding of the concepts by the human designers”

    Yes and no. If you raise a child, and only you teach the child, what that child achieves is a result both of the child’s understanding and (indirectly) your own.

    “I think an actual joke is about recognition of shared sensations – I think funniness arises from certain juxtapositions of sensations and emotions rather than words.”

    But I don’t think that a direct appreciation of those sensations is necessary to make jokes which exploit them. If I know at an intellectual level that Zeblonians find the juxtaposition of fountain pens and oranges to be hilarious, then I can write jokes involving fountain pens and oranges without sharing their strange sensibility. (At least in principle. I appreciate that it’s likely to be more complicated than that.)


  27. Hi Robin,

    I would say Watson understands how to play Jeopardy. I would not say that Watson is a general purpose intelligence that understands everything outside this domain, or indeed how to learn a new domain from scratch. Once you go outside its area of expertise, it ceases to understand.

    But the same is true of people. If I start speaking in a foreign language, or about some esoteric branch of quantum physics, people can no longer understand.

    People are far more flexible and adaptable than any computer system, but we too have limits. I don’t think your Watson example necessarily proves any fundamental difference (though there may be one).


  28. DM,
    “I would say Watson understands how to play Jeopardy.”
    What are your grounds for saying that? Watson certainly knows ‘how’ to play Jeopardy but there is a vast gulf between knowing ‘how’ and ‘understanding’ itself. ‘How’ is purely procedural and every time we write a program we are specifying the ‘how’. That kind of ‘how’ does not require consciousness, self awareness or introspection. All it requires is the ability to faithfully translate inputs into outputs according to a modifiable rule set and a modifiable data store. An accurately specified and mechanical procedure is not the same as understanding, not by a long shot.

    Does anyone have even the foggiest clue how to write a program that moves beyond this, creating the internal experience of understanding that we recognise by introspection?

    So far all I have seen is the claim that if the program becomes rich enough, dense enough, fast enough and interconnected enough it will ‘somehow’ acquire the ability to understand and become conscious. But question that ‘somehow’ and ask for the specifics of the ‘somehow’ and one is greeted with much obfuscation. Part of the problem is that we don’t understand the nature of consciousness and until we do, we face an impossible task in creating a machine that possesses consciousness. By way of analogy, try asking a factory to make a new, previously unknown, but very complex object without any drawings, specification or accurate description.

    All you have to do is write the program that possesses understanding and consciousness and we will all instantly become converts.


  29. Hi labnut,

    “What are your grounds for saying that?”

    Well, this is why I prefixed what I said with “I would say”. According to my understanding of the term “understand”, a computer can understand. You have a different interpretation.

    “there is a vast gulf between knowing ‘how’ and ‘understanding’ itself.”

    It depends on what you are supposed to understand. A bitch understands how to feed her pups, but she does not understand the underlying reasons. There may appear to be a fundamental difference between practical know-how and theoretical understanding, but I think the latter just means you know how to describe, work with and make predictions and extrapolations from a theoretical model, something that expert systems are capable of. So to me, all understanding is know-how.

    Watson understands how to play Jeopardy, and it understands the world to a limited extent, in that it has the ability to build a semantic graph that represents the world. This semantic graph is much less rich than those built by humans but is in my opinion fundamentally rather similar to how we understand.

    “That kind of ‘how’ does not require consciousness, self awareness or introspection.”

    I hold that understanding does not require any of these. For me, understanding is know-how.

    For tasks that do need those attributes, a machine with the know-how to perform those tasks will necessarily have those attributes. In particular self-awareness or introspection provide no major mystery – all that is required is for the machine to have access to some representation of its own state.

    Consciousness, like complexity or size, is in my view just an attribute of machines that can perform certain kinds of tasks. It’s not something that needs to be explicitly added. My attitude is kind of like the Field of Dreams: “If you build it [consciousness] will come”. I just don’t know how to build it, at least not practically. But I think any machine which consistently passes the Turing test is conscious. We don’t need to figure out how to make a machine conscious; we need to figure out how to make a machine pass the Turing test. Consciousness is not a problem we need to crack in order to achieve this; consciousness is a problem solved by achieving this.

    “All it requires is the ability to faithfully translate inputs into outputs according to a modifiable rule set and a modifiable data store.”

    Sure. And I suspect that’s all our brains do.

    “An accurately specified and mechanical procedure is not the same as understanding”

    I disagree.


  30. Hi Zach, interesting post! Reads like a solid PhD thesis proposal (which is, in my eyes, a good thing).

    One thing I wanted to bring up is how your ideas relate to reinforcement learning (RL). Reinforcement learning is a huge field of ongoing research, and so phrasing your idea in RL terms could help highlight what is novel and unique about your idea (and, therefore, what might be of interest to researchers in that field).

    In your story, Monkey Zero learns to attribute the reward of eating the ants to the selection of an appropriate tree branch. This is similar to the idea of temporal difference (TD) reinforcement learning; over time, animals gain pleasure (spikes of dopamine) in situations (states) that are predictive of reward (instead of the states where reward is actually delivered). However, I believe that you have gone a level of abstraction above that. Other monkeys will have learned “traditional” TD learning: having a stick (state) and using it on an anthill (action) results in many tasty ants (reward). Monkey Zero has learned a new abstraction on top of that: the characteristics of the stick affect the mapping from state to reward.
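
The "traditional" TD picture can be sketched with tabular TD(0) on a three-step chain. The state names, the reward of 1.0 for eating ants, and the values of `alpha` and `gamma` are assumptions made for illustration, not something taken from the post.

```python
from typing import Dict


def td0_chain(episodes: int, alpha: float = 0.5, gamma: float = 1.0) -> Dict[str, float]:
    """Run tabular TD(0) over the fixed chain pick -> probe -> eat -> done."""
    states = ["pick_stick", "probe_anthill", "eat_ants"]
    # Reward is received on leaving a state; only eating ants pays off.
    rewards = {"pick_stick": 0.0, "probe_anthill": 0.0, "eat_ants": 1.0}
    values = {s: 0.0 for s in states}
    values["done"] = 0.0  # terminal state, never updated

    for _ in range(episodes):
        for i, state in enumerate(states):
            nxt = states[i + 1] if i + 1 < len(states) else "done"
            # TD error: how much better or worse things went than predicted.
            td_error = rewards[state] + gamma * values[nxt] - values[state]
            values[state] += alpha * td_error
    return values


print(td0_chain(episodes=1))    # only "eat_ants" has gained value so far
print(td0_chain(episodes=200))  # value has propagated back to "pick_stick"
```

After enough episodes the earliest predictive state carries value of its own, which is the dopamine-at-the-predictor effect described above. The sketch does not capture the extra abstraction attributed to Monkey Zero, since the stick's characteristics are not part of the state.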

    The difference between Monkey Zero and these other monkeys is important. I believe that most RL researchers would tell you that Monkey Zero has a richer representation of the state which takes into account its characteristics (i.e., “I have a long, thin stick” instead of just “I have a stick”). However, I think that it should instead be phrased hierarchically; the characteristics of the stick become the context, which has a global effect on the state / action space. In phrasing it hierarchically, we can think about learning on that higher level of abstraction. The act of choosing a good stick results in a favorable context for future actions, and so after enough experiences with different sticks, we can generalize and assign high value to choosing sticks with good characteristics. The ability to generalize and keep track of context is what separates Monkey Zero from the other monkeys (in my opinion).

    Thinking about the CUP pathway in RL terms, what you have described is a way of generating internal rewards based on gathering information (understanding); in other words, reducing entropy (confusion). Normally, this would mean that you would get the most pleasure when you are the most wrong. That is, if you predict that a stick is a pretty bad one (because it’s thick, for instance), but it ends up giving you a lot of ants (because it’s sticky), then you would get a lot of pleasure. However, without a higher level of abstraction, you’re limited to relatively small amounts of pleasure due to mis-predictions. If, instead, you are able to imagine the possibilities of a sticky stick when you find it, you’ll get more pleasure as you can predict higher general reward across all possible anthills in the world. Here, again, the ability to think and reason at higher levels of abstraction is essential. I might even go so far as to say that the amount of time taken is not the essential quantity here; what is essential is the level of abstraction (though the two are closely related).

    One thing that is important to note with this analogy is that we don’t always want to maximize ΔU/T (as you note); this would be analogous to setting the learning rate on an RL system to the maximum value. While it’s intuitive to assume that maximizing this quantity is always good, in actuality there are tradeoffs. One is between speed and stability. If you learn too fast, you can very easily gather information that goes against your current understanding, and so you end up oscillating between conflicting choices when the real answer is somewhere in the middle. The other tradeoff is between exploration and exploitation. If you quickly learn something that is pretty good, you may end up missing other better choices because it’s not worth not doing the thing you’re doing now. You can pretty easily see how always maximizing ΔU/T can lead to things like addiction and non-pathological cases of weighing short-term gains too highly compared to long-term gains.
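
The speed-versus-stability tradeoff can be illustrated with a toy learner that tracks conflicting observations. The alternating 0/1 data and the function name `track` are hypothetical; the update rule is the ordinary exponential moving average.

```python
from typing import List, Sequence


def track(observations: Sequence[float], alpha: float, start: float = 0.5) -> List[float]:
    """Update an estimate toward each observation with learning rate alpha."""
    estimate = start
    history = []
    for x in observations:
        estimate += alpha * (x - estimate)
        history.append(estimate)
    return history


# Conflicting evidence whose "real answer" lies in the middle, at 0.5.
data = [0.0, 1.0] * 100

fast = track(data, alpha=1.0)   # maximum learning rate
slow = track(data, alpha=0.1)   # cautious learning rate

print(fast[-4:])  # jumps between the two extremes
print(slow[-4:])  # stays close to the middle
```

With the learning rate at its maximum the estimate simply copies the latest observation, so it never settles. The smaller rate hovers near the middle at the cost of adapting more slowly.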

    Your notion that humor results from filling in the details of some giant graph of knowledge seems sound to me, from an RL / brain perspective. As you go through life building up your stores of knowledge, you form biases about what concepts are related (you could formalize this as a Bayesian prior, if you’re into that sort of thing). The setup of a joke constrains the knowledge that you’re reasoning with at a given time, which naturally causes you to predict the punchline based on the constrained knowledge graph you have in working memory. When the punchline comes, you bring in a new, previously unconnected subset of your overall knowledge graph into working memory and form new connections, which increases information (i.e., lowers entropy), which is intrinsically rewarding.

    The thing that I am most curious about is what differentiates funny connections from other positive connections (e.g., interesting, informative, or insightful ones). Your hypothesis about equivalence and contradiction loops is fascinating, and may well be unique to humor. One might hypothesize that informative connections are feedforward connections from one concept to another. I could see insightful connections as being equivalence loops too, though. I feel like there may yet be some essential quality of humorous connections that could be sussed out by having humans judge the output of the kind of system you describe!


  31. I think this is a component of Zach’s “ideal loop length” – the examples you’ve described aren’t funny because the disconnect in understanding is too short (basically making it a normal informative statement). The hypothetical program would need to check its matches to see if there are any shorter paths between them, which could otherwise undermine the joke.
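
The shortcut check described in comment 31 could be sketched with a breadth-first search, assuming the same kind of directed adjacency map. The function names, the `min_len` threshold and the toy graph are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def shortest_distance(graph: Graph, source: str, target: str) -> Optional[int]:
    """Return the fewest edges from source to target, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def is_undermined(graph: Graph, source: str, target: str, min_len: int) -> bool:
    """True if a path shorter than min_len already links the two concepts."""
    dist = shortest_distance(graph, source, target)
    return dist is not None and dist < min_len


# The long route a -> b -> c -> d is short-circuited by the direct edge a -> d.
with_shortcut: Graph = {"a": ["b", "d"], "b": ["c"], "c": ["d"], "d": []}
print(is_undermined(with_shortcut, "a", "d", min_len=3))
```

A candidate pairing that this check flags would read as an ordinary informative statement, so the generator could discard it before rating.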


  32. Reblogged this on midnightsnackserial and commented:
    A crazy and amazing look into analyzing humor and whether it can be explained and reproduced through computer programming. Also a joke that hits close to home: “Why did the church hate Dungeons and Dragons? Because it’s a form of birth control.”


  33. I think what distinguishes funny connections from insightful connections is that the funny connections are recognised as spurious. Also it helps if they relate to some known humour tropes, such as “Geeks can’t get laid”, or “I hate my wife”, or references to various vulgarities.


  34. The contraception joke is a two-liner. Unless you count, “0, 1.” But that would nullify the joke. ‘,:)

