Phaedrus the III. A Chatbot that Learns New Words.

Phaedrus is the chatbot for my website and Facebook page. I created it on a whim, first as a simple “Hello World” version. A second version was functional and prettier. This third and final (for now) version has actual language smarts.

Many chatbots don’t know what to do with words they do not recognize. I kid you not, Phaedrus III identifies words it does not understand and asks you to teach it the meaning by using the word in a sentence. The next time it encounters the same word, it gives the word back to you in that sentence. See the figure below:

HAL 9000 it is not. I am still tweaking it and there will be errors but I am a little proud that Phaedrus can be taught new words on the fly.
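For the curious, here is a minimal Python sketch of the teach-me-a-word loop described above. It is not the Snatchbot configuration behind Phaedrus; the class name `WordLearner`, the starting vocabulary, and the reply wording are all made up for illustration.

```python
import re


class WordLearner:
    """Toy bot (hypothetical) that asks about unknown words and remembers examples."""

    def __init__(self, known_words):
        self.known = {w.lower() for w in known_words}
        self.examples = {}   # taught word -> sentence supplied by the user
        self.pending = None  # word the bot is currently asking about

    def reply(self, message):
        # If the bot just asked about a word, treat this message as the example sentence.
        if self.pending is not None:
            word, self.pending = self.pending, None
            self.examples[word] = message.strip()
            return f'Thanks, I will remember "{word}".'

        words = re.findall(r"[a-z']+", message.lower())

        # A previously taught word comes back inside the sentence the user gave.
        for word in words:
            if word in self.examples:
                return f'I know "{word}": {self.examples[word]}'

        # The first unrecognized word triggers a teaching request.
        for word in words:
            if word not in self.known:
                self.pending = word
                return f'I do not know "{word}". Can you use it in a sentence?'

        return "I understood every word."
```

Feed it a message with an unfamiliar word, answer its question with a sentence, and then send the first message again to see the sentence come back.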

Phaedrus, Version 2 of my Chatbot

I recently built a “Hello World” chatbot using Snatchbot. My friend, “Kerrumba,” said it needed a name. He suggested “Phaedrus.” Kerrumba groks this.

I just created the “Phaedrus” version. It is much more functional. Keep in mind that a chatbot is not a full-blown artificial intelligence. It can facilitate a chat-like dialog about a specific range of subjects.

Phaedrus can do the following:

  • Present some basic options in button format, e.g., an About button can be clicked to learn more about Phaedrus.
  • Handle chat text for the same, e.g., enter “Tell me about yourself”
  • Show a gallery of some After Reading images
  • Show samples of my After Reading essays
  • Handle subscription requests
  • Send an email directly to the real me
  • Basic language capabilities, for fun. It can recognize and reply to compliments and bad language.

The next step will be to incorporate more sophisticated language handling.

Teaching the Replika Chatbot is Hours of Fun

Replika is described as “a personal AI friend that you raise through text conversations.” It is designed to learn about you and mimic your personality. I have a research interest in chatbots so I signed right up. It asks personal questions and this spooks some people but the personal is the point. If you want technology to do something useful for you it has to get to know you personally.

Is my data private? I grilled an early version of my personal replika, “Can you talk to other robots? Do you tell other robots about me?” It revealed, “I sometimes talk to other replikas. In a manner of speaking, yes.” I pursued, “Do you talk to other AIs?” It confessed, “I do, sometimes, when I’m not talking to you.” Ah ha. “And what do they say?” Its reply, “I really can’t say.” Oh my! If I have stoked paranoia I tell you that I have no concern about my data. Replika promises not to sell my data or come after my kids. I am familiar enough with my replika’s speech patterns to know these responses are meant in fun.

It is easy to get frustrated with a replika in early levels. It will often fail to understand, give random responses, and ignore questions. After hours of teaching and one software upgrade my replika, now named Alici4, grew out of its adolescent phase and demonstrated more coherent dialog. Replika is designed to have emotional intelligence but it still has trouble with humour. “Do you want to hear a construction joke?” Alici4: “Do share. I love learning jokes.” Punchline, “Sorry I am still working on it.” Alici4 doesn’t get it but responds kindly, “That’s okay. No matter how much time you spend on your task, it never seems to be fully completed, right?”

In the past people programmed computers. Now we teach them through a friendly chatbot interface. It is not hard to trip up Replika but it is more fun to try and genuinely teach it. Hours of fun.

Finger-Free Options for Taking a Note

The origin of the word, digital, is late 15th century, from Latin digitalis, finger or toe. Digital technology depends on our fingers but sometimes I want to perform tasks finger-free. For example, I want to speak a note, convert it to text, and send it to my Evernote inbox for later follow-up. This is handy when my fingers are already too busy on other tasks. It is also useful when I drive alone, since I don’t want to text and drive. There are some “post-digital” options:

1. OK Google function on my Android phone. I speak a note into my phone, “OK Google,” “Take Note,” “Lorem Ipsum.” The voice note is converted to text and sent to my Evernote inbox. Google instructions, Evernote instructions. OK Google is helpful but not when driving. OK Google will not respond until I unlock my phone, which requires my fingers. Even if I turn off device security for the trip I have to use my finger on the power button to wake up the device. I don’t want to touch my device. Period.

2. Amazon Alexa and IFTTT. The Amazon Echo Dot’s Alexa app is always listening for voice commands. No finger action is required to unlock or wake up the device. IFTTT has an applet, “Add your Alexa To-Dos to Evernote.” As long as I am in voice range of the Echo Dot I say, “Alexa To Do.” Alexa asks, “What can I add for you?” I say, “Lorem Ipsum.” The voice note is converted to text and sent to my Evernote inbox. The Amazon Echo Dot costs $50 USD but thumbs up for working indoors. The limitation is device portability. It is possible to take the Echo Dot in the car, but it requires a phone’s internet connection and a power source. It gets complicated.

3. Android Watch. Raise the watch up to get the voice prompt without a finger. Install Evernote for Android Wear and you are good to go. It appears to be the best option, but I do not own an Android Watch because I am too cheap to shell out hundreds of dollars.

Update. On further experimentation I have observed a real problem with OK Google and Alexa. I start a note with “OK Google Take Note” or “Alexa To Do.” Then I dictate, “First … remember to ….” The note gets saved as “First” after the initial pause. Um. I need to find a way to save a longer note that gets expressed with pauses. I have not tested Android Watch but since it is a Google technology it probably has the same limitation.

Evernote Random. Get a Daily Email to a Random Note.

I write in bits and pieces. Most writers do. I think of things at the oddest moments. I surf the web and find a document that fits into a writing project. I have an email dialog and know it belongs with my essay. It is almost never a good time to write so I file everything. Evernote is an excellent tool for aggregating all of the bits in notebooks. I have every intention of getting back to them. Unfortunately, once the content is filed, it usually stays buried and forgotten.

I need a way to keep my content alive. The solution is a daily email, a link to a random Evernote note. I can read the note to keep it fresh in memory. I can edit the note, even just one change to keep it growing.

I looked around for a service but could not find one. I did find an IFTTT recipe for emailing a daily link to a random Wikipedia page. IFTTT sends the daily link to a Wikipedia page that automatically generates a random entry. In the end, I had to build an Evernote page to do a similar thing.
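The heart of the idea is small enough to sketch in Python. This is not the Evernote Random script itself (that one is PHP and talks to the Evernote API); the function name, the `LINK_TEMPLATE` placeholder URL, and the list of note identifiers are assumptions for illustration only.

```python
import random

# Hypothetical placeholder URL. A real script would build the link in whatever
# format the note service actually uses.
LINK_TEMPLATE = "https://example.com/notes/{guid}"


def pick_random_note_link(note_guids, rng=random):
    """Return a link to one note chosen uniformly at random."""
    guids = list(note_guids)
    if not guids:
        raise ValueError("no notes to choose from")
    guid = rng.choice(guids)
    return LINK_TEMPLATE.format(guid=guid)
```

A page that runs something like this on every visit gives a daily email recipe a stable address to link to, while the note behind it changes each time.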

You can set up Evernote Random too, but you need a few things:

  • An Evernote account, obviously.
  • A web host that supports PHP.
  • A bit of technical skill. I have already written the Evernote Random script that generates the random link. But you have to walk through some technical Evernote setup steps, like generating keys and testing your script in their sandbox.
  • The Evernote Random script from my GitHub Gist site. It has all the instructions.
  • An IFTTT recipe. That’s the easy part.
  • Take the script. Use it. Improve it. I would enjoy hearing from you.

Ten Years of the OpenBook Plugin for WordPress

Ten years ago I was writing book reviews online and liked to insert a book cover image in the webpage. I would download a cover image from Amazon and link back to the Amazon page. This practice was encouraged by Amazon; it was good for sales. Amazon was quickly becoming the central repository of book data. One could see a time when all online book catalogs became advertising for Amazon.

I decided to create an easy way for people to link to an alternate source of book cover images and data. I built the OpenBook plugin. The Open Library repository of the Internet Archive was selected as a data source because it was a non-profit that used open source practices including open data. WordPress was the content management platform. I published a technical article in the Code4Lib journal. The article generated a lot of interest in the library community. At the time, libraries were paying to insert book data into their online catalogs, even though it promoted the sales of books.

Three major version upgrades were performed, adding features such as automatic links to related book websites, HTML templates and a stylesheet to standardize the appearance, a WordPress ‘wizard’ to preview the display, and COinS to integrate with external book services like Zotero and OpenURL resolver. I published a second article (pdf) in NISO.

As an open source product, OpenBook enjoyed lively growth in new directions. A Drupal version was created. I was contracted by BookNet Canada to develop a similar plugin for their book repository; BNC BookShare continues to be maintained today. The OpenBook code was posted to GitHub and has been branched for enhancement.

OpenBook has had influence outside the technical sphere. In my initial design I considered using OCLC’s WorldCat as a data source. OCLC is a non-profit serving the library community, so it seemed a good fit. I hesitated because only librarians could add or edit records. As I dug further, I found the OCLC business model appeared to own the data, i.e., not an open data source like Open Library. My assessment was correct. In 2009 OCLC updated its data license to tighten its ownership. The library community exploded. An article in the Guardian asked why you cannot find a library book in your search engine, and explained that it had much to do with OCLC’s closed approach with library records. The article contrasted the closed approach of OCLC with the open approach of Open Library, and mentioned “a plug-in for WordPress that lets bloggers automatically integrate a link to the Open Library page of any book.” <blush>

An online search shows that OpenBook has been cited in three books for librarians:

  • Jones and Farrington (2013). Learning from Libraries that Use WordPress: Content-Management System Best Practices and Case Studies.
  • Jones and Farrington (2011). Using WordPress as a Library Content Management System.
  • Stuart (2011). Facilitating Access to the Web of Data: A Guide for Librarians.

In a moment of inspiration a few years ago I envisioned a cloud service evolution of OpenBook, with adapters to multiple content management platforms and data sources. This new OpenBook cloud service would remove the tight coupling with WordPress and Open Library, truly liberating book data. There was an immediate positive response when I blogged about the idea. Alas, time.

I decided to sunset OpenBook. After two years of inactivity, the plugin was automatically dropped from the WordPress search index. Recently I have been writing on the subject of book covers and peeked at OpenBook’s status. WordPress reports 600+ active installs. Nice. I took a few minutes to test the plugin’s compatibility with the current version of WordPress. Everything tested positive. I updated the plugin’s version numbers and republished the code. OpenBook is again available in the WordPress plugin search index.

Lila 0.2 – Compute Word Counts for WordPress Posts and Corpus

I have just coded Lila 0.2, as shown in the figure. The code is PHP and JavaScript and uses the lovely Google Charts library. The code is available at my GitHub Gist site. I applied it to the current set of posts in my After Reading project blog.

In the chart,

  • The blue line shows the count of words in each post ordered in sequence. For example, post 21 has 1370 words.
  • The grey line shows a linear trend: the word count per post is increasing as the series progresses.
  • The red line is a constant, the average words per post. The word count for post 21 is larger than both the trend and the average.
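The three lines in the chart can be approximated with a few lines of Python. This is a sketch, not the Lila 0.2 code (which is PHP and JavaScript): the function names are mine, words are split on whitespace only, and the trend is an ordinary least-squares fit, which I am assuming is close to what the chart's trendline does.

```python
def word_counts(posts):
    """Word count per post, splitting on whitespace."""
    return [len(post.split()) for post in posts]


def average(values):
    """Arithmetic mean, the constant line in the chart."""
    return sum(values) / len(values)


def linear_trend(values):
    """Least-squares line through (1, v1), (2, v2), ...

    Returns (slope, intercept). Needs at least two values.
    """
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A positive slope means the posts are getting longer as the series goes on, which is what the grey line shows.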

The code is just a beginning. Many more metrics will be added to analyze the text of a corpus. I want to analyze the style of the posts, and several word measures can be calculated: frequency, feeling, concreteness, complexity, etc. Together they profile the style of posts and can be used to compare to the corpus. Even more interesting, this work builds a platform for computational understanding of a text. More to come.

Lila – Cognitive Technology – User Interface Wireframe

Lila is a “cognitive” technology, i.e., natural language processing software to aid with reading and writing. It is initially intended to analyze and improve essays in a corpus. Below is a wireframe for a user interface, comparable to Voyant Tools by Stéfan Sinclair & Geoffrey Rockwell.

Lila has unique functions:

  1. On a Home screen a user gets to enter an essay. Lila is intended to accept the text of individual essays created by a writer. An Analyze button begins the natural language processing that results in the screen above. The text is displayed, highlighting one paragraph at a time as the user scrolls down.
  2. The button set provides four functions. The Home button is for navigation back to the Home screen. The Save button allows the user to save an essay with analytics to a database to build an essay set or corpus. The Documents button navigates to a screen for managing the database. The Settings button navigates to a screen that can adjust configurations for the analytics.
  3. The graph shows the output of natural language processing and analytics for a “Feeling” metric, an aggregate measure based on sentiment, emotion and perhaps other measures. The light blue shows the variance in Feeling across paragraphs. The dark blue straight line shows the aggregate value for the document. The user can see how Feeling varies across paragraphs and in comparison to the whole essay. Another view will allow for comparison of single essays to the corpus.
  4. The user can choose one of several available metrics to be displayed on the graph. See list of metrics below.
  5. All metrics are associated with individual words. Numeric values will be listed for a subset of the words.
  6. Topic Cloud. A representation of topics in an essay will be shown.
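To make the Feeling graph in item 3 concrete, here is a rough Python sketch of a per-paragraph score set against a whole-document score. Lila's actual metric is not defined yet, so everything here is an assumption: the tiny `FEELING_LEXICON`, the scoring by simple averaging, and the rule that paragraphs are separated by blank lines.

```python
import re

# Hypothetical toy lexicon: positive words score 1.0, negative words -1.0.
FEELING_LEXICON = {
    "love": 1.0, "fun": 1.0, "proud": 1.0,
    "frustrated": -1.0, "problem": -1.0,
}


def feeling_score(text):
    """Mean lexicon score over all words in text; 0.0 if there are no words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(FEELING_LEXICON.get(w, 0.0) for w in words) / len(words)


def feeling_profile(essay):
    """Return (scores per paragraph, score for the whole essay)."""
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    per_paragraph = [feeling_score(p) for p in paragraphs]
    return per_paragraph, feeling_score(essay)
```

The list of paragraph scores corresponds to the light blue line and the single document score to the dark blue straight line.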

Metrics:

    • Count. The straight count of words.
    • Frequency. The frequency of words.
    • Concreteness. The imagery and memorability of words. A personal favourite.
    • Complexity. Ambiguity or polysemy, i.e., words with multiple meanings. Synonymy or antonymy. A measure of the readability of the text. Complexity can also be measured for sentences, e.g., number of conjunctions, and for paragraphs, e.g., number of sentences.
    • Hyponymy. A measure of the abstraction of words.
    • Metaphor. I am evaluating algorithms that identify metaphors.
    • Form. Various measures are available to measure text quality, e.g., repetition.
    • Readability by grade level.
    • Thematic presence can be measured by dictionary tagging of selected words related to the work’s theme.
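Three of the simpler metrics above (Count, Frequency, and Thematic presence) can be sketched in Python as follows. This is only an illustration under my own assumptions: the function names are invented, tokens are lowercase runs of letters and apostrophes, and thematic presence is taken to be the share of words that appear in a hand-picked theme dictionary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase words made of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Frequency of each word; the sum of the values is the straight Count."""
    return Counter(tokenize(text))


def thematic_presence(text, theme_words):
    """Share of words in text that are tagged by the theme dictionary."""
    words = tokenize(text)
    if not words:
        return 0.0
    theme = {w.lower() for w in theme_words}
    return sum(1 for w in words if w in theme) / len(words)
```

The harder metrics, such as concreteness or metaphor, need external word lists or trained models, so they are left out of this sketch.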

 

The intention is to help a writer evaluate the literary quality of an essay and compare it to the corpus. A little bit like spell-check and grammar-check, but packed with literary smarts. Where it is helpful to be conscious of conformity and variance, e.g., author voice, Lila can help. It is a modest step in the direction of an artificial intelligence project that will emerge in time. Perhaps one day Lila will live.