Lila 0.2 – Compute Word Counts for WordPress Posts and Corpus

I have just coded Lila 0.2, as shown in the figure. The code is PHP and JavaScript and uses the lovely Google Charts library. The code is available at my GitHub Gist site. I applied it to the current set of posts in my After Reading project blog.

In the chart,

  • The blue line shows the count of words in each post ordered in sequence. For example, post 21 has 1370 words.
  • The grey line shows a linear trend: the word count per post is increasing as the series progresses.
  • The red line is a constant, the average words per post. The word count for post 21 is larger than both the trend and the average.
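
To make the three lines concrete, here is a minimal JavaScript sketch of how the values could be computed. It is not the Lila code itself. The function names, the sample posts, and the choice of an ordinary least-squares fit for the trend are all assumptions made for illustration, and it assumes the post text has already been stripped of HTML.

```javascript
// Count words in a block of plain text by splitting on whitespace.
// Assumption: HTML has already been removed from the post body.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Arithmetic mean of an array of numbers (the constant "average" line).
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Ordinary least-squares fit of word count against post number (1..n).
// Assumption: the trend line is a simple linear regression.
function linearTrend(counts) {
  const xs = counts.map((_, i) => i + 1);
  const xMean = mean(xs);
  const yMean = mean(counts);
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < counts.length; i++) {
    sxy += (xs[i] - xMean) * (counts[i] - yMean);
    sxx += (xs[i] - xMean) ** 2;
  }
  const slope = sxx === 0 ? 0 : sxy / sxx;
  return { slope, intercept: yMean - slope * xMean };
}

// Hypothetical sample posts, in publication order.
const posts = [
  "one two three",
  "one two three four five",
  "one two three four five six seven",
];

const counts = posts.map(countWords);
const average = mean(counts);
const trend = linearTrend(counts);

// One row per post: [post number, word count, trend value, average].
const rows = counts.map((count, i) => [
  i + 1,
  count,
  trend.intercept + trend.slope * (i + 1),
  average,
]);

console.log(rows);
```

Each row holds one point for each of the three series, which is roughly the shape a line chart's data table expects. A post sits above both the trend and the average when its second value exceeds the third and fourth.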

The code is just a beginning. Many more metrics will be added to analyze the text of a corpus. I want to analyze the style of the posts, and several word measures can be calculated: frequency, feeling, concreteness, complexity, etc. Together they profile the style of each post, which can then be compared to the corpus as a whole. Even more interesting, this builds a platform for computational understanding of a text. More to come.
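
As a taste of the simplest of these measures, word frequency, here is a small JavaScript sketch. It is an illustration only and not part of Lila 0.2. The function names and sample text are hypothetical, and the tokenizer assumes lowercase ASCII words, which would need extending for other scripts.

```javascript
// Build a map of word -> number of occurrences in a text.
// Assumption: words are runs of ASCII letters and apostrophes.
function wordFrequencies(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  const freq = new Map();
  for (const word of words) {
    freq.set(word, (freq.get(word) || 0) + 1);
  }
  return freq;
}

// Return the n most frequent words as [word, count] pairs, highest first.
function topWords(freq, n) {
  return [...freq.entries()].sort((a, b) => b[1] - a[1]).slice(0, n);
}

// Hypothetical sample post.
const samplePost = "The cat sat on the mat";
const postFreq = wordFrequencies(samplePost);

console.log(topWords(postFreq, 3));
```

Running the same function over the text of every post combined would give a corpus-level profile to compare each post against.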