10 Text Mining Ideas

I’ve been reading the book Thinkertoys by Michael Michalko, a collection of creativity and innovation tools to use when generating ideas.  One of the hundreds of techniques the author offers for finding new ideas is to mine your junk email.  He suggests that by scanning your junk mail, you can get a sense of what is hot or emerging.  It’s an interesting idea that I tested by putting about 100 emails into a free text visualization tool called Wordle, which creates simple word clouds based on each word’s frequency (thanks to my colleague Amaresh Tripathy for the tip).  Honestly, the results were less than inspiring, but the exercise did start me thinking about what else could be learned from this simple word analysis - a basic (and free) form of text mining.
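
For anyone who would rather see the counts than the cloud, here is a minimal Python sketch of the word-frequency idea that a word cloud is built on.  It is not Wordle’s actual implementation; the function name word_frequencies and the sample text are made up for illustration.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=10):
    """Return the top_n most common words in text, ignoring case."""
    # Lowercase the text and keep runs of letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Hypothetical junk-mail text, invented for this example.
sample = "Free offer! Claim your free prize now. Offer ends soon, free shipping."
print(word_frequencies(sample, top_n=2))
```

A word cloud then simply draws each word at a size proportional to its count.  This sketch does no filtering, so on real email very common words such as "the" and "and" would likely dominate until they are excluded.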

Mining IT Leaders’ Search Terms

The basic capabilities of WordPress combined with hundreds of free plugins make it a killer platform for writing a blog.  One set of features is tracking and reporting.  I thought it would be interesting to see what kinds of things people searched for that directed them here, to the CIO Dashboard blog.  Not only would it satisfy a general curiosity, it could also be a good way to learn what people want to read more about.  Here’s the first pass at feeding Wordle the top search terms for the last 3 months (just cutting and pasting from WordPress’ stats into Wordle):

I guess it’s no surprise that CIO and Dashboard are the two most frequently searched terms that land people here.  I did learn something though - that priorities are pretty important.  Hmmmm.  For a second pass, I removed all of the dashboard-related terms and ran it again.  Here is the result:

Additional interesting terms emerge, such as blueprint, lean and service.  I haven’t spent much time at all on any of these topics but maybe I will now.
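
The two-pass approach (count everything, then drop the dominant terms and count again) can be sketched in a few lines of Python.  This is only an illustration: the search phrases below are invented, not the blog’s real stats, and term_frequencies is a hypothetical helper name.

```python
from collections import Counter

def term_frequencies(search_terms, exclude=()):
    """Count words across search phrases, skipping any excluded words."""
    excluded = {word.lower() for word in exclude}
    counts = Counter()
    for phrase in search_terms:
        for word in phrase.lower().split():
            if word not in excluded:
                counts[word] += 1
    return counts

# Invented search phrases, standing in for a paste from the blog stats page.
terms = [
    "cio dashboard",
    "cio priorities",
    "it dashboard examples",
    "lean it",
    "service blueprint",
    "cio dashboard priorities",
]

first_pass = term_frequencies(terms)
second_pass = term_frequencies(terms, exclude={"cio", "dashboard", "dashboards"})

print(first_pass.most_common(3))
print(second_pass.most_common(3))
```

Removing the obvious terms lets the less frequent ones show up, which is the same effect as deleting them by hand before pasting into Wordle.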

10 Text Mining Ideas

Here’s some simple brainstorming to get you started in applying this kind of thinking to your business.  See if there are some piles of text lying around that could use some more analysis, and drop them into Wordle to see if you can spot trends or hot spots:

  1. Call center CSR note fields
  2. Notes and/or special instructions in online order forms
  3. Emails to and from a particular vendor
  4. Emails between internal helpdesk and customers
  5. Unsolicited emails from vendors
  6. Project status reports - one project, an entire program, all projects
  7. Blogs - internal blogs, blogs on a certain theme
  8. Articles from industry/functional web sites - on a certain topic or topics
  9. System or application error logs
  10. Bug and/or enhancement lists

Obviously this is a very simple approach and has some holes - words vs phrases (which Wordle can handle by separating words with a ~), synonyms, etc. - but it’s simple and free.  Some basic prototyping might help build a more sophisticated business case for deeper and more visual text mining.  Let me know what you think.
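
The phrase workaround mentioned above (joining the words of a phrase with a ~) can be automated before pasting text into Wordle.  Here is a minimal Python sketch, assuming you already have a list of phrases you care about; join_phrases and the sample phrases are hypothetical names chosen for illustration.

```python
import re

def join_phrases(text, phrases):
    """Replace each known phrase in text with a tilde-joined version."""
    # Handle longer phrases first so "call center notes" wins over "call center".
    for phrase in sorted(phrases, key=len, reverse=True):
        parts = phrase.split()
        joined = "~".join(parts)
        # Match the phrase's words separated by any whitespace, ignoring case.
        pattern = r"\b" + r"\s+".join(re.escape(p) for p in parts) + r"\b"
        text = re.sub(pattern, lambda match, j=joined: j, text, flags=re.IGNORECASE)
    return text

# Invented example note, not taken from real call center data.
note = "The call center logged a Call  Center issue about the order form"
print(join_phrases(note, ["call center", "order form"]))
```

Each matched phrase is replaced with the spelling given in the phrase list, so differently capitalized variants end up counted as the same term.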

  • Very interesting thoughts, especially around the use of call center notes, etc. to help generate content for blogs and other marketing material. One (obvious!) limitation to Wordle is that it scans only for frequency and cannot perform the critical human function of consolidating and linking different thoughts into trends or insights. However, it can through visualization help you make those critical connections.

    One lower-tech flavor of data mining is to ask anyone coming in from a trade show or customer call “What’s new?” If what they tell you over the cube partition (“Man, it seems like people are optimistic again” or “Boy, the latest upgrade to competitor XYZ’s product is crashing like hell”) is interesting to you, it probably will be to your customers. Don’t let that information hang there - blog on it and promote it.

    Very interesting ideas, and thanks for getting this discussion going.

    Bob

  • Chris -

    How about we scrape all the text from the CIO and CTO Dashboards and jam that into Wordle? Then, we can cross reference by industry. That might be an interesting way to see what topics are currently top-of-mind for other IT leaders …

    Thoughts?

    LG

  • Good stuff. I can’t wait until we have collaborative social media for knowledge. Something to scrape the external landscape for CXM and IT issues.

    Thanks for the post.

    Chris
