How to Use Microsoft Excel to Analyze Free Text Surveys, Feedback and Complaints

text-analytics-using-excel

Using Microsoft excel to analyze numerical or well-categorized data is relatively straightforward. It might not be easy necessarily, but at least you normally know exactly what to do. If you have ever been faced with open-ended text responses, perhaps from a survey, emailed questions or feedback forms, you know how tricky it can be to make sense of it.

The problems are many. Non-standard formatting, having to manually read each response to understand its content, variable length, and those are just the first that come to mind.

What we need is some way to drill down automatically to see if there are any common patterns, and therefore have an immediate starting point to start interpreting the responses.

A Potential Solution

Say we had a spreadsheet containing all the website feedback from the last few weeks.

What if we could count occurrences of certain words or phrases? We could then see if there were issues most prevalent on the minds of our customers.

First of all we need to make a list of all the words contained in the feedback. We can do this using a hash table or “Dictionary”. This is like an array, but instead of the items in the array being held against a number they instead are indexed using a keyword. Using this system we can store a word, such as the word “Data”, and then count how many times this word occurs.

By “Splitting” the text in the cells wherever a space is found we can crudely grab all the words used. Obviously this is not a pure approach; there will be badly formatted responses or other delimiters that would need to be cleaned. For our purposes though it will work well enough.

First Results

As you can see, while this approach is encouraging in that it worked as intended, the results are not yet useful. We need to exclude superfluous words and get to the “meat”. The traditional approach to this kind of indexing issue is to compile a “stop list” of words and phrases that can be safely ignored.

Armed with this list we ought to be able to now generate a clean set of data from which we can begin to draw conclusions.

Full Solution

So we have learned that for a full solution we need to create a macro that goes over the response text in our first data sheet and it should create a list of words, taking care to ignore words we determine to be irrelevant, listed in a “Stop List” sheet. Then for each word our macro needs to count how many times each word appears. This word list and associated word count should appear in another sheet we will call the “Issue List”. That list should be sorted in descending order so we can clearly see which issues are most important to our customers.

Here is what our data set looked like when we ran the macro against a set of feedback. We had to run it a couple of times to build up our specific stop list. I imagine this will be a common theme when you try it yourself.

Probably not surprising that in this particular result we see Data and Excel score highly, but I am intrigued by the issues around the Security responses.

The Finished Word Counting Macro

Here is the completed macro so you can use it in your own projects.

Sub WordCount()

    Dim vArray As Variant
    Dim lngLoop, lngLastRow As Long
    Dim rngCell, rngStoplist As Range

    With CreateObject("Scripting.Dictionary") 'declaring object array, Items can be any form of data, and are stored in this object array.
        Worksheets(1).Activate
        For Each rngCell In Worksheets("Sheet1").Range("A1", Cells(Rows.Count, "A").End(xlUp))
            vArray = Split(rngCell.Value, " ")
            For lngLoop = LBound(vArray) To UBound(vArray)

            If Application.WorksheetFunction.CountIf(Sheets("Stoplist").Range("A1:A" & Sheets("Stoplist").UsedRange.Rows.Count), vArray(lngLoop)) > 0 Then
            Else
                If Not .exists(vArray(lngLoop)) Then 'if the word is not in the array then its true.
                    .Add vArray(lngLoop), 1 'adds the word to object array.
                Else
                    .Item(vArray(lngLoop)) = .Item(vArray(lngLoop)) + 1 'if the item or the word already exist in the array it will just increase the keys which is the number of word count.
                End If
            End If
            Next lngLoop
        Next rngCell

            Worksheets("Issue").Range("A2").Resize(.Count).Value = Application.Transpose(.keys) 'this line shows the retrieval of the word count stored in the array.
            Worksheets("Issue").Range("B2").Resize(.Count).Value = Application.Transpose(.items) 'this line shows the retrieval of the word added in the array.

    End With

lngLastRow = Worksheets("Issue").Cells(Rows.Count, 1).End(xlUp).Row
Range("A1:B" & lngLastRow & "").Select
    ActiveWorkbook.Worksheets("Issue").Sort.SortFields.Clear
    ActiveWorkbook.Worksheets("Issue").Sort.SortFields.Add Key:=Range("B2:B" & lngLastRow & ""), _
        SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
    With ActiveWorkbook.Worksheets("Issue").Sort
        .SetRange Range("A1:B" & lngLastRow & "")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
Worksheets("Issue").Activate
Cells(2, 1).Select
End Sub

Above, we explored how we might find the customer service issues hidden within the masses of text we receive on a daily basis through an online feedback form. Below, we will take a look at what we could do next once we know what the main customer issues are.

 Acting on the Data

When we start to see trends forming that is when we can build categories and decide a system of actions for those categories. For instance, we might push certain new messages to our support departments to be expedited automatically, while other less pressing issues can sit in a queue for a little longer.

When we have a ‘security’ issue come through we want to send them quickly to be handled by our ‘security specialist’. If a message contains a request that deal with writing Macros we would want one of the developers to help with that.

Does this sound like something you would like to use?

Processing the Results

What we need to do is have a button that launches our macro (download link) and processes the feedback messages.

When the button is hit we will check for various categories (on a second sheet) and list these categories in the next column:

The Macro

Sub WordCount()

    Dim vArray, WordIssue, ElementCounter As Variant
    Dim lngLoop, lngLastRow As Long
    Dim rngCell, rngStoplist As Range

    ElementCounter = 2 'setting a default value for the counter
    Worksheets(1).Activate
    For Each rngCell In Worksheets("Word").Range("A3", Cells(Rows.Count, "A").End(xlUp))
        vArray = Split(rngCell.Value, " ") 'spliting the value when there is a space
        vrWordIssue = ""
        ElementCounter = ElementCounter + 1 'increases the counter every loop
        For lngLoop = LBound(vArray) To UBound(vArray)

            If Application.WorksheetFunction.CountIf(Sheets("Issue").Range("A2:A" & Sheets("Issue").UsedRange.Rows.Count), vArray(lngLoop)) > 0 Then
            'this is to test if the word exist in the Issue Sheet.
                If vrWordIssue = "" Then
                    vrWordIssue = vArray(lngLoop) 'assigning the word
                Else
                    If InStr(1, vrWordIssue, vArray(lngLoop)) = 0 Then 'a binary of comparison
                        vrWordIssue = vrWordIssue & ", " & vArray(lngLoop) 'this will concatinate words issue that exist in Issue Sheet
                    End If
                End If
            End If

        Next lngLoop

        Worksheets("Word").Range("B" & ElementCounter).Value = vrWordIssue 'entering the final word issue list into cell.
    Next rngCell

End Sub

How it Works

The macro starts by creating a word list for each cell in the list of feedback messages by splitting the cell contents whenever there is a space. This creates an array in memory that we can loop through to test if the word exist in the “Issue Sheet”.

If we find a word that we want to check for, we assign the word to another list for later insertion into the final issue column.

Summary

As you can see, the output of this is we take unstructured data and produce something we can actually work with. That’s huge. This technique can be used for much more than the simple example we demonstrated here but I hope it gives you some inspiration for how you can process the text in your work.

Author: Yoav Ezer          Source: Windows Guides      Picture: Udemy.com

 

Profile Status
ACTIVE
Profile Info
 

Kalyan Banga205 Posts

I am Kalyan Banga, a Post Graduate in Business Analytics from Indian Institute of Management (IIM) Calcutta, a premier management institute, ranked best B-School in Asia in FT Masters management global rankings. I have spent 6 years in field of Analytics.

1 Comment

  • Vedic Astrology Framework & Contemporary Data Science Paradigm - Fusion Analytics World Reply

    November 6, 2016 at 4:10 am

    […] This survey will briefly describe the framework of Vedic astrology using terminologies and examples that are easier to understand in our modern ears. Most of the seminal texts are written in Sanskrit and one of the difficulties is that most of the current practitioners have never studied the original books in Sanskrit and thus they rely on second, third or even fourth hand translations or commentaries! Another problem is their communication aspect of their framework and a non-exposed person will easily conclude that it is some voodoo mumbo-jumbo. […]

     

Leave a Comment

18 + nine =