Crossroads

While most of my experience rests in the qualitative realm, I’m always looking for opportunities to gather statistically significant data. One such opportunity arose recently, when I served as the sole researcher on an extensive series of card sorts involving over 1,000 users.


Project Initiation

I was approached by members of an internal team who were leading the migration of 186 articles from a child site to our main company website. While all of these articles would live within a single dedicated page under our established Main Navigation (Main Nav), the Information Architecture within that page proved to be a point of contention among the team.

Before introducing you to the team’s differing opinions, I think it’s crucial to understand the content at the center of this debate.

The purpose of this work was to identify a menu structure for a new addition to our main website. The content being merged came from a blog related to the company, and its casual, entertaining style didn’t immediately fit within the menu structure we currently hosted. Our main site focuses on the sale of insurance products, and thus features a Lines of Business (LOB) menu, with Life, Home, Auto, Retirement, and other types of insurance. The articles to be transferred over would be hosted on a sub-page, but the question remained – would users think of this distinct style of content within the established LOB mindset of the larger website, or would they categorize the articles in a more personal way?

Initial Card Sorting

Products:

Feedback from 100+ Users

Card Sort Dendrogram

The first data I received from the team was a list of all 186 articles being brought over. These had been placed into two groups – the first consisted of articles the team felt could fit into an LOB menu structure, and the second of articles the team felt could not. There were dissenting opinions on the team, and these categories were a good representation of that; 109 articles fell into the LOB category, while 77 did not. Some members of the team argued this would result in a Miscellaneous category so large that an LOB structure would be ineffective for this content.

To test this claim, I first wanted to ensure that users felt the LOB articles actually belonged in an LOB menu. In hindsight, I could have improved this initial testing phase by also including the Misc. cards, to get a better picture of the entire menu structure, but it still provided some good insights.

This first test, focusing on the LOB cards, involved over 100 users and was an open card sort. I opted for open sorts, as I did for the majority of the tests, to avoid suggesting categories that could influence users’ feedback. I found that users created LOB menus with these cards, including distinct sub-categories with high agreement, such as Home Maintenance under Home. The categories most frequently created were Retirement, Life, Home, Auto, and a Resources/Misc. category. Even though these cards excluded the set of articles the team had designated as Misc., 18% of users still created that category. However, it was relatively small, averaging 5 cards.

When presenting this data, I showed the team the dendrogram that resulted from the card sort. This was another area I could have improved on, because the vast majority of the team had never seen a dendrogram before; they had to learn a new data visualization and interpret the results at the same time. There was a significant amount of confusion during this part of the presentation, and I ended up doing my best to explain the findings without the aid of the dendrogram.

By the end of the presentation, it was clear this was just the beginning of this project. The team wanted more tests run, including the Misc. category, and wanted more clarity on the sub-categories mentioned before. More importantly, some team members continued to question if LOB was the correct menu choice and wondered if the context of the site visit would change this. For example, if a user was a current insurance customer, would they organize the menu differently than if they were a non-customer who landed on the page because of a Google search? I began drafting more tests to answer these questions.

Condensed Card Sort

Products:

A and B Versions of a Card Sort (120 Cards)

Feedback from 200+ Users

2 Card Sort Dendrograms

I needed to establish how users organized the entire menu structure, but this introduced another obstacle. My typical card sorts don’t exceed 30 cards, yet to show a single user the actual menu structure, they would need to sort 186 cards. Even in a moderated card sort with handpicked users and generous compensation, this would be an arduous task affected by user fatigue. These tests were run on UserZoom using their user panel, so while we had some control over screeners, we weren’t able to increase our compensation or moderate the sessions. I have also seen user fatigue in card sorts before, and the data it produces: a few large categories that have clearly been thrown together nearly at random. My plan for combating both the size of the sort and the risk of user fatigue was two-fold.

First, I tested the menu structure with a condensed card sort, merging an equal number of cards from both the LOB and non-LOB categories. An example of the condensing of cards:

  • 7 Safety Tips to Conquer Driving in Rain

  • 9 Tips for Driving in the Snow

Becomes

  • Safe Driving Tips

With 33 cards condensed from each side, it was important to keep the numbers equal: if we condensed more from the LOB side, users might see a disproportionately large number of non-LOB cards and change their organization accordingly.

This resulted in 120 cards total – still far larger than I wanted, but as condensed as I could get it while still accurately representing the real menu structure. Users were told multiple times before the actual card sort that this would be a long task, and we thanked them for their patience. Even so, we saw significant user drop-off, but eventually reached our user goal.

I also had to account for the issue of user context mentioned earlier. To do so, this test, like the ones that followed it, had both an A and B version. The only difference was the wording on the screen directly before the test, which set the context the user would rely on. In version A, I introduced the site as the website of their current insurer, which they had reached through the main parent site. In version B, I introduced it as the website of an insurance company they did not belong to, which they had found through a Google search.

Divided Card Sorts

Products:

A and B Versions of 3 Open Card Sorts (62 Cards)

Feedback from 600+ Users

6 Card Sort Dendrograms

The second type of test was another open card sort, but this time I wanted to ensure the actual wording of the articles was shown to users. In the condensed card sort, many cards were renamed so they could be grouped together. While in theory this wouldn’t affect results, I wanted to account for the possibility that, in condensing cards, I had removed some of the reaction the real names would elicit.

Running both an A and a B version of each test made it easy to show data directly side by side, including but not limited to the creation rate of a Misc. category, the number of cards in that category, and the main categories created at a set agreement level.

Agreement level was another issue to tackle when presenting this information. Of course, if something has 100% agreement, it sounds like an ideal result. I didn’t want to present the raw dendrogram like last time, so I had to contextualize the idea of agreement for people uninvolved with UX research. To do this, I made mockups of the existing company website with a main nav representing both 0% and 100% agreement. This helped give a visual example of why neither extreme was a good solution, and why we would have to rely on something that balanced the two.
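The agreement figures in this study came from the card-sort tool itself, but the underlying idea can be illustrated with a toy calculation. Below is a minimal sketch of one common pairwise measure – the share of participants who placed two cards in the same group. The card names and sort data here are invented for illustration and are not from the actual study.

```python
from itertools import combinations

# Hypothetical sorts: each participant's grouping of four article cards.
sorts = [
    {"Home": ["Gutter Care", "Roof Tips"], "Auto": ["Safe Driving", "Winter Tires"]},
    {"House": ["Gutter Care", "Roof Tips", "Winter Tires"], "Driving": ["Safe Driving"]},
    {"Maintenance": ["Gutter Care", "Roof Tips"], "Cars": ["Safe Driving", "Winter Tires"]},
]

def agreement(card_a, card_b, sorts):
    """Percent of participants who placed both cards in the same group."""
    together = sum(
        any(card_a in group and card_b in group for group in sort.values())
        for sort in sorts
    )
    return 100 * together / len(sorts)

cards = ["Gutter Care", "Roof Tips", "Safe Driving", "Winter Tires"]
for a, b in combinations(cards, 2):
    print(f"{a} + {b}: {agreement(a, b, sorts):.0f}%")
```

A dendrogram is essentially this pairwise table clustered hierarchically, which is why a single "agreement level" slider moves the whole menu between many tiny categories and a few huge ones.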

First, I randomized all of the articles, mixing the LOB and Misc. categories together. This was an important step because it ensured that when I later divided the list, the resulting groups wouldn’t disproportionately represent either menu structure. The randomized list was then split into thirds, with each third containing 62 cards. These were presented to users in an open card sort that prompted them not only to name their categories, but also to briefly explain why they created each one. While these sorts wouldn’t show us the full menu structure, they could show us snippets of groupings with the actual article names.
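The randomize-and-split step above is simple enough to sketch directly. The article titles here are placeholders, since the real list isn’t reproduced in this write-up:

```python
import random

# Illustrative stand-ins for the 186 article titles.
articles = [f"Article {i}" for i in range(1, 187)]

random.seed(42)           # fixed seed so the split is reproducible
random.shuffle(articles)  # mix LOB and Misc. articles together

# Split the shuffled list into three card sorts of 62 cards each.
thirds = [articles[i:i + 62] for i in range(0, 186, 62)]
```

Shuffling before splitting is what keeps each third from over-representing either the LOB or the Misc. pile.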

Misc. Category Card Sorts

Products:

A and B Versions of a Closed Card Sort

Feedback from 200+ Users

2 Card Sort Dendrograms

At this point in the study, I knew the seven main categories users were creating, regardless of scenario. Now that I knew how to name the categories, I could introduce closed card sorts. The first step was presenting users with a closed card sort of the seven main categories, including a Misc. category, and asking them to sort the cards that members of the team had designated as Misc. This helped me spot any “easy wins” – cards users could easily place into a given category. The cards that related weakly to every category, or related strongly to the Misc. category, were the ones we needed to narrow our focus on.
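One way to operationalize an “easy win” is a simple threshold over placement counts: a card wins if a single non-Misc. category captured most of its placements. The sketch below uses invented card names, counts, and a hypothetical 75% cutoff, none of which come from the actual study.

```python
# Hypothetical placement counts: how many of 100 participants put each
# card into each closed-sort category (three of the seven shown).
placements = {
    "Fire Safety Checklist": {"Home": 84, "Auto": 3, "Life": 5, "Misc.": 8},
    "Best Road Trip Snacks": {"Home": 10, "Auto": 31, "Life": 4, "Misc.": 55},
}

def easy_wins(placements, threshold=0.75):
    """Cards where one non-Misc. category captured >= `threshold` of placements."""
    wins = {}
    for card, counts in placements.items():
        total = sum(counts.values())
        category, best = max(counts.items(), key=lambda kv: kv[1])
        if best / total >= threshold and category != "Misc.":
            wins[card] = category
    return wins

print(easy_wins(placements))  # only the checklist clears the bar
```

Everything that fails this test becomes a candidate for the follow-up “remainder” sort.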

To do this, I introduced another closed card sort with the same big seven categories, presenting users with the “remainder” cards, which hadn’t been placed consistently in any category. This was the last test I conducted, and its results would show which cards truly belonged in the Misc. category. Because our system doesn’t allow users to place a card in two or more groups, we also had to consider something we had been getting limited feedback about before: some cards end up in Misc. not because they relate to no category, but because they relate strongly to two or more.
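That last distinction – no strong home versus several strong homes – can also be read off the placement counts. A rough sketch, using invented counts and a hypothetical 35% cutoff: if two or more categories each capture a large share of a card’s placements, the card is split rather than homeless.

```python
# Hypothetical closed-sort counts for one "remainder" card:
# 100 participants each placed the card in exactly one category.
counts = {"Home": 44, "Auto": 41, "Life": 6, "Misc.": 9}

def split_affinity(counts, threshold=0.35):
    """Categories that each captured a large share of a card's placements.

    Two or more hits suggests the card lands in Misc. because it relates
    strongly to several categories, not because it fits none of them.
    """
    total = sum(counts.values())
    return [c for c, n in counts.items() if n / total >= threshold]

print(split_affinity(counts))  # → ['Home', 'Auto']
```

A card returning an empty list truly fits nowhere; a card returning two or more categories is a victim of the one-group-per-card constraint.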

This project was an excellent opportunity to begin practicing my skills in the quantitative realm, and I was able to pair statistically significant data with thoughtful user responses that gave the study qualitative relevance as well. It also underscored an important lesson: research is only part of the role of a UX Researcher. Working with your team, hearing all opinions, and at times even mediating conversations is just as important as the data you end up presenting. Just as presenting incomplete data or asking leading questions would have reflected poorly on me as a researcher, so would mishandling these disagreements. Hearing out both sides of the team, even when I didn’t agree with them, was crucial to becoming a trusted and respected part of the team.