Editor’s note: Tony Siciliano is managing director of International Interviewing, a White Plains, N.Y., research firm.

Having spent a lot of time in international research, I’m well aware of the problems in achieving comparable cross-cultural test results. The problems are particularly severe with attitude scales, and understandably so. The positive scale skewing in Latin American countries is legendary. A product or ad would have to be absolutely awful to get below a 60 percent "Top-Two Box" score on a five-point scale. It’s not that Latin Americans are dishonest - it’s just that they don’t want to hurt anyone’s feelings.
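For readers unfamiliar with the measure, a "Top-Two Box" score is simply the share of respondents who choose one of the two most favorable points on the scale. The sketch below is a minimal illustration, assuming a five-point scale coded 1 to 5 with 5 as the most favorable answer; the function name and the sample ratings are hypothetical and are not data from any study mentioned here.

```python
def top_two_box(ratings, scale_max=5):
    """Return the share of ratings that fall in the two highest scale points.

    Assumes ratings are integers from 1 to scale_max, with scale_max
    being the most favorable answer (an assumption for this sketch).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    top = sum(1 for r in ratings if r >= scale_max - 1)
    return top / len(ratings)


# Hypothetical ratings from ten respondents (illustration only).
sample_ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
score = top_two_box(sample_ratings)
print(f"Top-Two Box: {score:.0%}")
```

With positively skewed samples, most answers land on 4 or 5, which is why scores on this measure can stay high even for a weak product or ad.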

It took a number of years of living in and traveling to France before I could emotionally accept that pas mal ("not bad") was almost the equivalent of an American "extraordinary." So, "somewhat interested in buying" to a Frenchman would be a much stronger commitment than to an Englishman, since "somewhat" has a less enthusiastic connotation to the English.

The five-point purchase intent scale is probably the most-used market research scale, both domestically and internationally. I’ve always suspected that this scale was only valid when there were substantial differences between test variables - and that it lacked precision when there were subtle (but possibly significant) differences. I had an ideal opportunity to test this theory when conducting a product test for the leading chewing gum in France.

The brand had close to a 70 percent share, and a pricing decision had to be made when the cost of chicle rose significantly. The client was sure that if his price went up, the competition would not raise theirs, because holding prices would afford them an excellent opportunity to erode the brand’s enormous market share. Rather than raise the price, the client decided to reduce the standard 11-stick pack to 10 sticks.

A product test was conducted with two cells: 11-stick pack at the current price and 10 sticks also at the current price. I convinced the client to include a "partial-payment coupon" measurement as the very last question. (With the partial-payment coupon, respondents are told they can select a coupon worth one-third the purchase price for any brand of gum. This technique simulates the actual buying experience, because when respondents make their choice, it’s with the knowledge that they will also have to lay out some of their own money. [It’s not necessary to have "coupons." After respondents make their choice, they are given cash.])

The rationale was that it could do nothing to bias the standard results and might uncover a problem not detected by the standard questioning. As it turned out, neither the five-point purchase intent nor any other standard measurement revealed there would be a problem in reducing the number of sticks and keeping the current price. The partial-payment coupon, however, revealed this would be a dangerous move.

Of course, the drawback of this technique is that it can be used only to measure purchase intent, and only with reasonably priced products.

Magnitude estimation

My first research exposure to magnitude estimation was through an international project I coordinated for Ambrosino Research, Inc., White Plains, N.Y. Of all the scaling techniques I’ve had experience with, this appears to be the most valid because it’s both logical and realistic. And flexibility is yet another advantage, particularly in multinational research.

The underlying principle of magnitude estimation is that respondents create their own scale parameters. This can be one to 10, one to 50, one to 100, or one to "whatever." Of course, this requires a certain amount of respondent training. But the actionable results emanating from this innovative technique are worth the additional effort.
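Because each respondent works on a self-made scale, the raw numbers have to be put on a common footing before they can be pooled. The article does not say how that was done in the studies described here. One common approach in magnitude estimation work, assumed for this sketch, is to divide each respondent's ratings by that respondent's own geometric mean, so that only the ratios between items remain. The function name, item names, and ratings below are hypothetical.

```python
import math


def normalize_by_geometric_mean(ratings):
    """Rescale one respondent's ratings by their own geometric mean.

    `ratings` maps item name -> positive magnitude estimate on whatever
    scale the respondent invented. This is an assumed normalization,
    not necessarily the one used in the studies described above.
    """
    values = list(ratings.values())
    if not values:
        raise ValueError("ratings must not be empty")
    if any(v <= 0 for v in values):
        raise ValueError("magnitude estimates must be positive")
    log_mean = sum(math.log(v) for v in values) / len(values)
    geo_mean = math.exp(log_mean)
    return {item: v / geo_mean for item, v in ratings.items()}


# Two hypothetical respondents using very different personal scales.
respondent_a = {"recipe_1": 2, "recipe_2": 4, "recipe_3": 8}
respondent_b = {"recipe_1": 25, "recipe_2": 50, "recipe_3": 100}

norm_a = normalize_by_geometric_mean(respondent_a)
norm_b = normalize_by_geometric_mean(respondent_b)
print(norm_a)
print(norm_b)
```

In this made-up case both respondents judge each recipe to be twice as good as the one before it, so after rescaling their profiles coincide even though one used single digits and the other went up to 100.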

In the training session, respondents are given an explanation of the validity of creating one’s own scale. Anecdotes illustrating the use of magnitude estimation are very helpful in creating an understanding of the logic behind it. Respondents are then given a trial run by using this scaling procedure with a product totally different from the test product (e.g., if the test product is a food that’s tasted, the trial run product would be a detergent that’s sniffed).

Consider the dilemma researchers face when asked to develop an ideal attribute profile for a product or service category. The dilemma is not in determining which attributes make up the ideal but in knowing the relative importance of each attribute. Standard scaling techniques usually blur that all-important difference in degree. Magnitude estimation brings precision in discerning differences, if, in fact, there are real differences.

As I mentioned above, I used magnitude estimation when I coordinated a large-scale taste test for a line of processed food products in Canada and Puerto Rico. This was an excellent multinational proving ground for this technique since three distinct cultures were represented in the total sample - French-Canadian, Anglo-Saxon Canadian, and Puerto Rican.

There was the French/Anglo-Saxon problem of equating their differing "somewhat" assessments, as well as the Latin American positive skew. The marketing question was therefore compounded by a cultural diversity problem.

The client had several geographically dispersed plants producing its line of food products, and each plant had its own recipe to reflect the tastes of its region. It was decided that a standardized recipe for all these plants would provide greater flexibility in the production and distribution of the food lines. The question that research had to answer was: Would standardized recipes be favorably received across all geographic areas?

We were able to demonstrate unequivocally that standardized recipes would be acceptable across all regions. And what proved to be particularly interesting was that there were no consistent regional preferences for the products made in the consumer’s home region. In fact, their "home" product often lost to another region’s product.

Level the field

Cultural differences were neutralized because no ethnic sample was burdened with the semantic biases of the other two. Magnitude estimation was able to level the scaling playing field for these three culturally diverse samples. I will continue to recommend use of this innovative scaling technique in all multinational research assignments.