Evaluation knives and bias

Some of the Bladeforums folks have received knives from makers and/or companies for the purpose of testing them out to find their strengths and weaknesses. A lot of the passarounds, in the passaround forum and in private, have been conducted at great expense to various makers and manufacturers and that's been pretty cool for all involved, in my opinion.

STeven Garsson has recently asked the decent question of whether or not receiving free knives affects the objectivity of the person performing the evaluation. Ruining Gunmike1's thread to do so wasn't decent, but good questions are good questions.

Whether free, retail, or discount, there's always going to be a selection bias involved. It could be visceral reaction to the knife's appearance; it could be because it's from a trusted maker; or it could be from the materials used.

In my own case, I was recently biased towards the looks, construction and features of Kershaw's ZT0500 before receiving one and a ZT0200 in a pass-around. Don't get me wrong, I'm still floored by that knife. But using/holding/carrying the ZT0200, which I never would have chosen on my own (Liner lock? Planters Peanut guy in a ninja-suit handle shape? Recurve?), made me favor it much more than the ZT0500. I had a similar situation (not involving passaround knives) in that my ZDP-189 Spyderco Caly3, reground thinner than sin by me, kept binding in double-thickness cardboard (lawnmower box) whereas my Kershaw JYDII in 13C26 with a thicker grind (also reground, but thicker than the ZDP Caly3 thingy) flew through it with ease. I would've thought that a thicker edge would've meant the need to use more force (like it does in regular thickness cardboard and whittling hardwood), but less was needed to make the cuts. I'll still pick the Caly3 for edging the lawn, though.

Another problem with trying to be objective is in, well, trying. The evaluator may have found his or her perfect knife for performing certain tasks, but feel like a total fanboy if the review doesn't list enough negatives, and so add downsides which weren't really there.

What are the thoughts of the other knife-review junkies?
 
Everyone and all reviews are biased. To me the problem is when people try to pretend they are robotically objective. I do think it is OK as long as you don't hide any facts and honestly report your opinions. Yes, even if all your opinions are positive and none negative. There is really no need to balance them if that is how a person really feels.
 
Reasonable people will always be reluctant to fault a blade either when they made the purchase choice and already voted with their cash, or someone gave it to them to test with high hopes. The objectivity can be read in the way the results are presented, so it's best to ask questions and read between the lines. Faint praise for a high dollar knife is an example. By comparison, I think many of the legitimate reviews are balanced. Compare them to most reviews in the gun mags which are appallingly shameless.

"Good for the money" is not a valid conclusion in a test, since cost is a matter for the consumer to decide. The only useful factors are performance, appearance, and service. Even in the "Overpriced" thread, no one can put their definitive finger on it. Value is way too subjective in a test unless unexpected total failure renders the thing worthless.

As an aside, the gun mags usually excuse it by saying "I called the factory and they said 'the new ones have been fixed':rolleyes:" or blame it on baggage handlers. :) Regards, ss.
 
I have read a lot of the reviews and have written a couple of assessments about knives that I have purchased.

My opinion is that bias such as STeven postulates may allow the evaluator to gloss over minor defects or minor personal preference issues. My further opinion is that such bias does not override major defects or major issues.

As db stated, all reviews are somewhat biased because so much about the balance and feel of a knife is specific to an individual. Likewise no two people use a knife exactly the same way. When I read two vastly different evaluations of a knife or a steel, I do not assume one is lying. I assume the usage is different, or the edges are different, or there was a QA issue on a specific knife, or they are using different evaluation criteria. Then I try to understand the differences.

Knarf
 
... whether or not receiving free knives affects the objectivity of the person performing the evaluation.

It can; it depends on the person. It is, however, completely trivial to prevent this by any number of means I have detailed in the past. It is easy to ensure you are being objective if you wish to. Furthermore, if you combine the results of 30 trials, even if they are subjective, and you get an average result, then this by definition will be objective, assuming the sample is representative.
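A rough sketch of the averaging point, assuming each trial is an independent score on a 1-10 scale (the scores below are made up purely for illustration, not real test data). It shows only that the uncertainty of the average is smaller than the spread of the individual opinions by a factor of sqrt(n); it says nothing about whether the 30 users are a representative sample.

```python
import math
import statistics

def standard_error(scores):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

# Hypothetical subjective scores (1-10) from 30 users of the same knife.
scores = [7, 8, 6, 7, 9, 7, 6, 8, 7, 7,
          8, 6, 7, 9, 7, 8, 7, 6, 7, 8,
          7, 7, 8, 6, 9, 7, 7, 8, 6, 7]

mean_score = statistics.mean(scores)   # the combined result
spread = statistics.stdev(scores)      # how much individual opinions vary
sem = standard_error(scores)           # how uncertain the average itself is

print(f"mean={mean_score:.2f} spread={spread:.2f} standard error={sem:.2f}")
```

With 30 trials the standard error is the individual spread divided by sqrt(30), so the average is pinned down several times more tightly than any single opinion.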

A more significant problem is whether makers could be introducing a bias by giving a non-representative sample, either by hand picking an above average knife or by intentionally giving a defective steel as a baseline reference for something they wish to promote. This is the main reason it is not done in general. One would assume that at a minimum people would give the knives a glance over to make sure there are no gross problems. However, this is not an impossible problem to solve and quite frankly there are limits to what can be achieved anyway.

Let's assume for example a maker has a new steel that he really wishes to promote and he does something really drastic and provides the user group with his own 154CM blade for reference, which he intentionally heat treats with a blown grain size. Well, there is nothing saying that I will not get another 154CM blade, heat treated by Bos for example, as another reference. Now all the maker has done is show incompetence in heat treating 154CM. I have made it obvious I will be doing all of these things and thus I do not see them as significant issues.

Of course the people using the knives will be comparing them to knives they have bought, owned, borrowed, etc., so this is not a tunnel vision comparison anyway and again this makes the influence of bias from makers very difficult because the other experiences will always be included.

... all reviews are somewhat biased because so much about the balance and feel of a knife is specific to an individual. Likewise no two people use a knife exactly the same way.

You cannot judge data biased in such a manner; a bias is dependent on the conclusions reached. If for example I ask all Bladeforums members a question about knives then this is not a representative sample for all knife users in general, and the data is therefore biased. However, if I am actually interested just in Bladeforums readers then the sample is unbiased and representative.

-Cliff
 
Reasonable people will always be reluctant to fault a blade either when they made the purchase choice and already voted with their cash, or someone gave it to them to test with high hopes. The objectivity can be read in the way the results are presented, so it's best to ask questions and read between the lines. Faint praise for a high dollar knife is an example. By comparison, I think many of the legitimate reviews are balanced. Compare them to most reviews in the gun mags which are appallingly shameless.

"Good for the money" is not a valid conclusion in a test, since cost is a matter for the consumer to decide. The only useful factors are performance, appearance, and service. Even in the "Overpriced" thread, no one can put their definitive finger on it. Value is way too subjective in a test unless unexpected total failure renders the thing worthless.

As an aside, the gun mags usually excuse it by saying "I called the factory and they said 'the new ones have been fixed':rolleyes:" or blame it on baggage handlers. :) Regards, ss.
And yet those of us who have more than one knife tend to "find faults" when we decide to use one more than the others, right?
:D
 
Hi, Thom. IMO bias of some sort and degree is inescapable under informal conditions like we usually see in tests and reviews posted here. However I also think that this is apparent to everyone, and most testers/reviewers give us enough information that we usually can see where bias might be creeping in.

Personally, I like to believe that numerous mistaken opinions I've formed about knives and steels in the past have made me at least somewhat aware of my own tendencies towards bias. This is a primary reason why, when doing edge retention tests, I run a number of different knives at the same time. After you've seen $10 Moras outperform much more expensive blades, and had so-called "premium" steels fall way short of their hype as often as I have, it sobers you up to the fact that if you're not doing objective comparisons, all your work is probably just contributing to your own ignorance. :)

I just wish that the issue of bias could be discussed intelligently without deteriorating into the kind of personal pissing matches we've seen.
 
Discuss the methods independently of the people, talk about what should be done and how, and make this application uniform to everyone. Data bias is in fact no more or less a problem with analog or digital equipment, and science is a lot older than computers. The equipment that you have in your house right now is a lot more advanced than what people were using for experiments not too long ago. It was not that far in the past that people used their pulse as a timekeeper. This in no way prevented the advance of science or made the work more or less biased.

-Cliff
 
STeven Garsson has recently asked the decent question of whether or not receiving free knives affects the objectivity of the person performing the evaluation. Ruining Gunmike1's thread to do so wasn't decent, but good questions are good questions.

Whether free, retail, or discount, there's always going to be a selection bias involved. It could be visceral reaction to the knife's appearance; it could be because it's from a trusted maker; or it could be from the materials used.

1. Cliff Stamp "Cliffjacks" threads frequently, and you don't say dick about that, Thom Brogan, so don't you be getting all high and mighty about my "ruining" Gunmike's thread. Until you Stampies, and you being number one with a bullet, being the "good twin" Thom, show a better effort in trying to reel in Cliff's unsavory manners.....I'll be kicking back 2x.;)

2. I was prone to taking Cliff as presented, without an agenda...with the recent revelations of "freebies"....I am not so sure....I believe that Cliff may have a pro-Busse, and definitely pro-Spyderco agenda.....I could be wrong, but in reading his many posts, this is sort of showing itself as a possibility.

Best Regards,

STeven Garsson
 
jdm- every time I buy a knife it's with the excuse it's the last one I'll ever need. You guys just keep making them better, then "more better". I think someday I'll start a thread about why I'm going back to 440C and just let someone else find the grail. :D ss.
 
My big problem (which, by the way has led me to put some people, or person on ignore) is the rampant misinterpretation of experimental results. These results are commonly used in these forums to make flawed generalizations of entire steels, and companies--even worse they have been presented as facts when in reality they only reflect the properties of the knives "tested".
Unfortunately most people don't take doctoral level seminars on research design, otherwise such assertions would be more widely recognized as the fluff that they are.
 
davemgt,

I hope that most readers here know that what works for one writer may or may not have anything to do with those readers' preferences and needs. Should they choose the authority of one writer over another without critical thinking, then we need people with different viewpoints to become better at writing.
 
My big problem (which, by the way has led me to put some people, or person on ignore) is the rampant misinterpretation of experimental results. These results are commonly used in these forums to make flawed generalizations of entire steels, and companies--even worse they have been presented as facts when in reality they only reflect the properties of the knives "tested".
Unfortunately most people don't take doctoral level seminars on research design, otherwise such assertions would be more widely recognized as the fluff that they are.

Nonsense, that is absurd. No really, Dave, nice post - I just wanted to see if it would give me some feeling of superiority to respond like a jackass. I do disagree with one thing - I don't think it takes a doctorate to tell, & that is why I wonder why so many are apparently buying into the unified cutting theory.

And from a nonscientific basis, one person regularly questions people's motives, honesty, and/or integrity here. No one objects to that, except a very few people. Now STeven does it, and it is now an issue. Thom, you're one of the nicest guys around, but STeven has a good point.

Bias is when you cannot accept any testing unless it uses your model as its basis. Bias is when you take experimental data and think it is good science to only look at one best fit line, based on a theory that you insist is based on fundamental principles, but obviously is not.
 
STeven has a good point.

He makes several good points. I even credit him with the one I stole to make this thread. I just Garssonized the idea until it came out how I wanted.

Dog of War,

I know what you mean, but I've got the $9 and $12 Moras from Frosts and Ericksen, so, like, my frame of reference is skewed. When a $10 knife outcuts a $150 knife, it's a little eye-opening, but when an $11 knife does, too, well, it's too close to truly notice.
 
These results are commonly used in these forums to make flawed generalizations of entire steels, and companies--even worse they have been presented as facts when in reality they only reflect the properties of the knives "tested".

It is never the case (well rarely) that people study populations. If you look for example in the ASM reference works you will see steel charts which are obviously generated from individual samples and extensions are made to the steel in general because it is assumed the bar was representative. The confidence on this is obviously simply the variance in the steel.

If for example you use a knife from Cashen and he has a quality control of 5% (less than 5% are defective, it is in reality probably lower) then you can be confident that a report is representative with 95% probability. Results are commonly published with that confidence, so in general the only argument for being hesitant to draw conclusions would be if you also argue that the QC figure is so much higher on the relevant knives.
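The arithmetic behind that can be sketched as below. This is only an illustration under two assumptions that are mine: that "defective" and "nonrepresentative" mean the same thing, and that separate reports involve independently drawn knives. The function names are hypothetical.

```python
def prob_representative(defect_rate):
    """Chance that one randomly drawn knife is representative."""
    return 1.0 - defect_rate

def prob_all_defective(defect_rate, n_reports):
    """Chance that every one of n independent reports used a defective knife."""
    return defect_rate ** n_reports

defect_rate = 0.05  # the 5% figure used above

single = prob_representative(defect_rate)        # one report, one knife
two_bad = prob_all_defective(defect_rate, 2)     # two reports, both on lemons

print(f"one report representative: {single:.2%}")
print(f"two independent reports both from defective knives: {two_bad:.2%}")
```

The second number is why a second independent report matters so much: under these assumptions the chance that both knives were lemons is the defect rate squared.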

Now with cheap knives people are quick to point out there is a lot of variability. Some Opinels come with no edges, some are sharp, some even have differences in the primary grind. But on the other hand some knives like the Sebenza are known for extreme consistency in fit and finish and thus a random knife would have a very high probability of being representative. Directly, the larger the sample you say is needed, the worse you are saying the QC of the maker/manufacturer is.

Of course the reason that there is so much focus on certain results is simply the lack of comparative data. On Bladeforums, out of every post per day, how many deal with direct performance comparisons of knives? Because this percentage is so small any such data stands out strongly and becomes the focus point of discussion. If you want a more generalized conclusion then start providing data of your own and/or encourage others to do so.

-Cliff
 
1. I am sure the ASM tests a lot more than 1 bar of steel before casting judgment on the whole class. Further, I am sure their tests are much more standardized and objective, done under strict controls, etc.

2. I think Cliff is not using the term QC properly. 5% does not mean that 5% are defective (i.e. a gross issue) but rather deals with variances. Using your number of 5%, if you had a knife that was 5% harder or otherwise with a 5% better heat treat, primary grind 5% thinner, 5% thinner edge, 5% better fit and finish, etc., you may be talking about a different knife than someone else. (Some argue that ergonomics are subjective, others try to quantify them and argue they are objective; I know small changes can hugely affect comfort in use. As to classification, I would say ergonomics, like aesthetics, is subjective in nature, notwithstanding that some aspects can be quantified. What is "beautiful" can be quantified as well, i.e. it highly correlates to symmetry, etc.)

3. Whether a person likes a knife, or whether it performed "good", may not be related to any objective factor, but may be related simply to personal preference issues, or even to image: the user's experience may be related to marketing hype. (My knife cut through that rope and those cardboard boxes really sweet, no wonder it is used by the Navy Seals and Army Rangers.)

4. In general some comparison is helpful to place things in context. While a person can come out and say: "I have used several knives including blah, blah, blah. This Mora 2000 slices through clothes line and beef brisket better than my Gerber Yari, but not as well as my AG Russell Deerhunter", such comparison can also be made by noting the person making the comments, i.e. if Joe Talmadge writes that a knife cuts really well, or OwenM says that a knife can withstand incredible abuse, you don't need direct comparison. The authority and familiarity of the writer is enough. Of course this is easier with smaller forums, just think about this place a few years ago.
 
1. I am sure the ASM tests a lot more than 1 bar of steel before casting judgment on the whole class. Further, I am sure their tests are much more standardized and objective, done under strict controls, etc.

The bars have a specific measured composition; they do not test multiple batches to judge the influence of the spread. I thought this was kind of surprising until I realized just how much work it takes to do even one steel properly. It took Landes four months for his edge stability measurements and even then he is only looking at a very narrow sample.

Basically you do not look at the steel itself, that is a problem many have, you look for patterns in composition and then draw generalities. Landes for example correlated edge stability to aus-grain, carbide volume and hardness. Based on that you can predict that F2 has a much higher edge stability than D7.

I think Cliff is not using the term QC properly.

In management of quality you are free to set your variances as tight or as loose as you want; most will refer to the defect/replacement rate. My point was to make this discussion less idle handwaving: let us set a figure on the number of knives which would be nonrepresentative, and this is the exact percentage that you would use to bound the confidence of your report on the knife.

You can use a sample of one when you know the population variance, or at least make inferences. Unless you think that high end knives suffer from large percentages of nonrepresentative samples then you can put high confidence in individual reports. This does not mean they are treated as gospel, even in journal articles the publishing error is often around 5% which means that 1/20 times the correlation is just random.

Now of course few people are so restrictive, most people talking about knives draw their other experiences into the mix and as well other data to prevent even the noted defects from causing misinformation. For example, I have seen a lot of defects from Ontario with 1095, however based on the material properties of the steel I know it is not the fault of the steel just the heat treatment.

An open discussion forum where others can share experiences will also prevent extreme skewed viewpoints, as counter examples and maker/manufacturer feedback can be given. Thus for example when someone complains about a Strider folder having poor fit/finish you can quickly do a search and see if this is common and see the response from Mick. Compare the results of this vs someone having a similar problem with a Sebenza.

-Cliff
 