Bushcraft Knife Challenge results!!!

My apologies if this was discussed and I missed it.

How was edge retention gauged? I have heard many discussions on this topic on other forums and there is wild variance in the answers to the question, "when do you sharpen?"

Would anyone care to point me to the metric used for edge retention in this test?

Thanks
 
Guys, honestly I'm glad I'm a maker in this deal and not a tester. I learned a lot, thank y'all for doing this. Looks like ease of sharpening and sheath building got me. Lol
Donavon
 
My apologies if this was discussed and I missed it.

How was edge retention gauged? I have heard many discussions on this topic on other forums and there is wild variance to the answer to the question, "when do you sharpen?"

Would anyone care to point me to the metric used for edge retention in this test?

Thanks

Good question Unit.

There was nothing scientific with any of the reviews or scores. I said in the beginning that the findings should be taken with a grain of salt. We did not go into this as knife experts and we sure as hell aren't coming out of this as knife experts. The reviews and scores are just 3 different opinions from three different guys. Nothing more, nothing less.

I cannot speak for the other guys, but I felt that the limited use we gave each of these 21 knives was not enough to test edge retention. I scored everyone a 4, because I really didn't notice much difference from how sharp they were when I got them to when I was finished. There were a few knives I scored a 5, because they were shaving or push cutting paper with ease when I got them, and they did the same when I was done.

So that's how I scored edge retention.

I am happy to answer any questions about the testing, or the scores I gave on the knives. So if any of you have any, let me know.
 
Interesting. I wondered if you were going to do any side by side tests on edge retention.

Having a lot of customs the one thing I notice is there seems to be a huge difference in edge retention even when the steels remain the same.

Now the blades I have where the makers actually farm out their blades to a heat treating service, those blades seem to be all uniformly well tempered and there seems to be no difference in edge retention.

However when the maker does them themselves they seem to vary widely.

I'll take a long limb, say 1" thick or thicker, and completely whittle through it with both knives, trying to make sure I utilize all parts of the main edge. One knife will still shave and the other, same steel and everything, will not.

Perhaps some of the worst edge retention of any knives I have are the supposedly L6 ones treated by the maker.
 
How was edge retention gauged? I have heard many discussions on this topic on other forums and there is wild variance to the answer to the question, "when do you sharpen?"

Would anyone care to point me to the metric used for edge retention in this test?




I judged edge retention by comparing how cleanly the knife edge cut paper. Using 2"-wide strips of paper, I tested the edge as received and again after completing all the tests. If the edge cut as cleanly at the end as it did in the beginning, it got top points from me. But the difference was often very slight; a knife might still seem to slice the paper cleanly, yet examining the paper might show a slightly coarser cut. Like Tony has stated, all these knives showed great edge retention, and our testing was not enough to seriously dull any of these blades.

I even avoided the specs that Marcelo was getting from the makers, and did not know anything about the different blade steels (unless marked) or heat treatments during the testing, keeping any preconceived ideas out of the equation.

I was also very consistent with my test regime from knife to knife. In the performance testing the same materials were used for all the knives, and each knife was tested in the same sequence, starting with the easier tasks and ending with the harder chores.


I'm no expert, but I used each and every knife in the test as I use my own knives. I took this and applied it to the test parameters set forth. I let each knife show me what it was capable of, and scored it accordingly.


Scoring aside, they are all great knives. :thumbup: :cool: :thumbup:




Big Mike
 
Now the blades I have where the makers actually farm out their blades to a heat treating service, those blades seem to be all uniformly well tempered and there seems to be no difference in edge retention.

However when the maker does them themselves they seem to vary widely.
This could be caused by a number of factors, including inconsistencies in how the material is heated, whether critical temperature is uniformly reached, the quenching medium, and how the material is drawn after hardening. Unless the maker is quite skilled, the only surefire method is to use material of known origin, treated by the numbers in a thermostatically controlled furnace.
 
Now the blades I have where the makers actually farm out their blades to a heat treating service, those blades seem to be all uniformly well tempered and there seems to be no difference in edge retention.

However when the maker does them themselves they seem to vary widely.
It all depends on the maker's skill, equipment and choice of steel. For some steels, a variance of 15 degrees can make or break the heat treat.

Perhaps some of the worst edge retention of any knives I have are the supposedly L6 ones treated by the maker.
L6 is a toughie to nail in the HT. If you do not have (at minimum) a digitally controlled kiln, you are wasting your money on a steel that you will never get the most out of. Depending on the source of L6 (Carpenter or Crucible), you may be getting very different characteristics because of the difference in moly content. L6 can make a hell of a tough, impact-resistant blade when done right.

Rick
 
I completed some statistical analyses on the BF camp knife contest. Overall there were 21 knives tested across 13 testing categories. All categories were scored by the three reviewers, with the exception of Food Preparation, which was only assessed by Big Mike. Since that category involved only one of the three reviewers, I omitted it from my analysis; I was more interested in aspects such as the variation between reviewers and how this contributed to overall scores.

The first test I did was simply to compare the overall score (the sum of the 12 testing scores, excluding the food prep results) between the three reviewers. Since Tony Is THE MAN, I made him the x-axis and placed Marcelo's and Big Mike's scores on the y-axis of the figure below. Each symbol represents the results for a specific knife (green stars = Marcelo; red circles = Big Mike) as contrasted against Tony's score. The diagonal black line represents a perfect fit; symbols that fall on this diagonal have a perfect correspondence between Tony's score and either Marcelo's or Big Mike's.

Slide1.jpg


Basically, there was a very good correlation both between the overall scores generated by Marcelo and Tony and between those generated by Big Mike and Tony. Tony and Marcelo were more often in agreement with one another than Tony and Big Mike were. There was also quite a bit of disagreement between the reviewers on a few knives, including the scores for C. Bryant, GW Schmidt, Pinault and Mud Creek, where a low overall rating by one reviewer had a larger influence on the average score. In two cases this would not affect the overall ranking of the knife to a great degree; in the two other cases it would affect fine-scale positioning within the pack of results but would not influence the top performers in the contest.

For stats junkies, linear regression analysis indicated that Tony's performance scores explained (or would be predictive of) 72% of the variation in Marcelo's scores (adjusted R2 value) and 60% of Big Mike's. In the two regression analyses, the slope was not significantly different from a value of 1, nor was the constant significantly different from a value of 0. This means that the 1:1 regression line provides an adequate description of the correlation, essentially as good as the best-fit least-squares regression does. It also implies that our three reviewers were, with some exceptions, like-minded overall in their ranking of individual knives.
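For anyone who wants to tinker with this kind of reviewer-vs-reviewer regression themselves, here is a minimal Python sketch. The scores below are invented for illustration, NOT the real contest data; formally testing whether slope = 1 and intercept = 0 would also require standard errors, so this just shows the fit and the R2 calculation:

```python
import numpy as np

# Hypothetical overall scores for six knives from two reviewers.
# These are invented numbers, NOT the real contest data.
tony    = np.array([48.0, 52.0, 45.0, 50.0, 55.0, 47.0])
marcelo = np.array([47.0, 53.0, 44.0, 51.0, 54.0, 48.0])

# Ordinary least-squares fit: marcelo ~ slope * tony + intercept
slope, intercept = np.polyfit(tony, marcelo, 1)

# R^2: fraction of the variance in one reviewer's scores that the
# other reviewer's scores explain.
pred = slope * tony + intercept
ss_res = ((marcelo - pred) ** 2).sum()
ss_tot = ((marcelo - marcelo.mean()) ** 2).sum()
r_squared = 1.0 - ss_res / ss_tot

print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

With these made-up numbers the fitted slope lands close to 1 and the intercept close to 0, which is the kind of result that says the 1:1 line describes the two reviewers about as well as the best-fit line does.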

The second statistical assessment I performed was more complicated but also (I think) more interesting. I ran a multivariate test called principal components analysis (PCA) on the testing scores across all categories. This is a type of statistical approach referred to as a data reduction method. Essentially, it finds correlations between the different types of tests across the different knives and lumps this variation into a reduced set of test metrics. The simplest analogy is that Marcelo lumped all the tests into a single metric called performance, which was the sum of all test scores. In the PCA, we are reducing the number of metrics overall and limiting it to two or three testing metrics instead of just one, as in the case of Marcelo's test results. This allows separation of knives into groups that perform similarly across a series of tests; essentially, it allows us to examine whether patterns are present. For example, do knives that behave very well in a couple of testing categories show compromise in other testing categories?

After running the principal components analysis, I found that 3 significant components came out. We will call each of these components a clumping of similar (correlated) test outcomes across the knives. The first component had the highest number of tests correlating to its axis. Basically, individual knives with high scores on component 1 tended to receive higher test scores in the following metrics: Whittling, Push Cutting, Control, One Stick Fire, Drilling, Ergonomics, and Push Cuts on Fiber. They also tended to do well (but with weaker correlations) in Fit & Finish, Edge Retention and Sheath. Some of these correlations make sense. For example, a knife maker who pays attention to fit and finish is likely to also produce a fine-looking sheath. Similarly, we find many performance aspects related to wood work and cutting correlated here. The second component had only 2 tests that correlated to its axis, with moderate negative correlations for Push Cut Fiber and Batoning. This means that knives with a low score (a high negative value) on this axis tended to perform better for push cutting fiber and batoning. The third component had only one variable associated with it, and that was sharpening.
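For the curious, the mechanics behind a PCA like this can be sketched in a few lines of Python. The score table below is invented for illustration, not the real contest results: the eigenvectors of the covariance matrix are the component loadings (which tests move together), and projecting the centred scores gives each knife's position on the component axes, i.e. the coordinates plotted in the figures:

```python
import numpy as np

# Hypothetical 5-point scores for six knives across four tests.
# Invented numbers for illustration, not the real contest results.
# Columns: whittling, push cutting, batoning, sharpening
scores = np.array([
    [5, 5, 3, 4],
    [4, 5, 3, 3],
    [3, 3, 5, 4],
    [2, 3, 5, 5],
    [5, 4, 4, 3],
    [3, 2, 4, 4],
], dtype=float)

# Centre each test column, then eigendecompose the covariance matrix.
centred = scores - scores.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))

# eigh returns components in ascending order of variance; flip so
# component 1 (largest variance) comes first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance each component explains, and each
# knife's coordinates on the component axes.
explained = eigvals / eigvals.sum()
positions = centred @ eigvecs

print(np.round(explained, 2))
```

Reading the columns of `eigvecs` tells you which tests "clump" on each component, exactly the kind of grouping (whittling with push cutting, batoning with push cut fiber) described above.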

Okay, so the above might be a bit complicated. Let me explain the graph below, since visually this makes more sense and demonstrates the power of this kind of statistical analysis. I plotted the average scores across the reviewers for each knife in two-dimensional space, focusing on component 1 and component 2 scores. Each dot on the graph represents a specific knife and the label corresponds to its maker. The error bars plot the variation around the knife's score along the x- or y-axis. Knives with large error bars basically indicate disagreement between reviewers on the absolute position of the knife. Knives that fall closer to the bottom right quadrant are the ones that test highest across multiple testing metrics. Knives that fall in the upper left tended to have lower scores across all the components (except for sharpening or food prep, which are not represented here).

Slide2.jpg


We see from the above that there are different clumps in this 2-dimensional space, and the straightforward assessment of who wins is not quite as easy to judge. Culberson still wins on the component 1 metrics, followed very closely by Wildertools. However, if you were somebody who really valued the fiber push cutting test plus batoning, you might be more orientated to suggest NWA and Ban Tang as the contest winners.

Considering the error on the x-axis, which represents a wide range of test metrics, it is useful to note the proximity of many knives to what could be construed as the "winners' circle". These include NWA, Ban Tang, Laconico, Fiddleback and X39, all of which could be construed as having very high scores on the x-axis. Note that X39 and Fiddleback exhibited some compromises on the 2nd component, which relates to slightly weaker scores for batoning plus push cut fiber.

Expanding this out and trying to be objective, I would say it is very difficult to truly distinguish many of the contest knives with any degree of confidence. More broadly, knives performing very well across multiple metrics include: Culberson, Wildertools, Ban Tang, Laconico, NWA, Fiddleback, X39, Greywolf, D Phillips, C. Cody, Scout, Koyote, Turley, Fletcher and Gossman. (The proper statistical test to examine this would be canonical covariates analysis, but we don't have nearly enough reviewers to objectively perform such a test.) We then see two lower-ranked groups of knives: the mid-ranged knives include Nightman, Pinault and Mud Creek; the lower-ranked group is C. Bryant and GW Schmidt, with AA Forge fitting somewhere between these two groups. I note specifically that in the latter groups the error bars tend to get wide, which means there was considerable disagreement between reviewers on these knives. This is particularly the case for C. Bryant and AA Forge on the batoning/push cut fiber tests (see also the correlation tests noted above).

So overall, these tests give us a slightly expanded picture compared to the average sum of scores presented by Marcelo in his test outcomes breakdown. Personally, I was expecting that the fit and finish testing categories would clump out independently of the performance metrics, but I was wrong in this assertion. I was also hoping that the analysis would show wider separation of test metrics across component axes, but instead found that one axis explained many of the test results. I would also say that the knife makers were not as far apart from each other as the average total scores might suggest. The winners in this case were not so far ahead of the pack as the additive scores might make them out to be. This isn't, of course, to take anything away from Culberson and Wildertools. It's only to say that many of the makers performed very well, and it comes down to the relatively nuanced judgements of a very small set of reviewers, relative to the number of knife entries, to figure out who won.

Hope you guys enjoyed this little ditty of statistics fun. Now, before I hear somebody saying that statistics can be made to say anything, let me just say that I wasn't looking for an answer here; I just wanted the patterns to speak for themselves. In that regard, the stats told me quite a bit of new information that can be quite difficult to deduce by simply staring at a table of numbers.

Ken
 
Thanks, Ken! Excellent analysis.

Believe it or not, I understood that. (there's a test I'm interested in that's missing, but I can handle that later)
 
Ken,
Thanks for this added info, takes me back to my days at HP.
Very interesting, and I'm glad I didn't have to do it.
 
soooo....what's next (insert innocently inquisitive voice)? Choppers...folders...some other interesting form of knife I have yet to learn about, but absolutely must have?


No, but seriously, like I said earlier, this was a great thread. Do you guys have any ideas/plans/desires to do another project like this?


~edit~ if you do decide that you want to undertake something like this again, and there is anything I can do to help please let me know.
 
All of us owe tons to these fellows for all this effort. I've only been on the forum for less than 2 years, but this was really impressive. Can't say I comprehended all of it, especially that graph, but may I extend a huge thank you. --dennis
 
Ken, a lot of that is confusing to me, but I'm not that bright as you know.

However, you note a few times that there was disagreement on a few knives. You mention C Bryant's knife quite a bit. I look at the scores and it seems we are about the same in scoring, with the exception of batoning. Marcelo and I felt the very thin and short knife didn't baton as well as some of the other thicker knives, and Mike felt it did as well as Gene's 1/4" monster. So that's the only disagreement I see on that knife.

In terms of your pairing push cut fibrous with batoning, I don't see how those two tests have anything to do with each other. I tend to see that a knife that push cuts fibers very well will also whittle well due to its sharpness and edge geometry. With batoning, however, I find the thick-shouldered knives are better at separating the wood than the thinner-shouldered knives, which usually take more time to get through; those same thick shoulders will hinder a knife's performance at pushing through fibrous cordage or wood.

Agree?
 
First, many thanks to the test team. Great job!

"you might be more orientated to suggest NWA and Ban Tang as the contest winners"

Since I picked up both these contest-knives I'm quite happy with this conclusion :cool:

yessir, real happy :D ;)
 
Ken, a lot of that is confusing to me, but I'm not that bright as you know.

However, you note a few times that there was disagreement on a few knives. You mention C Bryant's knife quite a bit. I look at the scores and it seems we are about the same in scoring, with the exception of batoning. Marcelo and I felt the very thin and short knife didn't baton as well as some of the other thicker knives, and Mike felt it did as well as Gene's 1/4" monster. So that's the only disagreement I see on that knife.

The stats are just picking up cases where variation between the reviewers is high across different test scores, whereas you are comparing the absolute score for the sum of tests. In the case of C. Bryant you will notice 2-point score differences across reviewers for edge retention, ease of sharpening and control, and 1.5-point differences for drilling. This is also one of the few knives where the three reviewers did not come up with the same score out of 5 on any of the individual tests. However, as noted, the relative rank for this knife remains pretty similar across the three reviewers; it's just that the scores differ across multiple tests.
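The distinction Ken is drawing, per-test disagreement versus agreement on the totals, can be shown with a tiny Python example. The scores below are made up (not the actual C. Bryant numbers): three reviewers can hand out identical totals while differing by 2 points on every single test:

```python
import numpy as np

# Hypothetical per-test scores (out of 5) from three reviewers for one
# knife. Invented values, not the actual C. Bryant scores.
# Rows: reviewers; columns: four individual tests.
scores = np.array([
    [3.0, 5.0, 3.0, 5.0],
    [5.0, 3.0, 5.0, 3.0],
    [4.0, 4.0, 4.0, 4.0],
])

# Disagreement on each individual test (max minus min across reviewers)...
per_test_range = scores.max(axis=0) - scores.min(axis=0)

# ...versus disagreement on the summed totals each reviewer awarded.
totals = scores.sum(axis=1)
total_range = totals.max() - totals.min()

print(per_test_range, total_range)
```

Here every individual test shows a 2-point spread between reviewers, yet all three totals come out identical, which is exactly why per-test variation can be invisible in the summed scores.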

In terms of your pairing push cut fibrous with batoning, I don't see how those two tests have anything to do with each other. I tend to see that a knife that push cuts fibers very well will also whittle well due to its sharpness and edge geometry. With batoning, however, I find the thick-shouldered knives are better at separating the wood than the thinner-shouldered knives, which usually take more time to get through; those same thick shoulders will hinder a knife's performance at pushing through fibrous cordage or wood.

Agree?

Tony - I did not pair these two tests together; the statistics did. The principal components analysis attempts to clump together different tests that show similar performance for like knives. Sometimes these correlations make sense, and other times they can be surprising. I agree with your assessment of what to expect from batoning and push cutting fibers. It just so happens that several knives did well on both push cutting and batoning, so the PCA lumped them together, even though the reasons a given knife performed well on each test may differ.

Of course, you might be completely right. We probably need more reviewers to really see whether these patterns are in fact robust. Sometimes, though, it's the surprises in these correlations that give you pause to think and revisit what your intuition says.
 
Ken, you must be a research scientist! Nice analysis and explanation.

It would be interesting to see how these knives would group with more testers, but I realize this was a ton of work to do. We always want more data, right?



Man, I love these challenges! I just need more cash to be a better patron. :o
 
The stats are just picking up cases where variation between the reviewers is high across different test scores, whereas you are comparing the absolute score for the sum of tests. In the case of C. Bryant you will notice 2-point score differences across reviewers for edge retention, ease of sharpening and control, and 1.5-point differences for drilling. This is also one of the few knives where the three reviewers did not come up with the same score out of 5 on any of the individual tests. However, as noted, the relative rank for this knife remains pretty similar across the three reviewers; it's just that the scores differ across multiple tests.



Tony - I did not pair these two tests together; the statistics did. The principal components analysis attempts to clump together different tests that show similar performance for like knives. Sometimes these correlations make sense, and other times they can be surprising. I agree with your assessment of what to expect from batoning and push cutting fibers. It just so happens that several knives did well on both push cutting and batoning, so the PCA lumped them together, even though the reasons a given knife performed well on each test may differ.

Gotcha. Very good sir.

Of course you might be completely right. We probably need more reviewers to really see if these patterns are in fact robust. However, sometimes its the surprises in these correlations that give you pause to think and revisit what your intuition says.

It's less about what my intuition tells me and more about what my notes and the hours of video I took during the tests tell me. I go back to the video on certain knives where I see a big difference in results and can't figure it out. Like, one of the testers gave Koyote a low score on whittling, when I used the knife last and that thing cuts like a freaking champ. I think the tester was trying to whittle with the spine.. :p
 

ummmm:confused:... Yup...That's how I had it figured too :D....

...GREAT KNIVES submitted by GREAT MAKERS!!!! Too Close to Call on this one I think......CONGRAT's to you all for an EXCELLENT THREAD!!!! :thumbup::thumbup::thumbup:

THANKS to all the Artists and Judges for all the hard work that went into this project!!!

Stump
 