Friction Forged Blades : CATRA tests

Now I am going to show you how this is an example of really misleading and biased statistical analysis.

This is an example of a lot of precise numbers which are not utilized properly, and the conclusions presented are not rigorously supported by the analysis.

-Cliff

There is again a huge data bias here because the FF D2 blade is cut for longer and thus goes into the plateau that all blades reach.

My main point was that their analysis is very selective.
In order to do this sensibly, you would want to rerun the CATRA tests at least 3-5 times to smooth out the curves and separate the real behavior from the noise, and then fit a nonlinear curve and use the coefficients to discuss the differences.

-Cliff
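The procedure Cliff describes (repeat the runs, average them to smooth the curves, then fit a nonlinear model and compare coefficients) can be sketched in a few lines. The power-law blunting model, the noise level, and all of the numbers below are invented for illustration; they are not CATRA results.

```python
import numpy as np

rng = np.random.default_rng(0)
strokes = np.arange(1, 101)
true_a, true_b = 12.0, 0.6

def blunting_model(n, a, b):
    # Total media cut after n strokes; b < 1 captures the slowing
    # cut rate as the edge blunts toward its plateau.
    return a * n**b

# Simulate 4 repeated runs with measurement noise, then average
# them -- the "smooth out the curves" step.
runs = [blunting_model(strokes, true_a, true_b) + rng.normal(0, 3, strokes.size)
        for _ in range(4)]
mean_cut = np.mean(runs, axis=0)

# Fit the power-law model in log-log space; the coefficients
# (a, b) are what you would compare between blades.
b_fit, log_a_fit = np.polyfit(np.log(strokes), np.log(mean_cut), 1)
a_fit = np.exp(log_a_fit)
print(f"fitted a = {a_fit:.2f}, b = {b_fit:.3f}  (true: 12.0, 0.6)")
```

With repeated runs the fitted coefficients land close to the underlying values, and their spread across repeats gives a direct handle on how different two blades really are.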

If you want to look at selective parts of the data you can support any conclusion you wish to propagate. That is one of the common ways statistics are used to distort reality.

-Cliff

Nonsense, the ones who are confident in their results release the information like Glesser. All such analysis is SUPPOSED to be subject to critical analysis. Science doesn't proceed on faith and blind worship.

-Cliff

Now if you just want promotion then that is fine, but the above offers nothing significant in terms of actual understanding of the performance of the steel, or even any logical attempt to explain it. Of course I would not really expect that in such a document, which is why I started the thread here: to see if such information could be attained, and to show why the above is flawed from such a viewpoint.

-Cliff

Have you verified your precision by some simple methods to remove any subjective bias? There are a couple of things you can do readily to ensure that you are not in any way affecting the results by personal judgement.

-Cliff

Just a few quotes from Cliff.

To my unscientific mind, it shows more than simple scientific curiosity as to the claims, and more of an agenda to say that it is all a bunch of hooey.

Wayne Goddard is a mentor to many, and has personally helped me in areas of metallurgical understanding, knife testing, and writing about knives and swords in a way that the layman can understand. In case any of you had any questions about Wayne Goddard's curriculum vitae, he has been making knives since before many of you were born, knives that perform very well, based upon feedback from many users out there.

Best Regards,

STeven Garsson
 
Yeah well big deal Steven. Cliff has a web page, has repeatedly sent naked pics of himself to Mike Swaim for years, and can sing Old McDonald while sharpening a knife with no notes or help from anyone else. :)
:)
 

:D There you go trollin' again Bill.
 
Yeah James and I didn't even need any push from you to do it. I miss those days. :) Good to see you.
 
... I could find was that the graphs are deceptive because of the way they're set up, and he didn't even say it was intentional.

It probably isn't. But that really doesn't make any difference here because all that would be is a character issue on the part of the person, and I was talking about the data presented, what was inferred from that, and what actually could be inferred from that.

Few people understand the mechanics of error propagation, and there are entire books written on how statistics can be misleading, especially in regards to plotting. I have noted several times in the above how you can compare knives and be perfectly truthful but still be very biased.

It is obvious here that most people in this thread don't even know what the word "biased" means. People are looking at the speaker and not the content, which is itself a huge bias, and ironic when the same people are arguing for an unbiased perspective. The minute you support something because of who said it, you are proceeding on faith.

In fact the use of CATRA data on a knife is itself biased if the results are going to be used to infer human performance. This again will have people like Thomas ranting, but there are trivial examples that show this to be true which are just based in math. There are knives made out of materials which can run on a CATRA tester for a day (yes, a full day) and show NO DEGRADATION.

To be clear, the knives show no loss of edge at all. Now would that mean you would immediately jump up and start screaming that you had found the ultimate blade material? Yes, if you were a machine then you would be really excited, but you are not, and there is a huge difference in how a person uses a knife and how a machine does. The CATRA machine ignores this, and thus the data set is biased by definition if you make that inference.

The last sentence is the critical part: data cannot be biased in an absolute sense; it can only be said to be biased once it is made clear what it is supposed to be representing and what is being inferred. As an example, I could record the number of men and women in a classroom of people and say that the result would indicate the division of men and women found in that particular field of study. This you could argue is unbiased. However, if I said that the same data indicated that division in general, it should be obvious that the data is hugely biased.

In the above when I talked about fatigue and how to interpret the results, I meant that type of comparison has that meaning, not that the CATRA tests imply it for people. I have already noted in the past why you cannot make such an inference from that data. Other makers such as Landes have also noted clearly that there is a huge difference between such work and how knives blunt in people's hands. Again, the difference is just math, not opinion. The fact is that even Buck noted they had knives for which the CATRA tests were meaningless, as they didn't represent at all what happened when a person used the knives.

-Cliff
 
Very interesting, informative and entertaining debate... I have to agree with Cliff that the ultimate test is what happens when a person uses a knife.
 
It probably isn't. But that really doesn't make any difference here because all that would be is a character issue on the part of the person, and I was talking about the data presented, what was inferred from that, and what actually could be inferred from that. -Cliff

Cliff, people usually go after character, when it is you, because you frequently come off as an ass. You make a lousy martyr as well.

Wayne, Carl and Tracy are all here to give data, answer questions and share with the community, and you don't even have the basic decency to welcome them before you start refuting data and making accusations of bias.

I am a betting man, and one of faith as well, and I am betting that when it comes down to it, that Carl and Tracy will hand you your ass on a plate with data, intellectually speaking.

That will not be able to be proven by data, but by interpretation. I have known many, many researchers, and they all say that data can be manipulated by how it is interpreted. That is the human part of the equation of science.

Best Regards,

STeven Garsson
 
Cliff, you have to really stop with this personal name calling. Wayne Goddard; deceptive, plotting, and misleading? Are you kidding me here? Unreal.

The use of the term plotting, was as in "plotting points on a graph", not "plotting to take over the world".
 
sodak, yea I get it. Now what about the deceptive and misleading part, and then there is STeven's post (#121) as well?
 
Cliff, you have to really stop with this personal name calling. Wayne Goddard; deceptive, plotting, and misleading? Are you kidding me here? Unreal.

In which post of Cliff's was the "personal" name calling you accused Cliff of?
 
Well 3G, when you start calling out someone for being deceptive and misleading, I call that personal, ok?
 
Cliff does seem to be showing some bias in his criticism of Wayne's testing method. Wayne gave a general overview of his testing method for our benefit. He did not mention the precision of the scale he was using (nor did he need to for the purpose of his post). Cliff immediately criticizes Wayne's methods and suggests his methods are affected by personal judgement.

Wayne, given the highly nonlinear behavior of blunting I would be really skeptical of a cutting ratio estimate within 5% by hand, especially on a scale which at best can measure sharpness several times more coarsely than that estimate. Just consider the following.

Let's assume your knife starts off at 5 lbs to slice through 3/8" hemp. You stop cutting once it reaches, say, 10 lbs. Now out of that first 5 lbs, about 1 lb is due to sharpness. Thus the sharpness component went from 1 lb to 6 lbs, so your sharpness is about 17% of the initial. Note that even a 1 lb increase in force then corresponds to a degradation of 50% in sharpness.

In order to work at the 5% level you would need to be able to resolve 1/10 of a pound or so, and this is just the start, because that is just your measurement requirement and doesn't take into account the random variations in the steel, sharpening, cutting methods, and material.
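The arithmetic in Cliff's example can be checked with a few lines. The 5 lb start, 10 lb stop, and 1 lb sharpness component are the figures given above; attributing the remaining 4 lb to fixed blade geometry is the stated assumption.

```python
# Forces in pounds, from the example above: the knife starts at
# 5 lb total to slice the hemp, of which 1 lb reflects edge
# sharpness (the rest is assumed fixed geometry/wedging), and the
# test stops once the total force reaches 10 lb.
start_total, stop_total, start_sharpness = 5.0, 10.0, 1.0
geometry = start_total - start_sharpness          # 4 lb, assumed constant

stop_sharpness = stop_total - geometry            # 6 lb of sharpness force
remaining = start_sharpness / stop_sharpness      # one part in six, ~17%
print(f"sharpness at stop: {remaining:.0%} of initial")

# Sensitivity: a single 1 lb rise in total force (5 -> 6 lb)
# already halves the sharpness (1 lb -> 2 lb of sharpness force).
after_1lb = start_sharpness / (start_sharpness + 1.0)
print(f"after a 1 lb rise: {after_1lb:.0%} of initial")
```

The ratio works out to one part in six, about 17%, which is why small force errors translate into large sharpness errors.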

As an example of the kind of variation seen in machine data, see Landes data where the variance is so high that steels switch places in terms of which is superior because of the influence of the random nature of how many carbides intersect the final sharpened edge line.

Have you verified your precision by some simple methods to remove any subjective bias? There are a couple of things you can do readily to ensure that you are not in any way affecting the results by personal judgement.

-Cliff

If Cliff were truly unbiased and interested only in the propagation of accurate information, he could have simply asked Wayne what the accuracy of the scale he uses is. If Wayne's scale is less accurate than the 1/10th of a pound that Cliff mentions as a starting point, then maybe there would be grounds to speculate about the inaccuracy of the testing method. A true scientist would have asked for more data before posting such an attack. In the last two lines of the post I quoted, Cliff finally asks for more info; unfortunately it is still in the form of an attack.

It is pretty funny that after Cliff criticizes Wayne's in-the-hand testing methods, he complains that machine testing methods cannot accurately model the way a human uses a knife.
 
Carl and I have appreciated the great questions and technical discussions thus far. Our objectives are to provide as much technical information as we have available, and also to let you know what we do not have. We hope that the information and discussions will provide you with some technical background in regards to the FF technology that may support you as a knife enthusiast, whatever your background or experience.

We do not have all the answers. Your discussions have been invaluable to us. We have made copious notes of those technical questions you have asked to which we do not have answers. As Carl previously stated, “In almost every technological advance, the advance comes before the science is fully understood.” We will continue to provide this technical information and data as we gather it.

Carl and I have spent some time this weekend getting to know Cliff. Cliff has a strong background as a PhD in Physics. He has spent considerable time curve-fitting data to explain different phenomena. We have also read much of the material he has posted on the web in regard to knives. We clearly have differences of opinion on some technical issues (and the definition of bias). However, we will continue constructive dialogue on technical issues so we can all come to a better understanding.

Let’s get back to some good technical discussion of steels and knives. I am a dumb Metallurgist and love to discuss, well you know. But, I am an amateur when it comes to knives and I am learning a lot from everyone here.

TN
 
Let’s get back to some good technical discussion of steels and knives.

TN
Most likely the best thing written over the last 24 hours or so. I'll look forward to hearing more on your FF technology.

I apologize for my part of this interruption.
 
Well 3G, when you start calling out someone for being deceptive and misleading, I call that personal, ok?

Thomas,
I guess I misunderstood what you were saying in the post of yours I quoted. I got the impression from what you stated that Cliff had actually called Wayne Goddard, the man, "deceptive, plotting, and misleading." When I re-read all of Cliff's posts and didn't find that, I asked you for clarification.

I am sorry for my part in derailing the thread. I am really enjoying the abundance of knowledge and experience this thread has brought forth. I for one am glad Cliff started this thread. If nothing else, this thread has allowed us to hear from Wayne Goddard, a man whose work I truly admire!

Regards,
3G
 
@cds4byu
See his data on stainless; on page 128 ff. you will find definitions and why D2 is not a stainless.

We understand that D2 is not stainless. Never have we said that D2 is stainless. We have said that the FFD2 edge region is stainless, or at least significantly more stain resistant than the D2 body of the blade.

Thanks,

Carl
 
In fact the use of CATRA data on a knife is itself biased if the results are going to be used to infer human performance. This again will have people like Thomas ranting, but there are trivial examples that show this to be true which are just based in math. There are knives made out of materials which can run on a CATRA tester for a day (yes, a full day) and show NO DEGRADATION.

To be clear, the knives show no loss of edge at all. Now would that mean you would immediately jump up and start screaming that you had found the ultimate blade material? Yes, if you were a machine then you would be really excited, but you are not, and there is a huge difference in how a person uses a knife and how a machine does. The CATRA machine ignores this, and thus the data set is biased by definition if you make that inference.

Very interesting, informative and entertaining debate... I have to agree with Cliff that the ultimate test is what happens when a person uses a knife.

Please let me clear up a misconception, which I tried to address earlier. Apparently it didn't take.

The test we have performed is not a CATRA Edge Retention Test. This was a deliberate choice on our part. The CATRA ERT uses cutting performance in a very stiff medium to measure sharpness, and uses a medium with embedded abrasive (silica) to wear the blade. As Cliff noted earlier, the hardness of an abrasive medium is very significant when looking at sharpening knives of different metallurgy, because the hardness of the abrasive relative to the hardness of the carbides in the blade is very important.

As an example of this, please look at the following photos:
TAL-SH5.jpg

This is a scanning electron microscope image of a Talonite blade sharpened according to our test sharpening procedure. Notice that the final microbevel is approximately 20 microns wide (20 microns is about 0.0005 inches). You can see the presence of the carbides in the alloy, because they're slightly raised compared to the matrix. But the carbides have been ground to the edge geometry, and this blade is shaving sharp.

TAL-CA10.jpg

This is an SEM image of a Talonite blade after a CATRA ERT test. Note that the original sharp edge geometry is gone, and that the edge is rounded and has protruding carbides. This edge won't shave, and when handled feels dull. But because the carbides protrude from the edge, the blade continues to cut the CATRA media. Results like this convinced us that the CATRA ERT test is not a good test for observing how long a knife stays sharp when used to cut the kind of less-rigid materials that are usually cut with hunting knives.

Therefore, we developed our own edge retention test.

We wanted a test that would measure sharpness by measuring the amount of force required to cut into a flexible medium. Fortunately for us, CATRA makes just such a piece of test equipment, the Razor Edge Sharpness Tester (CATRA REST). This test performs a push cut on a flexible silicone medium and measures the maximum force required to initiate a cut. I believe it's a significantly better measure of sharpness than the cutting ability in the CATRA ERT, but you can feel free to disagree. Nevertheless, it's what we chose.

We also wanted to use a typical biological material to wear the blade. Rather than using sand-impregnated paper (which is the CATRA ERT test media), we chose to use hemp rope. The hemp rope should be softer than all of the materials in the edge region of the knife, and thus the effects of variable hardness in the edge should be minimized. We wanted to be fair, and to see how the sharp edge geometry fared in wear tests.

Therefore, we made a machine that is similar to a CATRA ERT machine, but uses 1" hemp, rather than CATRA wear media.

There are two possible metrics for how much wear was occurring on the blade: the number of strokes and the total media cut. Therefore, we measured both.

The test procedure is straightforward. We measure the REST sharpness, then cut the rope for 20 or 40 strokes, then measure the REST sharpness again.

There are at least two significant differences in the results of this test compared to the CATRA ERT test. First, we don't have sharpness data for every stroke. We only have sharpness data when we stop cutting rope and go measure the sharpness. We've done this, because we want to measure the user experience, not the machine experience.

Second, the media is significantly less abrasive than the CATRA media, so that wear occurs much more slowly. We stop the test when the blade stops shaving, which is still in the "early blunting" region of the ERT test that Cliff refers to. We do this on purpose. We want to see when the knife stops being sharp, not when it's so dull you can't stand it anymore. We tried to model the effect on a user who wants to keep his knife so sharp it'll pop hair off your arm.
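Carl's procedure (measure REST sharpness, cut a block of strokes, re-measure, stop once shaving sharpness is lost) can be sketched as a simple loop. The linear wear model below is an invented stand-in, and the 3.0 N shaving threshold is taken from a later post in this thread; neither represents the actual test data.

```python
# Sketch of the test loop: measure REST sharpness, cut a block of
# rope strokes, re-measure, and stop once the blade no longer
# shaves. rest_sharpness() is a made-up wear model; real values
# would come from the REST machine.
def rest_sharpness(total_strokes):
    # Hypothetical: push-cut initiation force (N) rising with wear.
    return 1.5 + 0.01 * total_strokes

SHAVING_LIMIT_N = 3.0     # REST force above which the blade stops shaving
STROKES_PER_BLOCK = 20    # cut this many strokes between measurements

strokes = 0
readings = [(strokes, rest_sharpness(strokes))]
while readings[-1][1] <= SHAVING_LIMIT_N:
    strokes += STROKES_PER_BLOCK
    readings.append((strokes, rest_sharpness(strokes)))

print(f"stopped after {strokes} strokes; final REST = {readings[-1][1]:.2f} N")
```

The resulting `readings` list is exactly the sparse (strokes, sharpness) data the test produces: sharpness is only known at the measurement stops, not at every stroke.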

Because this is a different test than the CATRA ERT test, there is no reason for ERT analysis methods to be applied. That's why we didn't do it the same way Cliff has done it. We're happy to make the data available for others to analyze any way they see fit, but when you use the data, you need to understand that 20 strokes in our test is much less wear than 20 strokes in a CATRA ERT test.

I hope this is helpful in explaining our testing methodology. It's different from CATRA ERT for a reason, and the reason is not to make FFD2 blades look better; it's to try to better measure the user experience.

Thanks,

Carl
 
If Cliff were truly unbiased and interested only in the propagation of accurate information, he could have simply asked Wayne what the accuracy of the scale he uses is.

I have discussed his methods in detail with Wilson in the past who uses methods developed from Goddard.

If Wayne's scale is less accurate than the 1/10th of a pound ...

You could never maintain that level of precision by hand anyway, even if you were using a digital scale. What you would have to do if you wanted this would be to use a force probe and integrate the force over the cut. Even this doesn't allow you to get that level of precision (I have done it in the lab) because there are systematic differences which are larger, as I have noted before.
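Cliff's force-probe suggestion amounts to numerically integrating a logged force-versus-position trace. A minimal sketch with a synthetic trace (the trace shape and numbers are invented; a real one would come from the probe's data logger):

```python
import numpy as np

# Synthetic force-probe trace for one slice: position (m) along the
# cut and force (N) at each sample.
position = np.linspace(0.0, 0.05, 201)                   # 5 cm cut
force = 20.0 + 5.0 * np.sin(np.pi * position / 0.05)     # smooth stand-in

# Integrate force over the cut (trapezoid rule) to get the work in
# joules; comparing this integral between cuts is steadier than
# reading a peak value off a scale by eye.
work = float(np.sum(0.5 * (force[1:] + force[:-1]) * np.diff(position)))
print(f"work over the cut: {work:.3f} J")
```

For this trace the integral is 1.0 + 0.5/pi joules analytically, which the trapezoid sum reproduces to well under a millijoule.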

It is pretty funny that after Cliff criticize Wayne's in the hand testing methods, he complains that machine testing methods cannot accurately model the way a human uses a knife.

Learn the difference between precision and accuracy. One method is low in one, the other low in the other.

We clearly have differences of opinion on some technical issues (and the definition of bias).

Mine is the math one as we were discussing maths.

Therefore, we made a machine that is similar to a CATRA ERT machine, but uses 1" hemp, rather than CATRA wear media.

None of the differences you note have anything to do with the bias I noted in the CATRA testing.

[analysis]

That's why we didn't do it the same way Clilff has done it.

And I showed why your numbers and methods are severely flawed, and not only produce misconceptions but are unable to actually prove numerically that there is a significant difference, again by a math definition, not an opinion.

-Cliff
 
Here is the FFD2 and CPM S90V modeled :

ffd2_s90v.gif


There is no statistical proof that the FF blade has superior edge retention because the blunting factors are too uncertain because of a combination of three factors :

1) lack of data in the initial region where the rate of blunting is high
2) lack of data in the tail region of S90V
3) high uncertainty in the data
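The claim of "no statistical proof" is a statement about parameter uncertainty: two fitted blunting coefficients only differ significantly if their separation is large compared to the combined standard error. A generic sketch with invented numbers (not the actual FFD2/S90V fit results):

```python
import math

# Hypothetical fitted blunting coefficients (rate of sharpness loss)
# with standard errors, as a nonlinear fit would report them.
# These numbers are invented for illustration.
b_ffd2, se_ffd2 = 0.42, 0.08
b_s90v, se_s90v = 0.55, 0.09

# Significance of the difference: separation in units of the
# combined uncertainty (a z-like statistic).
z = abs(b_ffd2 - b_s90v) / math.hypot(se_ffd2, se_s90v)
print(f"separation: {z:.2f} combined standard errors")
print("significant at ~2 sigma" if z > 2.0 else
      "not statistically distinguishable at ~2 sigma")
```

With these stand-in numbers the separation is about one combined standard error, i.e. not distinguishable, which is the shape of the argument being made above.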

In regards to the initial low cutting ability of S90V: based on examining the actual data there is a misconception in the above. It was stated that the S90V blade does not cut as much material at the start. The first point measured is AFTER 20 CUTS. This is much too late to make such a statement, as the initial rate of blunting can be very high for large-carbide steels, especially if the edges are acute and/or polished. I would assert that if the cuts were measured at 2, 4, 8, and 16 then you would see a much higher initial performance from S90V.

As I mentioned in my previous post, this statement would be correct if the test were a CATRA ERT test, with silica-impregnated cardstock used as the cutting medium. However, it's not correct for our test.

To demonstrate that the 20 stroke measurement is consistent, I've included a plot of the total rope cut as a function of stroke number for the first 20 strokes. As you can see, the cutting performance is very linear for the first 20 strokes, and you can see that the proposed "much higher initial performance for S90V" doesn't exist.
s90V-Initial-Trend.jpg


The above data is also a very poor way to compare the blades because it takes two dependent variables which are highly noisy, divides them (magnifying the noise), and then plots essentially two dependent variables, so both the x and y axes are highly noisy. This actually requires very complex methods to fit properly, even with simple models, because it adds another element of nonlinearity into the problem. What should have been plotted is just the stroke count and the amount of media cut. This would give a much smoother curve, as the x-axis variable is then independent and has no noise. It is also much more straightforward to interpret, as it just shows how much media is cut after a certain number of passes.

This figure is not our preferred figure to plot. However, we used it because it's the same plot specified in ISO 8442.5:1999, according to CATRA
http://www.catra.org/pages/products/kniveslevel1/slt.htm
Because the figure is a bit hard to see, I've also got a link to the CATRA brochure that was sent to us on the ERT machine http://www.et.byu.edu/~sorensen/Catra%20Automatic%20Edge%20Tester%20Brochure.doc.
We wanted to show the performance in the same type of plot that was requested by the standard. I can understand that you might prefer to use a different plot. But our plot was not chosen to maximize the apparent performance of FFD2, as you imply.
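Cliff's noise-magnification point can be demonstrated numerically. The decay and growth models and noise levels below are invented purely for illustration; the point is only that the ratio of two noisy series is noisier than either series plotted against the noise-free stroke count.

```python
import numpy as np

rng = np.random.default_rng(1)
strokes = np.arange(1, 201)               # independent variable: noise-free

# Two noisy dependent measurements: invented smooth models plus
# additive measurement noise.
smooth_sharp = 100 * strokes**-0.3
smooth_media = 10 * strokes**0.6
sharpness = smooth_sharp + rng.normal(0, 2, strokes.size)
media_cut = smooth_media + rng.normal(0, 2, strokes.size)

# Compare the relative scatter of the sharpness series alone with
# the relative scatter of the ratio of the two noisy series.
ratio = sharpness / media_cut
smooth_ratio = smooth_sharp / smooth_media
rel_noise_sharp = np.std(sharpness / smooth_sharp - 1)
rel_noise_ratio = np.std(ratio / smooth_ratio - 1)
print(f"relative scatter: sharpness {rel_noise_sharp:.3f}, "
      f"ratio {rel_noise_ratio:.3f}")
```

Because the two noise sources combine, the ratio's relative scatter always exceeds that of either input series, which is the "magnify the noise" effect described above.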

The above graph attempts to show the sharpness after a certain amount of media is cut, but as I noted it magnifies the noise and introduces an additional nonlinearity. I'll look at the other data shortly.

The data may be noisier than you prefer, but it's clearly not even close to the limit for determining which of the two blades has the higher performance. Since you have a PhD in physics, I'm sure you understand the optical resolution limit, which is also very similar to the statistical test for determining whether two sets of data come from different populations. In both cases, if the mean of the first set is separated from the mean of the second set by more than the standard deviation of the data, the difference can be determined through statistical processing. As you can see from the figure, the two data sets are not even close. I can understand that you might disagree with the exact amount by which the FFD2 outperforms the S90V, but I can't see any plausible way to interpret the data to mean that the FFD2 doesn't significantly outperform the S90V. Not only is the initial sharpness higher, but it drops at a lower rate, even though it hasn't reached the plateau (and won't, because we don't test to the point where the edge is gone and we're just relying on the carbide "teeth").
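Carl's separation criterion can be written out directly: compare the difference of the means to the spread of the samples. The readings below are invented for illustration, not the published measurements.

```python
import math

# Hypothetical REST sharpness readings (N) at the same wear point
# for two blades -- invented numbers, not the actual data.
ffd2 = [1.8, 1.9, 2.0, 1.7, 1.9]
s90v = [2.9, 3.1, 2.8, 3.0, 3.2]

def mean_sd(xs):
    """Sample mean and standard deviation (n-1 denominator)."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, math.sqrt(var)

m1, s1 = mean_sd(ffd2)
m2, s2 = mean_sd(s90v)

# Carl's criterion: means separated by more than the spread.
separation = abs(m1 - m2) / max(s1, s2)
print(f"separation = {separation:.1f} standard deviations")
```

When the separation is several standard deviations, as in this toy example, a formal two-sample test would agree the populations differ; when it is below one, as in Cliff's reading of the fit coefficients, it would not.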


As I said several times in the above, and noted in detail in the sharpness article, there is push cutting and slicing sharpness and you can be very high in one and low in another. If the slicing aggression is much lower initially, regardless of the push cutting test, it would indicate improperly formed edges which you would verify under magnification.

I understand the difference between push cutting and slicing cutting. Our primary sharpness measurement is push cutting, because that's how you shave, and shaving sharpness is our goal. In our tests, total media cut is intended primarily as a measure of the amount of cutting work done to degrade the sharpness, not as a primary measurement of the sharpness itself.

In my opinion, the figure that best shows the improved performance of FFD2 compared with the other materials is this one:
Cut-Rest.jpg

On testing a number of different blade materials with our test geometry, we found that a REST value of 3.0 N represented a good mean value for the loss of shaving sharpness. As you can see, FFD2 has significantly more media cut than any other material, and yet has not exceeded a value of 3.0 N. This is also consistent with Wayne's hand rope cutting tests.
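Comparing blades at the 3.0 N REST threshold implies interpolating each blade's cut-versus-force curve to the crossing point. A small sketch with invented measurement pairs (not the published FFD2 data):

```python
# Comparing blades at a fixed REST threshold (3.0 N) means
# interpolating each blade's (media cut, REST force) curve to the
# crossing point. The measurement pairs below are invented.
def media_at_threshold(points, threshold=3.0):
    """points: list of (media_cut, rest_force) with rest_force rising.
    Linear interpolation to the media cut where force hits threshold."""
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if f0 <= threshold <= f1:
            t = (threshold - f0) / (f1 - f0)
            return x0 + t * (x1 - x0)
    return None   # threshold not reached within the test

blade_a = [(0, 1.5), (50, 2.1), (100, 2.7), (150, 3.3)]
blade_b = [(0, 1.6), (40, 2.4), (80, 3.2)]

print(f"blade A: {media_at_threshold(blade_a):.1f}")   # -> blade A: 125.0
print(f"blade B: {media_at_threshold(blade_b):.1f}")   # -> blade B: 70.0
```

The blade with the larger interpolated value cuts more media before losing shaving sharpness, which is the comparison the figure above is making.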

At the angle you noted FFD2 will easily outperform AEB-L because the edges are inherently stable.

I have no basis to evaluate this statement, because I have no experience with the performance of AEB-L. In an earlier post, you suggested that we ought to perform tests at 10 degrees, 20 degrees, and 30 degrees to see how that affected the relative performance of the steels. We have not done so. However, our test is in the middle of your suggested range, so it seems to me to be a reasonable angle to use.

Cliff, I want to thank you for the evaluation you've done with our data. While we may disagree as to the extent of the difference, as to the reliability of the data, and as to how much we can generalize this data to include all reasonable edge geometries, at least I think we can agree that we're both interested in repeatable, quantitative, analytical methods of analyzing blade performance. And we both agree that, at least for this particular test, FFD2 outperformed S90V.

Thanks,

Carl
 