D2 heat treat, edge retention testing

Nathan the Machinist

This subject comes up from time to time and I'll usually throw in my .02. I'm doing some testing today and I thought I'd actually document the process for you fine folks.

I usually document my process and save notes on the computer. I'm just adding some pictures and a bit of commentary here and I describe the reasoning behind my process.

In my early days of doing this, things went fairly quickly and were exciting. However, my D2 process is relatively mature at this point, so I'm pretty much just splitting hairs. So if you're looking for an interesting, fast-paced read, just about any other thread on here will probably be more interesting...



Some background: When I first set out to attempt to improve my heat treat process, I started by comparing my work to some good quality commercial and custom knives, which became standards to compare against. You know, to see if my work was "good enough". This test isn't perfectly scientific because it is difficult to remove myself from the equation to maintain perfect objectivity, but I attempt to remain a "perfect little cutting machine". I do usually achieve repeatable results, and the results I find are often not what I'm expecting, so I believe I'm not fooling myself in my process. My test doesn't give me a number, but instead it shows me edge deterioration relative to other known edges subjected to the same cuts in the same media. I believe this kind of testing is very important when evaluating changes to your process, because just an HRC number doesn't tell you the whole story. I'll go into this more later.

My usual pile of test standards:

standards.jpg


These are:
my personal skinning knife made several years ago when I finally got D2 to really "click"
a high quality knife in D2
a high quality knife in W2
a high quality knife in S30V
a good commercial quality knife in VG10


Today I'm doing a variation of my comparative test because I'm trying to eke out some subtle differences. I'm not looking for gross differences right now (I have that pretty much figured out at this point in this particular ongoing project), so I'm using relevant standards and previous test specimens as standards. A knife in a simple steel like W2 is excellent to have in a bunch of standards, but it isn't going to tell me as much about changes to my D2 HT as comparing against other D2 knives will. Ideally, at this point in this ongoing project, things are as apples to apples as possible.

I'm testing two things today. A variation to the HT process timing, and the addition of a thermal cycle. So I'm running two blades through in order to evaluate both changes. (I try to make only one change at a time when possible in a test blade)

Both test blades measure about HRC 61, as do two of the D2 standards.

My standards for this test:

standards2.jpg


These are:
1 my personal skinning knife (.010 edge, HRC 60-61),
2 a high quality knife in D2 (.015 edge, ~ HRC 61),
3 a high quality knife in S30V (.020 edge, ~ HRC 59),
4 a knife from my last batch (.015 edge, .010 tip HRC 61-62),
5 a knife with a known issue with retained austenite (.015 edge, HRC 61-62).

That's five standards. Part of the reason for so many is to make subtle differences between them and my test subjects more apparent, and part of the reason is to give additional illustrations of the testing procedure for this thread.

The blades being evaluated:

subjects.jpg


P1 and P2 are identical to each other in every way except that one received an additional thermal cycle (.015 edge, .010 tip, HRC 60-61). I don't know which is which (they're marked under the tape). Test coupons that are metallurgically identical to these two test blades have already been broken and looked at under magnification. There is a difference, so I do have some preconceptions that I don't want muddying the waters.

I'm about to start the tests. As I sit here typing this, I haven't started yet, and I don't know what the outcome will be (though I have a pretty good estimate, based on past experience). But I won't be surprised if the final outcome is that my work compares well to the other standards. So it may, in retrospect, appear that this entire article is shameless self-promotion. If so, please consider the fact that I have been working at this for years, and my process has been tweaked to create blades that perform well at these particular tests. If I were to change my test to batoning through a cinder block, my work would probably not perform very well. I don't really expect huge differences between most of these test blades because they're all pretty similar apples, but I'm hoping for incremental improvements in P1 and P2.

The purpose of this thread is not the information in these tests. It is to actually describe the test. The outcome is relevant to me, but is not the purpose of this thread.
 
The test starts with cutting cardboard. A narrow part of the blade is used for the cut to eliminate blade length from the equation. They all cut an identical amount of the same cardboard. The cardboard may change from test to test, and even the amount of cardboard may change from test to test. But each knife cuts the same amount of the same cardboard as the other knives during a particular test. Then the edges are observed under strong light and 10X magnification and compared to each other.

Results after the first cut: I've cut ten feet of cardboard with each knife (70 feet total, my forearms are feeling it).

All the knives are dulled somewhat, though they will still shave hair, except the S30V. I can draw no conclusions from what I see so far. (Did I mention that at this stage of the game, they're all very similar?)
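The running totals quoted in these results follow from seven blades (five standards plus P1 and P2) each cutting ten feet per round. A minimal sketch of that bookkeeping; the blade labels and the function name are made up for illustration:

```python
# Labels are illustrative: standards 1-5 plus the two test blades.
BLADES = ["std1", "std2", "std3", "std4", "std5", "P1", "P2"]
FEET_PER_BLADE_PER_ROUND = 10  # figure reported in the post


def cardboard_totals(rounds):
    """Return (feet cut per blade, feet cut across all blades) after `rounds` rounds."""
    per_blade = FEET_PER_BLADE_PER_ROUND * rounds
    return per_blade, per_blade * len(BLADES)


for r in (1, 2):
    per_blade, total = cardboard_totals(r)
    print(f"after round {r}: {per_blade} ft per blade, {total} ft total")
```

The point of writing it down this way is only that every blade sees the same footage in every round, which is the constraint the test depends on.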

Results after the second cut: I've cut another ten feet of cardboard with each knife (140 feet total, my forearms are now really feeling it). Time to draw some conclusions:

I do this by sorting them from worst condition to best condition, by observing the edge under bright light and magnification.

test_1a.jpg


The S30V knife in this test (standard 3) is relatively dull at this point. The edge is wider and more washed out than the other blades and it won't shave hair. However, it is the only blade in the test without little shiny spots. This knife is pretty consistently a relatively low performer in these tests (compared to the other test knives). But don't get me wrong, it has been my EDC for years, it is quite good. So you see, everything in this test is relative.

Next is the blade with known issues with RA (standard 5). It is almost, but not quite the same as the other blades. I know from experience this blade will do okay until the last test. It has lots of little shiny spots. Shiny spots can be an area of RA that flattened out or rolled over, or it can be a small chip, or it can be an area where a band of carbide pulled out. They show up well under bright light.

Next is the high quality knife in D2 (standard 2). This illustrates the fact that this is not always a perfect test because this blade will frequently beat out all the other blades in the test. However, on this particular day in this particular cardboard it didn't. It will not shave hair either.

The rest of the blades are too close to call. I can see no meaningful difference at this point of the test. They all have a few small shiny spots where the edge either chipped, rolled or a carbide pulled out, but these imperfections are on a very small scale. They all still shave hair.

The next test is where they'll really sink or swim.
 
The next part of the test is cuts in leather.

Leather is a natural material. It varies in thickness, hardness, and silicate content. I attempt to spread this variation out across the test blades by making one cut with each blade at a time as I go across the sheet of leather.

I have run out of the big nasty chunk of leather I used to use for this part of the test, so here I'm using some thinner softer leather, so it takes more cuts to dull a knife. Each blade cuts approximately 6-7 feet of leather (20, ~4" cuts)
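The one-cut-per-blade rotation across the sheet amounts to a round-robin schedule, and the quoted length is just cuts times cut length. A small sketch under those assumptions (labels and names are illustrative, and the 4" cut length is approximate in the post):

```python
BLADES = ["std1", "std2", "std3", "std4", "std5", "P1", "P2"]
CUTS_PER_BLADE = 20   # from the post
CUT_LENGTH_IN = 4     # approximate, from the post


def cut_schedule(blades, cuts_per_blade):
    """Yield (pass_number, blade) so every blade makes one cut per pass.

    Interleaving this way spreads local variation in the leather
    (thickness, hardness) across all blades instead of onto one.
    """
    for pass_number in range(1, cuts_per_blade + 1):
        for blade in blades:
            yield pass_number, blade


schedule = list(cut_schedule(BLADES, CUTS_PER_BLADE))
feet_per_blade = CUTS_PER_BLADE * CUT_LENGTH_IN / 12

print(len(schedule), "cuts in total")
print(f"about {feet_per_blade:.1f} ft of leather per blade")
```

Twenty cuts of roughly four inches works out to a little under seven feet per blade, which agrees with the 6-7 feet quoted above.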

pile_test2.jpg


Time to draw some conclusions, I do this by sorting them from worst condition to best condition, by observing the edge under bright light and magnification.

leather_test.jpg


None of these knives are as dull as they appear in this picture. Highlighting the edge with a bright light and taking a photo of it makes the little flaws look like small parking lots. With the exception of the S30V, these all still cut well.

The S30V (standard 3) is doing poorly, and by a large margin. It failed to completely cut some of the cuts. I can run the edge across my fingertip without fear of cutting myself. I included it in this test because it has similar geometry and is a high carbide steel like the test subjects. But I also included it because, being one of my standards, I already knew it wouldn't perform well and I wanted y'all to see this. When I first started doing these tests I would get a large spread in the performance of the test blades. But at this point I'm splitting hairs, and the differences are subtle. I put this stainless folder into the mix so you would at least see some of the spread I used to see when I first started doing this. And, hey look, D2 gave S30V a major spanking. *grin*

Next is the blade with known issues with RA (standard 5). Although this blade is the same steel, with the same geometry and hardness as the other blades, it is beginning to show its weakness here.

Next blade is both P1 and P2, my test subjects. This is disappointing because it means the tweak I made to my process did not have the effect I hoped that it would (I had hoped these would rise to the top). Sigh.

Next is my skinning knife, (standard 1)

Next is a knife from my last batch (standard 4)

And the blade with the least amount of wear is the high quality knife in D2 (standard 2). I'll give credit where credit is due: this knife is a Bob Dozier in D2. On this particular day it outperformed my knives on the leather test. However, at this point, these knives are so similar that the order varies from test to test. I'm picking nits here, but calling it as I see it. They're very very similar, and so far represent the ultimate in D2 that I have been able to find. I keep looking for improvements, but so far it appears I've picked all the low hanging fruit.

One more test to go.
 
The last test is in some ways the best test, because there is usually a wide spread and little ambiguity. But it is also a somewhat less relevant test because it is an extreme test and doesn't really represent normal use. This is my hardwood whittling test. It tests edge stability, which is not D2's strength. This is an area where the W2 standard walks away from the others. It is also the area where I have made the most gains in my process.

This test chips and rolls the edge. It is the most difficult to perform evenly, so results have to be taken with a grain of salt. I carve a piece of wood with each blade, going from knife to knife, *trying* to keep my technique the same. After each knife has made five cuts I inspect the edge.

osage1.jpg


This test is brutal on a very thin, relatively hard, high carbide steel like this. I push the edge into the osage orange with my thumb while twisting the knife to carve out a little divot. It is unfair to the thinner sharper knives because they dig in more and can be more easily damaged, but in this case all the blades are thin, and I use the sharp area behind the belly that didn't get used in the other cuts so it is mostly apples to apples.

All of the cutting edges are somewhat damaged; none emerged completely unscathed, but it mostly requires a bright light to see the flaws. Every knife except one performed well, and at this point I'm mostly just splitting hairs. But one stands out as quite bad, an order of magnitude worse than the others. This is very very interesting, because it is the same steel, same geometry and same Rockwell hardness as other blades that do fine. What is the difference? *retained austenite*. This blade (standard 5) received a snap temper before being allowed to complete its quench. This was the standard D2 HT I received when I sent a batch out for HT several years ago to a respected heat treater. It represents the industry standard for D2. This is how many D2 knives out there are treated, and this is a reason D2 has a reputation for taking a lousy edge and holding it. The best D2 knives are treated in ways to address its tendency to retain stabilized austenite. It's fugly. You don't need magnification; you can see the edge damage with the knife held at arm's length.

arm's length:

arms_length.jpg



detail:

detail.jpg



--------> *It is the same steel, geometry and HRC as the other blades.* <-----------

Folks, this is what I've been preaching about around here for years. Maybe you want to snap temper your simple steels, but don't do it with D2. Freeze it immediately. (Mf is generally listed around -100F)
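For anyone working in Celsius, here is a quick check of how the quoted Mf figure compares with dry ice. This is a sketch only: the -78.5 C value is the sublimation point of CO2 at one atmosphere, and the temperature a blade actually reaches in a given dry ice setup is not something this calculation can tell you.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


MF_F = -100        # Mf as quoted in the post
DRY_ICE_C = -78.5  # sublimation point of CO2 at 1 atm

mf_c = f_to_c(MF_F)
print(f"Mf of {MF_F}F is about {mf_c:.1f}C")
print("dry ice is colder than the quoted Mf:", DRY_ICE_C < mf_c)
```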

Next in line for loss of edge in the hardwood whittling test is the high quality D2 knife (standard 2). This is an excellent knife, and this test in no way shows any inadequacy at all. Comparing my work to Mr Dozier's for the last few years has been consistently humbling. I have a tremendous amount of respect for his accomplishments.

Next is the knife from my last batch (standard 4).

Next is test sample P1

Next is test sample P2

I'm pleased to see the test specimens performing well at the fine edge stability. These are hard thin knives in a high carbide steel tolerating edge abuse I would have thought unthinkable for something like D2 10 years ago.

Next is my personal skinning knife, (standard 1). It does not generally test so well on this part of the test. This is what I'm talking about when I say there is normally some ambiguity in these tests, particularly when the subjects are very similar, and especially on the hardwood test. It is very possible I simply didn't press it as hard as the others. And the same can be said for any blade in this test. I try to be objective and repeat exactly, but I am human.

However, this test did very clearly make the one blade with trouble stand out (standard 5, with RA). This is the purpose of these tests, and it was the point of this thread to illustrate this.

I'm not subjecting the S30V knife (standard 3) to this part of the test. I have done it before, and I don't want to mess it up.
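The three worst-to-best orderings can be laid side by side as rank tables. The sketch below is one way to do that, not a method used in the thread: tied blades share the mean of the positions they span, and the labels and function name are illustrative.

```python
# Orderings as read from the post, most edge wear first. Inner lists are ties.
TESTS = {
    "cardboard": [["std3"], ["std5"], ["std2"], ["std1", "std4", "P1", "P2"]],
    "leather": [["std3"], ["std5"], ["P1", "P2"], ["std1"], ["std4"], ["std2"]],
    # std3 (S30V) sat this one out. P1 and P2 are listed back to back and were
    # close, so they are treated as a tie here (an assumption of this sketch).
    "hardwood": [["std5"], ["std2"], ["std4"], ["P1", "P2"], ["std1"]],
}
BLADES = ["std1", "std2", "std3", "std4", "std5", "P1", "P2"]


def fractional_ranks(groups):
    """Map blade -> rank, where 1 means the most edge wear.

    Blades in the same group share the mean of the positions they occupy.
    """
    ranks = {}
    position = 1
    for group in groups:
        shared = position + (len(group) - 1) / 2
        for blade in group:
            ranks[blade] = shared
        position += len(group)
    return ranks


table = {name: fractional_ranks(groups) for name, groups in TESTS.items()}

print("blade " + " ".join(f"{name:>9}" for name in TESTS))
for blade in BLADES:
    cells = [table[name].get(blade, "-") for name in TESTS]
    print(f"{blade:<6}" + " ".join(f"{cell:>9}" for cell in cells))
```

Reading across a row shows how much a blade's position moves between tests. Standard 5 sits near the worst end in all three, while standards 1 and 2 swap ends between leather and hardwood, which is the ambiguity described above.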


Test blade P1 appeared to hold up a little better than P2. I think P1 might be the one that got prequenched from 1600 for grain refinement. I'm gonna pull off the tape and see...

nope.

Not necessarily a dead end - but perhaps another example of something, like full cryo, that looks promising on paper, but didn't actually yield noticeable improvement in real use.

This concludes this test.

I can't speak for other steels like I can for D2. But I can say with some confidence that those of you who put too much faith in "industry standard" HT and the heat treaters we all use without actually testing what you're doing and comparing it to known good work could very easily have a problem and not know it. And, at least with D2, there was a lot of performance left on the table. I haven't made any big gains in years, but I keep working at it.
 
Nathan,

Thanks for taking the time and effort to test and write up your test; very informative.

David Sharp
 
Very interesting, Nathan. Thanks for testing and showing the results. So you are saying take the blade out of the foil, check straightness and correct if necessary, then go directly to freeze treat? Do you mind telling us what your austenitizing temp is and your soak time, along with tempering times and temps? Thanks.
 
That's what I call quality entertainment. Very thoughtful and informative.

I want to hear more about your tests in which S30V does well, and W2.
 
Darrin,

The test subjects were not heat treated by me. I do still heat treat D2 here, in fact I did some this weekend. I have a good reliable Lindberg Blue oven and a good reliable process. However, my couple thousand dollar investment in a heat treating setup pales in comparison to a really good industrial set up, so I'm working with a heat treater who is willing to work with me. I specify every moment and every temp of the process, ramp rates, quench rates, timing into freeze, time in freeze, cryo, tempering etc. I'm doing this because I believe there is more in D2 than I am able to get out of it with my setup at this point.

Short answer: austenitizing time is 30 min once the blade is at temp (about 45 minutes after I start the ramp up). Temp is 1850F. These are very standard values and they work. I leave the blade in the foil, with the seams to the side where they don't interfere, and plate quench. The rapid quench is important. I check for straightness and go directly into dry ice. Temper was at 450F, 2X. Not rocket science.
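Written out as data, the short answer looks something like the sketch below. It takes the temperatures as Fahrenheit, and the `Step` class, field names and step labels are all invented for illustration. Values the post does not give (temper duration, time in dry ice, ramp rates) are left unset rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


def f_to_c(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


@dataclass(frozen=True)
class Step:
    name: str
    temp_f: Optional[float] = None   # None where the post gives no temperature
    minutes: Optional[float] = None  # None where the post gives no duration
    repeats: int = 1


SCHEDULE = [
    Step("austenitize, soak once at temp", temp_f=1850, minutes=30),
    Step("plate quench, still in foil"),
    Step("check straightness"),
    Step("dry ice"),
    Step("temper", temp_f=450, repeats=2),
]


def describe(step: Step) -> str:
    """Render one step, adding a Celsius equivalent where a temperature is known."""
    parts = [step.name]
    if step.temp_f is not None:
        parts.append(f"{step.temp_f:.0f}F / {f_to_c(step.temp_f):.0f}C")
    if step.minutes is not None:
        parts.append(f"{step.minutes:.0f} min")
    if step.repeats > 1:
        parts.append(f"x{step.repeats}")
    return ", ".join(parts)


for step in SCHEDULE:
    print(describe(step))
```

Keeping the unknowns as `None` makes it obvious which parts of a recipe were actually specified and which still need to be pinned down before handing it to a heat treater.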
 
Salem,

Overall, the W2 is one of the strongest performers in the bunch. However the steel and the geometry are so different from my test subjects it didn't really fit into this particular test.

I have come to regard S30V as a bit over hyped as a "super steel". It does have excellent corrosion resistance though.
 
Thanks Nathan. I, like you, am a D2 fan. I am impressed and thankful for this information and your efforts. Jim
 
Thanks Nathan, that's exactly how I've been doing it. Just wanted to be absolutely sure I was using the proper times and temps. Thanks again.


Darrin
 
Thanks Nathan, I too go directly to the freeze; the only thing different is I'm using frozen plates. Though I have not done such exhausting tests, I've carried AND USED my knives for years. To just go by a number or RC number would be pretty sad. The alphabet soup steels? There's going to be a new one next week. W2 stands alone.
Ken.
 
Good read, thanks Nathan. HRC values do not tell the whole story and can even be misleading, this kind of testing puts the rubber to the road, so to speak. Even numbers need to be verified.
 
But I can say with some confidence that those of you who put too much faith in "industry standard" HT and the heat treaters we all use without actually testing what you're doing and comparing it to known good work could very easily have a problem and not know it. And, at least with D2, there was a lot of performance left on the table. I haven't made any big gains in years, but I keep working at it.


Nathan, this is the first time I've understood why you took your position railing against the science fundamentalists... it's still not a tenable position in my eyes, as it assumes that the science guys aren't bothering to do any testing. However, I guess if the layman doesn't bother doing any further research after taking the science guy's word for it, the results are only barely better than the usual north-facing slop-quench magnet/eyeballer guys.

Great test! I kind of wish you had managed a few more stainless blades in there to provide a good comparative to these, however I completely understand how bad your forearm and wrist have got to feel!

Although my findings with CPM S30v aren't quite as miserable as yours, they're certainly not that far off, either. There's a time and a place for the stuff, but only when you can't use one of the other high end tool steels!
 
Great test Nathan! I've struggled for many years with D2 HT and have had much better results when sending out to a pro like Paul Bos. I follow a similar method as to what you posted but have always wondered the optimum ramping and pre-soak times and temps. You specified that you give your heat treater exact ramp rates. What do you use here for ramp rates and pre-soak steps and how much difference have you seen when following these pre-steps?

Again Nathan, thank you for taking the time to share your results!
 

I'm not really on one side of that or the other. I have come to better understand the points of view of both sides. I do think a lot of the "anti scientific" rhetoric is tongue in cheek.

If you'll indulge me, some background for the benefit of anyone who hasn't been into this very long:

A long time ago folks began to revive the art of bladesmithing. They didn't have all the facts but they did pretty well. Eventually, in the vacuum of good solid knowledge, some of them spread information that wasn't totally accurate. This wouldn't be so bad, but some of these folks would use shoddy "science" as marketing to differentiate their work as "better" in the marketplace. And this wasn't fair to the folks working hard to make genuinely better work without the hype. So along came a bright young new guy who realized that a lot of the information was BS. So he set out to learn the metallurgy and apply it to his craft so he could illuminate reality for the rest of us. In my opinion, by helping to wash out some of the ludicrous BS that had accumulated, he did us all a great service.

I came into all of this at a point when everyone was still kind of standing around, laughing at the ridiculousness of the old misinformation. The trouble is, not all of the old people and all of the old ways were ridiculous.

I might snicker a bit if I were to watch my old stepfather dust off an old HSS corn cob mill and start slowly hogging out a work piece with it. There are now better ways to remove bulk material than that. But I better not snicker too loud because it is not my place to show him disrespect. Not only did he teach me a lot of what I know, he is still twice the machinist that I am. He can't operate the CNC, but he can make stuff that I can't.

My point being, some of what the old guys "knew" was wrong, but it isn't our place to ridicule them. Wait... that isn't my point at all...

My point is that a lot of the stuff that I was learning that was "the wrong way to do something" wasn't necessarily wrong at all. A great example of this is what you quench in. Before our metallurgy guy came around we had people quenching in used motor oil concoctions and calling it the holy grail and you should buy their knife because they're so super they could cut Chuck Norris. So, the guru explains to the masses about pearlite and the perils of quenching in questionable fluids. He doesn't care what people quench in, he just wanted people to stop quenching in bear piss and claiming their knife is better because of it. I believe that all he wanted to do was get the facts out.

So, like I was saying, I came into all this when people were still standing around, slapping their foreheads and saying "good lord, can you believe there are people out there who try to sell their work as legitimate and they don't use Parks 50? Lord..." Our guru just wanted people to stop claiming their work was superior because they forge facing magnetic north. He didn't like his work disparaged for illegitimate reasons. But an unintended consequence of all this is that people like me, who were just walking into all this, came away from it all with the impression that people who quench in canola oil are making inferior work. I honestly thought that quenching in canola oil would make a less than optimal knife, and quenching in Parks would assure a better HT. In retrospect, I realize that I jumped to these conclusions myself without critically reasoning through all the facts. I was spoon fed and happy about it.

A lot of the "BS" isn't "BS". Some of it, like quenching three times, actually does something. And just because there are "better, proper, more industrially accepted" ways to achieve the same results doesn't make a person practicing this way foolish. Edge packing is probably foolish, but there have been enough "wrong" things that had some truth to them that I'll keep an open mind about anything at this point. Cryo, for example.

So, anyway, we now have an ironic situation where some of the people who do it "the old way" are being disparaged by the people with "knowledge" and "science" to back them up, and you have a true reversal of the roles. If a person says my knife is "better" because it was heated in a digitally controlled oven and quenched in some special oil, they're claiming a better product than the other guy who doesn't do it that way. But the reality is, these tools don't guarantee you a better knife, and the old tools don't necessarily mean there will be a problem. You still have to know what you're doing.

So, hence the "anti science" group. They're not really anti science. They just don't like people saying "my work is better than yours because I used science". Shoddy science used as a marketing tool is no different than bear piss. (and good lord, don't misunderstand me here, I'm not saying the science crowd are all using shoddy science.)

So, anyway, I don't have a dog in that fight. I personally take a scientific approach. I just hate to see people who have so much to contribute here treated as if they're not doing it right just because they haven't embraced all of the new ways of doing things.
 
Eric, I'll send you a PM.
 
Awesome writeup! So, who would you recommend to send D2 for heat treat? I have a couple bars I haven't touched because I can't heat treat them myself (I can do the heat treat, just don't have the foil or quench plates or liquid nitrogen...). Had just planned on sending them to Bos when I finally got around to using it.
 
So, anyway, I don't have a dog in that fight. I personally take a scientific approach. I just hate to see people who have so much to contribute here treated as if they're not doing it right just because they haven't embraced all of the new ways of doing things.

I appreciate that sentiment, Nathan. I haven't been at this for very long, but I have already developed an appreciation for the older, "less scientific" methods. Partially because I believe that, for me, bladesmithing is an art and, for me, the more I rely on calipers and temperatures that are digitally controlled to the third decimal point, the further I remove myself from the intuitive feel that makes forging a knife the practice of "art". Don't misunderstand, I'm a scientist by training and natural inclination, and I've studied voraciously the metallurgy involved in this process. I know the fundamentals and I understand how they play into forging a blade, but reducing it to a mere application of numbers on a chart and a bunch of digital readouts on quietly humming machines...well, that would remove it from "art" and move it into the category of "work". And since I'm a hobbyist only, I guess I can afford to be a little more focused on the former than the latter.

Plus, I'm cheap :).

I make my knives with a setup that probably cost no more than $400 total. I use 2 separate sections of RR track for my anvils. Not fancy, but they work for what I need them to. I have a POS Harbor Freight 1"x30" sander. I built a propane forge that uses a hair dryer for its blower. I use an old bench top drill press that I've had for 20 years. I've made all my own tongs.

I'm absolutely certain I could make more knives if I had a big-ass 2"x72" grinder setup and digitally controlled heat treating ovens, but I just plain don't want to spend the money. Instead, I try to learn how good blades were made before all that stuff was available. To be able to tell by eye that your steel is "transforming" (that wavery color change that goes through a blade when it's *just right*) is a skill I hope to learn to use more as time goes by. To me, that beats leaving it up to a machine, and there's an unmistakeable cachet to doing it the old way.

But I'm a rank newborn in the company of most of the guys here. I've made maybe 50 knives total, but I learn something new every time I do. Just last weekend, I differentially hardened a blade for the first time, and I'm now seeing what everyone talks about when they say the quench line gets more visible as you polish the blade to finer and finer degrees.

But I'm getting off track. My point is that I know that I must know the science or else it's all voodoo. I also know that amazing blades were made for millennia before digital controls, Parks 50 and liquid nitrogen were available. It is refreshing to hear you say that you, too, can appreciate both sides of the coin. Thank you.
 