Please understand that when I say that Mr. Stamp's testing isn't scientific, I don't mean to say that it is without value. You just have to understand what it is. Without truly scientific knife testing, those potential customers who want some evaluation of a knife they're considering have to make what use they can out of reports such as Mr. Stamp's.
To be scientific, testing must be repeatable. Two different testers must be able to conduct the tests at two different facilities and come up with the same results within acceptable experimental error. This pretty much means getting the human factor out of the testing, and that pretty much means machines of some sort. Consider, for example, chopping with a knife. I may swing harder than you do. I may swing with a different "english," if you will, a different style. So, we have to get the human factor out of the test. Consider this suggestion:
Some standardizing body would have to design this device and produce a fully-detailed drawing package of it. Anyone who wanted to could, for a small fee, purchase a copy of that drawing package. Then, any machinist could follow those drawings and assemble an exact copy of the machine. My machine would be the same as Spark's machine in every way.
The knife under test is rigidly attached to the machine's arm, probably by drilling holes in the knife and bolting it on. The machine's arm is raised up and the energized electromagnet holds it. When the magnet is switched off, the weight pulls the arm down, making a perfectly repeatable chop every time. Each chop would be the same as the previous one and, OH AND THIS IS WHAT'S IMPORTANT: each of my chops would be the same as each of Spark's chops even though I'm in Oregon and he's in Florida, even though I skipped breakfast but he hit the buffet at Shoney's... several times, even though I'm a tough weightlifter and he's a wimp.
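To see why a drop like this is repeatable, here's a rough back-of-the-envelope sketch in Python. It is only an illustration under assumptions of my own, not a spec: I'm assuming the arm is a uniform rod pivoting at one end, the weight sits at the tip, the chop lands when the arm is horizontal, and friction and air drag are ignored. The function name and all of its parameters are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (varies slightly by location)


def chop_energy_and_tip_speed(arm_length_m, arm_mass_kg, weight_mass_kg,
                              release_angle_deg):
    """Estimate impact energy (J) and tip speed (m/s) of the falling arm.

    Assumptions (not from any real standard): uniform rod pivoting at one
    end, point weight at the tip, impact at horizontal, no friction.
    """
    theta = math.radians(release_angle_deg)
    drop = math.sin(theta)  # height lost per unit of distance from the pivot

    # Potential energy released: the rod's center of mass is at L/2,
    # the weight is at L.
    energy = G * drop * (arm_mass_kg * arm_length_m / 2
                         + weight_mass_kg * arm_length_m)

    # Moment of inertia about the pivot: rod (m L^2 / 3) plus point mass (m L^2).
    inertia = (arm_mass_kg * arm_length_m ** 2 / 3
               + weight_mass_kg * arm_length_m ** 2)

    omega = math.sqrt(2 * energy / inertia)  # angular speed at impact, rad/s
    return energy, omega * arm_length_m


# Illustrative numbers only: 2.75 ft is about 0.838 m; the masses and
# release angle are placeholders, not proposed values.
energy_j, tip_speed = chop_energy_and_tip_speed(0.838, 4.0, 2.0, 60.0)
print(f"impact energy ~ {energy_j:.1f} J, tip speed ~ {tip_speed:.2f} m/s")
```

Nothing in that calculation depends on who flips the switch. Under these assumptions, if the drawings fix the arm length, the masses, and the release angle, the energy delivered to the knife is fixed too.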
And what would the knife chop into? We'd have to standardize that too. And it probably wouldn't be a natural material such as wood, since there's too much variability. My guess is that our standardizing body would select a specific type of plastic and we'd use standard-size pieces of it.
Now, with this sort of mechanism, we could conduct scientific tests. My results would be virtually the same as Spark's.
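What "virtually the same" might mean in practice can be sketched in a few lines of Python. This is just one simple way to do it, assuming each tester records the depth of cut for a series of chops and the standardizing body has published an acceptable tolerance. The function name, the choice of cut depth as the measurement, and the tolerance are all my own assumptions.

```python
from statistics import mean


def results_agree(depths_a_mm, depths_b_mm, tolerance_mm):
    """Return True if two testers' average cut depths differ by no more
    than the allowed tolerance (hypothetical acceptance rule)."""
    return abs(mean(depths_a_mm) - mean(depths_b_mm)) <= tolerance_mm
```

If my average and Spark's average fall within the tolerance, the test has been repeated successfully. If they don't, one of the machines or one of the test pieces is out of spec, and that's something we can go find.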
Some may argue that such testing wouldn't be "real world." But if we design the tests carefully, they'd be close. For example, if the machine's arm were about 2.75 feet long, about the length of an average man's arm, if the weight of the arm were specified to be about the weight of an average man's arm, if the weight that pulls the arm down were selected to produce about the same force as an average man chopping with a knife, and if the standard material to chop into were selected to be about the same hardness as an average tree limb, then the test could approximate real-world performance and still be highly scientific. Then, I could really compare two knives side-by-side and Spark could duplicate my tests and get the same result. Then, we'd start to have science!