What's The Forum's Opinion Of Mad Dog?

Originally posted by Mr. Dave:
Wow! This thread reminds me of grad school. In all the years I was there, though, I never heard anyone define a perfect test as one that could be analyzed statistically, whatever that means. I suppose that means you have at least 30 samples. That's an interesting definition of perfect, especially considering that the number 30 is used because it's the nearest nice round number of samples yielding a Student's t distribution with enough degrees of freedom so that it's "statistically close enough" to a normal distribution.

It certainly wasn't me who said anything about a sample size of 30. t tests can be used on very small sample sizes. I don't have the tables here at home, but I believe that if you have a homogeneous population to sample from, you can go down to 5, maybe even 4 or fewer; I can't remember exactly.

Statistically, a "perfect test" is one you can test statistically. I'm not saying you can't do tests that do not require statistics (how many times do I have to repeat this?). There are tests and there are tests you can analyze statistically. There is a difference. Not all data can be statistically analyzed because it has to meet certain criteria (repeating myself again). Sample size happens to be one of those. It doesn't mean you can't learn from 1 sample or 2 samples. It just means it's impossible to ANALYZE STATISTICALLY. You can wave your hands at the data, you can point at it, you can speculate about it, you can combine it with other data and make strong inferences, but some data you just can't ANALYZE STATISTICALLY. This is the only point I'm trying to make.
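
To make Hoodoo's point concrete, here is a minimal sketch (Python with numpy/scipy assumed; the blade scores are invented for illustration): with a single observation the sample variance, and therefore any t statistic, simply does not exist, while two observations is the smallest sample a t test can digest.

import numpy as np
from scipy import stats

one_blade = np.array([87.0])            # hypothetical score from a single test
two_blades = np.array([87.0, 91.0])     # hypothetical scores from two tests

print(np.std(one_blade, ddof=1))        # nan -- the sample variance needs n >= 2
print(stats.ttest_1samp(two_blades, popmean=80.0))  # n = 2 is the smallest sample a t test can use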

If it can be analyzed, please show me the specific statistical analysis and I will fold my tent and slink away. I don't want another analogy, I want THE data--statistically analyzed. I'm not from Missouri, but you're just going to have to show me.


------------------
Hoodoo

The low, hoarse purr of the whirling stone—the light-press’d blade,
Diffusing, dropping, sideways-darting, in tiny showers of gold,
Sparkles from the wheel.

Walt Whitman
 
Gentlemen,
In the not-so-far-away Middle Ages, one of the great points of contention was: how many angels can dance on the head of a pin?

Now, the first problem that arose was a determination of the size of the pin's head.
And so it goes.......
:)

Dan
 
Dave :

I never heard anyone define a perfect test as one that could be analyzed statistically

I have never heard anyone claim to have done a perfect experiment, but data collection and experimental design are a vital part of statistics. Even analysis is never perfect, as subjective decisions have to be made which draw upon your experience. Statistics is not a methodology which allows you to build a black box and churn out conclusions - however, a lot of people think it is.

As for blade sharpening, I sharpen over 100 knives a month, mostly mine, and mostly not because it is practical or needed. Often it is just so I will have 100% performance before I make a comparison, so I have to regrind the edges. A lot of this is experimenting with different methods and grits as well.

Hoodoo, I never said makers should be perfect; in fact, I described the opposite several times. To clarify once again - a competent maker would have a rigid QC process in place so that the variance in the population of blade performance would be very small and the population itself would be very tightly cropped. This results in a one-data-point sample being a very robust estimator of the mean (with the dependency that the maker be allowed an examination in the very rare case of an outlier [which will be far below your 5% cutoff], that common sense be used - such as not judging current ability by a very old review, etc. - and the rest of what I said above).
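
As a rough illustration of Cliff's low-variance argument (the performance score and spread below are invented numbers, not anyone's test data), a quick simulation shows how seldom a single blade lands far from the population mean when QC keeps the spread tight:

import numpy as np

rng = np.random.default_rng(0)
mean, sigma = 100.0, 2.0                                # assumed performance score and a tight QC spread
single_blades = rng.normal(mean, sigma, size=100_000)   # many repetitions of "buy one blade"

within_5pct = np.mean(np.abs(single_blades - mean) <= 0.05 * mean)
print(f"single blade lands within 5% of the true mean {within_5pct:.1%} of the time")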

Back to McClung's blades: in addition to the reference I made to Allen Blade being able to offer the features Tickblade described, Ed Schott is another such maker. For some reason I had the impression that his prices were much higher, which is why I never brought him up. He also has a very low wait time - which probably will not stay that way for very long.

-Cliff
 
Hoodoo, I never said makers should be perfect; in fact, I described the opposite several times. To clarify once again - a competent maker would have a rigid QC process in place so that the variance in the population of blade performance would be very small and the population itself would be very tightly cropped. This results in a one-data-point sample being a very robust estimator of the mean (with the dependency that the maker be allowed an examination in the very rare case of an outlier [which will be far below your 5% cutoff], that common sense be used - such as not judging current ability by a very old review, etc. - and the rest of what I said above).

Cliff, I think we will have to agree to disagree. As I read it, your assumption of 1 as a robust sample size would brand any maker that has ever had a blade returned due to failure as incompetent, i.e. if I buy a knife from maker X (my sample size of 1) and it fails under normal use, then by your reasoning, the maker is incompetent. Sounds like a perfect world to me.

I wonder how many custom knife makers meet this criterion? I read not too long ago about a maker who used a certain brand of epoxy to glue his handles on, and a bunch of them were being returned because the handles were popping off. He changed epoxy brands. Is he an incompetent maker, or is he learning from experience and increasing the variance in the population of knives he has produced?

Another time, someone complained about a high-dollar folder he had bought from a maker, and the maker replied that he had been going through some health problems at the time and this knife slipped through his QC. Is he an incompetent maker, or is this the kind of variance we can expect in a population? Do knife makers have personal and financial problems that could affect their work, and if so, should we brand them as incompetent based on a sample of their work of n=1?

Anyway, as I said before, I think we will have to agree to disagree. And I think we only disagree on what constitutes a robust sample size for estimating the mean of a small, specific population.

I'm in no way downplaying your tests, which I find fascinating. And when you combine your work with Turber's and other anecdotal information, the arguments are compelling. But when you say "I think these tests meet the 5% criterion," as opposed to "I calculated the p value," those are two different claims. The first is opinion; the second is a mathematical calculation.
 
I find Cliff's assessment of Mad Dog's Knives very interesting, considering his attitude towards knife performance in general. I remember one time when he was over at my house for supper...
At first he was the perfect guest, he had a glass of 7-up and was quite polite in his table manners. As soon as we gave him his steak, however, he became very cantankerous. He just stared at his plate for the longest while, so I asked him what was wrong. He told me that he refused to eat with a common steak knife, explaining that in his house he won't allow a knife in the kitchen until it has passed his requirements. Not wanting this to ruin the evening, I got out my grandmother's old steak knife set, figuring that he would respect the detailed embroidery on the blade, and artful etching on the hardwood handles. I passed him the knife, figured that he would be impressed, and anticipated that we would get on with the evening.
No sooner had the knife changed hands when Cliff bolted from the table, down the hall and out the back door! I followed after him only to find him on his knees in my backyard digging furiously into the ground with my priceless heirloom! Deeply in shock I just stood and watched the whole scene unfold.
Cliff dug for a few minutes and then ran his fingernail across the edge of the once untarnished blade. He then yanked a yellow-papered notepad out of his breast pocket and started scribbling frantically. Without wasting any time he then ran across the lawn to the edge of our property and found two old boards that were nailed together. He placed the board on the ground, kneeled on it with one knee, and began attempting to pry the two boards apart with my grandmother's steak knife! The boards moved only a small amount when the knife began bending. Just when it appeared as if the knife could bend no further, Cliff again yanked out his notepad (which I could now see had a picture of Blossom on the cover) and began his shorthand scribbling again.
With that over with, Cliff then began chopping at the leg of our picnic table with the knife. By now his focus had deepened even more, and he began making some strange guttural noises with each downstroke. At about fifty chops he grew quiet. His head rotated a full 180 degrees to look straight at me standing directly behind him. He began explaining to me that when chopping one must not use the limp-wristed grip that some people advocate, but rather a firm, powerful grip that can only be developed through training. I didn't know what to say, but as I was clumsily searching for words, Cliff collapsed in a heap by the picnic table.
I got him into the house (which was no easy task, considering his frugal eating habits since amassing an entire set of "acceptable" kitchen knives) and began wiping his head with a cold cloth. Eventually, Cliff was coherent enough to explain to me that he had not taken his Gingko Biloba for a few days and apparently had collapsed of nervous exhaustion. He motioned to his bag, where I found the Gingko Biloba and some Ginseng, which I administered to him immediately.
Curiously, I also found in his bag a few bran muffins (which appeared to have been there for some time), a 1997 spiders calendar, some sharpening stones and a Mad Dog Tusk. In the bottom of the bag Cliff had left himself a little note. It was a warning not to use the Tusk for chopping or prying, as this voids the "unconditional" warranty; in addition, it warned not to attempt to cut anything of any hardness, such as staples or bone, as this might cause the blade to chip or fracture. I guess that's why he didn't trust the Tusk to cut up his steak.
 
I wouldn't say that 1 knife spells incompetence, because nobody's perfect. But 2 in a row, one of which was hand picked to replace the first defective one doesn't exactly do much to bolster confidence, IMHO....

Second, we should all strive to make improvements in whatever we do. Given all of the information we know about 01, hard chrome, and the epoxy / G10 bonding, do you think that improvement is honestly being made in these areas?

Spark

------------------
Kevin Jon Schlossberg
SysOp and Administrator for BladeForums.com

Insert witty quip here
 
>>Some months ago there was a major kerfuffle about Mad Dog knives after some testing was done by a BF member. The manufacturer of MD knives took it personally and accused the tester of using a stolen, inferior knife. Things just got worse from there. Flame wars ensued, threads were locked, and on other forums certain BF members were banned or had their posts deleted. Things got personal, nasty and they dragged on. NO ONE HERE WANTS TO SEE THAT HAPPEN AGAIN!<<

I want to see that happen again, cause I didn't catch all of it the first time
:)
 
Originally posted by Spark:
I wouldn't say that 1 knife spells incompetence, because nobody's perfect. But 2 in a row, one of which was hand picked to replace the first defective one doesn't exactly do much to bolster confidence, IMHO....

I couldn't agree more. It makes a lot of common sense. It's just not a statistical analysis/conclusion.
:)
I never said it had to be, I just said it wasn't.



------------------
Hoodoo

The low, hoarse purr of the whirling stone—the light-press’d blade,
Diffusing, dropping, sideways-darting, in tiny showers of gold,
Sparkles from the wheel.

Walt Whitman
 
Tuff,

Pay attention.
;)


I have conceived of the ultimate test to put the Mad Dog knife corrosion issue to bed once and for all. We lock Sparky in a flying saucer with only a Mad Dog knife, a big tank of sea water, and as many rocks and boards as he wants. We give him a week's worth of food and water, and he can only escape by removing the handle and using the strangely grooved tang to unlock the flying saucer door. My money says he never makes it out.

Now there's a test that I would donate one of my own Mad Dog knives for. Hee, heeee!
 
Hoodoo, I think you have hit the nail on the head with one of your above posts.

Scenario 1 - Your ONE knife failed. I don't recall Cliff ever making the comment that ONE failed knife made a maker incompetent. I can't think of any of us drawing that conclusion. However, when the maker admits that the first knife was defective, hand-picks a second, THAT knife fails under the same tests, and then the maker starts yelling about how his "can and has done it all" knife got abused, it DOES raise an eyebrow or two. I still wouldn't call a maker incompetent at this point, but I would do more digging into what was going on. Something like, "I wonder if 01 taken to 62 Rockwell is too brittle for a knife this size that's supposed to be able to 'do it all'?"

Scenario 2 - The maker had some epoxy problems. Was it the maker's fault? Perhaps, perhaps not. Either the maker didn't do his homework to see if there were other, BETTER options out there (not unlike a knifemaker I could name), or he was sold by the marketing hype after hearing that this epoxy "could and has done it all." (Like some knife buyers I could name.) Whichever is the accurate statement, one important theme rings through: THE MAKER ADMITTED HIS FAUX PAS AND CHANGED HIS PRODUCT FOR THE BETTER.

Scenario 3 - Another maker admitted fault and made it right. Wow, what a concept.

All that arguing about what makes a statistical sample is null and void in the face of the question asked. I don't have to have my hand burned (or my wallet) more than once to decide that, statistically, the experience sucks.

What is MY OPINION of Mad Dog?

As a company - I have handled a few of his knives and they are a dream in the hand. Do I like the 01 steel he uses? Nope. (Especially when it's that hard.) Do I like or trust hard chrome? Nope. Not after my ONE BAD experience with it. (Yeah, statistically that isn't a valid sample, but tell that to my ruined .45.) Do I like his prices? Nope. For me there are better values out there. Do I like his sheaths? Yep. Very nice. Would I buy one? Not one of his 01 models. Perhaps a ceramic neck knife. But to get a warranty I would have to be the original retail purchaser. (That REALLY sucks.)

As a man - I haven't seen a lot in his behavior on the forums that impresses me. Statistically, this might not mean a lot, but I am working with the only data I have. I would liken it to my experience with my wife's grandmother. I only knew her after she was old, sick, and a class-A witch to be around. I didn't want to be around her, I didn't like her, and I didn't like the way she treated my wife or the rest of the family. I didn't even want to go to her funeral. The family was upset with me because they insisted that she was a sweet lady. Problem is, I never saw it. Some people say Mad Dog is a great guy. Maybe so, but I've yet to see it.

It isn't the failing of the product that makes me shy away, it's the failing of the "man" behind the product.
 
HooDoo :

As I read it, your assumption of 1 as a robust sample size would brand any maker that has ever had a blade returned due to failure as incompetent

Which is why I stated (and you quoted):

with the dependency that the maker be allowed an examination in the very rare case of an outlier

In fact, you don't even need to do this if you just want to settle for a 95% confidence conclusion (your particular choice) - unless, of course, you think that a maker on average screws up 1 out of every 20 blades. So not only do I meet your 5% cutoff, I easily exceed it, and my interpretations will be much more robust. Of course, if you have some information that leads you to believe that there are makers screwing up 1 out of every 20 blades or more, please post it; I would assume such information would be appreciated by all.
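
For what it's worth, the arithmetic behind the "1 out of every 20" remark is just this (a one-line sketch; the fault rate is an assumption used for illustration, not a measured figure):

fault_rate = 1 / 20          # assumed worst-case rate of botched blades
print(f"P(one randomly bought blade is sound) = {1 - fault_rate:.0%}")   # 95%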

Statistics must include everyday information to make inferences about the population. You don't just blindly use collected data. For example, if you are trying to determine the average weight of 2-year-old pigs and one in the sample has only 3 legs (life on a farm isn't all roses and sunshine), you don't include that pig, as normal pigs have 4 legs. Of course, if pigs could randomly have 3 or 4 legs, you would include it.


[referring to Spark's comment]

It's just not a statistical analysis/conclusion

Can you quote me some references defining statistics in the way you are presenting it? I have three books in front of me which give a very different definition. Here is one: "Statistics refers to the methodology for the collection, presentation and analysis of data, and the use of such." One book is a pure math analysis, very formal with complete background formulation; one is applied, which is mainly method but very robust; and one is just for Business/Economics and is very simplified.

Here is an example of a trivial statistics problem. You have a die in front of you. It either has all ones, or 3 ones and 3 twos. You can only see one face at a time. You are asked to determine which type of die you have. You make a roll and it turns up a two. This is a sample size of 1. Can you reach a definite conclusion about which type of die you have? (The answer is yes.)

Here is a second question, and a more interesting one. You keep rolling and it keeps coming up ones. Can you reach a definite conclusion about which die you have? (The answer is no.) What can you do - make a conclusion and state "This die is all ones, with an X level of confidence." Of course, if you are given N such dice and this keeps happening, eventually you will be wrong. Such is the nature of statistics, and everything else.

The first situation is no more or less a statistical problem than the second; it just has a very easy solution, as you have a 100% probability event, which is always nice.
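
A small sketch of the die example in Python (the helper function is mine, just to show the calculation): one roll of a two settles the question outright, while a run of ones only ever yields a confidence level, never certainty.

def confidence_all_ones(n_rolls: int) -> float:
    """Confidence that the die is the all-ones die after n_rolls consecutive ones."""
    p_if_mixed = 0.5 ** n_rolls      # the mixed die shows a one on 3 of its 6 faces
    return 1.0 - p_if_mixed

for n in (1, 5, 10, 20):
    print(n, f"{confidence_all_ones(n):.6f}")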

Here is a relevant example for this thread using trivial probability theory - can MD refuse to replace or refund when one of his knives fails? Yes. And you only need a sample size of 1 to determine this (one time when he has refused).

Puncture_wound by the way, for the powers that be, is a friend of mine.

-Cliff


 
Puncture_wound, that was very funny. So what you are trying to say is that your family steak knife beat out the TUSK. Did the edge on the steak knife chip? Was Cliff able to bend it beyond 40 degrees without breaking it?

SteveH and the4th, hard chrome on guns has always been used for the wear resistance it provides. The number of chrome-lined barrels out there is countless. However, the problem with a lot of those hard-chromed guns was the way the chrome was attached: the gun was first nickel/silver plated, and then the hard chrome was plated onto that nickel/silver layer. Of course, this made the hard chrome only as good as the other plating. It would eventually chip out and rust would get underneath real quick. But the chrome could be removed. If it didn't chip, though, this process slowed rust very well, because the nickel/silver was a great barrier, though fragile compared to the following process:

Then a process called Hard Chrome Bonding came out. This method is probably the best HC method available. It actually bonds the hard chrome into the "pores" of the steel, which means that it does not change the tolerances of the steel much. In a gun with sliding parts, this is sometimes critical.

I have heard of people who had "stainless steel" guns in Florida getting this HC, and it worked well against corrosion, but only because the stainless steel is there and there are no open areas, like a cutting edge, for corrosion to get in. HC on a carbon steel gun would not be advisable for corrosion resistance.

What does this have to do with anything, I have no idea. Oh well.
 
Let's try our best not to turn this into a discussion of Mad Dog's personality. He didn't get that nickname for loving and being loved by everybody, and some folks like to deal with him, and some folks don't, and that's probably all that needs to be said about it.


------------------
- JKM
www.chaicutlery.com
AKTI Member # SA00001
 
In fact, you don't even need to do this if you just want to settle for a 95% confidence conclusion (your particular choice) - unless, of course, you think that a maker on average screws up 1 out of every 20 blades. So not only do I meet your 5% cutoff, I easily exceed it, and my interpretations will be much more robust. Of course, if you have some information that leads you to believe that there are makers screwing up 1 out of every 20 blades or more, please post it; I would assume such information would be appreciated by all.

This almost makes sense. And it would if you only had a pool of 20 to sample from. You are assuming a priori knowledge that you don't have about the variance of a particular population much greater than 20. And since you've extended this to all knife makers, you are actually assuming a priori knowledge about the variance of the population of knives of every custom knife maker. This borders on omniscience. You are not sampling from a pool of 20 knives, Cliff. You are sampling from a pool of X knives.

Cliff, here is what constitutes statistical validity: try publishing your results in a peer-reviewed scientific journal. That's where I publish my research, and it just doesn't get published with sample sizes of one or even two. In fact, if you know of a peer-reviewed journal that publishes experimental research with hypothesis testing where sample sizes of 1 or 2 are accepted, I would appreciate receiving some specific journal references. I would really be interested in how they do it. I've had papers rejected for sample sizes less than 15, and if there is a way around it, I'd love to know what it is. I have 14 years as a scientist and I've yet to figure out how to get around this. In fact, I've never even read a single paper where experimental results were tested statistically with a sample size of n=1 (and I have around 4,000 papers in my filing cabinets and over 50,000 abstracts on my hard drive). I can only think that I'm reading the wrong journals. Which ones do you publish in? In the kind of work I do, we don't publish experimental results without p values, and p values have to be calculated; as far as I know, there is no statistical test where you can calculate a p value with n=1. You can't even use resampling statistics on this one.
:)


Cliff, I'm not disputing that your conclusions are highly probable. What I am saying is that you have not quantified them in a statistically testable format that would pass peer review, and you can't do that without an estimate of the variance in the population. And by estimate, I don't mean guess; I mean a standard error of the mean or a standard deviation. I know you think you know what the variance is, but thinking and knowing is the difference between not publishing and publishing. And if you want to make the claim that your results are SCIENCE, Cliff, then you have to meet the scientific criteria of peer-reviewed journals. Otherwise, it's not scientifically acceptable, although it certainly can be qualitatively acceptable, even highly probable. But you can't state with any real rigor just what that probability is. If you think your data is in fact scientifically acceptable, I suggest you publish it. Let me know when the proofs come back. I'd love a copy.

Now in the particular case of MD, where you are talking about a hand-selected knife, no doubt you have severely minimized the variance. Does this increase the probability of your conclusions? Yes. But suppose it wasn't MD; suppose it was another maker, this maker has someone else do his heat treat, that person screwed it up, and the maker was unaware of this. How does that fit in with your scheme of omniscient knowledge about the variance in the populations of knives of all custom knife makers?


I wish I had such omniscient knowledge. BTW, the 95% confidence level was not my choice; it's the standard set by almost all peer-reviewed journals and has been since the '20s.

I don't think anyone here needs the kind of statistical rigor I've been talking about to make up their minds about the validity of your tests. Your tests speak for themselves. But if you want to claim them as science, then show me they can pass the kind of statistical examination that all scientific data has to pass if it is to be accepted by the scientific community. And I'm not saying your work is not rigorous or useful, but I am saying that claims of statistical validity need to be analyzed.

Rather than bore BF members with more unending statistical drivel, I suggest we continue this part of the discussion by email. I would be glad to continue discussing this as much as you want, but I suspect the rest of the natives are getting restless.


------------------
Hoodoo

The low, hoarse purr of the whirling stone—the light-press’d blade,
Diffusing, dropping, sideways-darting, in tiny showers of gold,
Sparkles from the wheel.

Walt Whitman

 
Hoodoo:

You are assuming apriori knowledge

Considering the interaction with the maker, the necessary fault level to bring the work outside the 5% criterion becomes much too extreme to be in any way sensible. But yes, even without this, I think that the variance is low enough that I would be confident stating that the p value would be at most 5% (in fact I would go much lower than that).

The only way my results would not pass at the 5% level is if blades were faulty at almost the 25% level, as currently I have a set value of 2 faults before I would demand a refund and write the blade off. It is totally unreasonable for me to think that makers are faulting blades 1 out of every 4.
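
One way to read the "almost the 25% level" figure, assuming the two faults are independent (my assumption, not stated in the thread): two hand-picked blades failing in a row only becomes a 5%-or-likelier event once the per-blade fault rate reaches about the square root of 0.05.

import math

alpha = 0.05
threshold = math.sqrt(alpha)    # fault rate p at which two faults in a row is exactly a 5% event
print(f"two consecutive faulty blades is a >= 5% event only if p >= {threshold:.1%}")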

if you know of a peer-reviewed journal that publishes experimental research with hypothesis testing where sample sizes of 1 or 2 are accepted, I would appreciate receiving some specific journal references.

This is a trivial question. I already answered it above, but I will repeat it. You only need a sample size of one when you are dealing with a probability of 100% for a specific event, i.e. zero variance (variance determines the sample size needed for your confidence level). You can use sample sizes of 2 without this; the t values for the 95% confidence intervals are in the book I referenced above.

People make judgements like I have all the time in scientific work. You don't have to calculate the exact probabilities, or even make any attempt to measure them, if you can estimate that they are *much* lower than your criteria. You make these estimates based on your experience, which you must bring into the work - as in the above example where you rule out the pig with 3 legs. There is no p value calculated, but this is statistics. You look back upon your experience and conclude that that particular pig should not be included, as it is an unnatural deviation.


[publish]

I would really be interested in how they do it.

Just make sure I would be one of the editors.

Which ones do you publish in?

For my current field of work - Can. J. Phys., JQSRT, J. Mol. Spect., Phys. Rev., Phys. Rev. Lett., Physica, etc. I expect this to change in a year or two, as there are some other fields I would like to explore and other people I would enjoy working with.

there is no statistical test where you can calculate a p value with n=1.

There are cases where no specific statistic needs to be calculated. This does not mean that it is not statistical work, but I think this is the definition you are using - and I am beginning to understand your perspective.


BTW, the 95% confidence was not my choice, it's the standard set by almost all peer reviewed journals

That should vary depending on what you want your Type I and Type II errors to be. Just because a lot of people ignore this and use 5% without consideration doesn't make it the best thing to do. Would you think that a 5% p value would be all right for a death sentence? Obviously not.
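
To illustrate the trade-off Cliff describes (toy numbers of my own choosing, using a one-sided normal approximation rather than any test from the thread): tightening alpha to guard against Type I errors inflates the Type II error for a fixed sample size and effect.

from scipy import stats

n, effect, sigma = 20, 0.5, 1.0              # assumed sample size, true effect, and spread
se = sigma / n ** 0.5
for alpha in (0.05, 0.01, 0.001):
    z_crit = stats.norm.ppf(1 - alpha)       # one-sided rejection threshold
    power = 1 - stats.norm.cdf(z_crit - effect / se)
    print(f"alpha={alpha:<6} Type II error={1 - power:.2f}")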

I read last year a very nice discussion in a medical journal where the author debated the level at which results should be evaluated. I can't remember exactly what level he argued for, but I think it was 99+%, based on branding non-effects being much less serious than ignoring harmful ones. I wish I had noted that article, as it would have been useful as a reference when teaching introductory analysis; it clearly illustrates that many criteria are very subjective.

-Cliff


 
Hoodoo, Cliff,

Please continue this here. I am only catching the fringes, but I am really loving this. It is so amazingly wonderful to see a couple of professionals agreeing in general but disagreeing about a very specific issue, and despite their disagreement, staying polite and letting the information flow.

And I am being serious.

------------------
Thank you,
Marion David Poff aka Eye, Cd'A ID, USA mdpoff@hotmail.com

My Talonite Resource Page, nearly exhaustive!!
My Fire Page, artificial flint and index of information.

"Many are blinded by name and reputation, few see the truth" Lao Tzu
 
got nothing to say.


Just had to have my name on one of the longest threads here in a long time
:)
 
MDP,
I'm impressed. I would have thought most people's eyes would have glazed over a long time ago (I'm sure there are many who have). Anyway...into the void...


You only need a sample size of one when you are dealing with a probability of 100% for a specific event, i.e. zero variance (variance determines the sample size needed for your confidence level). You can use sample sizes of 2 without this; the t values for the 95% confidence intervals are in the book I referenced above.

Yes, I checked. The d.f. goes down to 1. But did you look at the critical value? From 3 down to 1 degree of freedom, the critical value rises dramatically. Again, I have never seen a t test published with 1 degree of freedom. And with n=1 and degrees of freedom = n-1, you get 0 degrees of freedom, and hence no critical value, and hence no t test.
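
Hoodoo's critical-value point can be checked numerically (scipy assumed; two-tailed values shown), and the df = 1 column also shows why a 1% cutoff there would be enormous:

from scipy import stats

for df in (3, 2, 1):
    t95 = stats.t.ppf(0.975, df)    # two-tailed critical value at the 5% level
    t99 = stats.t.ppf(0.995, df)    # two-tailed critical value at the 1% level
    print(f"df={df}: t(0.05)={t95:.2f}, t(0.01)={t99:.2f}")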

People make judgements like I have all the time in scientific work. You don't have to calculate the exact probabilities, or even make any attempt to measure them, if you can estimate that they are *much* lower than your criteria. You make these estimates based on your experience, which you must bring into the work.

I partially agree with you. But conclusions based on data without statistical testing are usually couched in much more speculative terms. With statistical analysis, we can state mathematically what our confidence levels are. Without statistical analysis, we can only speculate.

For my current field of work - Can. J. Phys., JQSRT, J. Mol. Spect, Phys. Rev., Phys. Rev. Lett., Physica., etc. . I expect this to change in a year or two as there are some other fields I would like to explore and other people I would enjoy working with.

In these papers, are you the primary author? I'd be interested in reading some. If you would email me a reference or two that you think might be applicable to this discussion in terms of analysis, I would be interested in reading them. I spent three summers working at the Michigan State Cyclotron when I was a grad student, where I was involved in building low-pressure multiwire proportional counters for the 4 Pi group there. So I have some familiarity with particle detectors and a little physics. It's actually kind of funny, because I remember having a chat about statistics with the head of the lab at that time and being somewhat stunned when he asked me what a t test was. Evidently, particle physicists mainly use the Poisson distribution and one other test, I think the Chi Square, but I can't remember.

There are cases where no specific statistic needs to be calculated. This does not mean that it is not statistical work, but I think this is the definition you are using - and I am beginning to understand your perspective.

Again, I agree--somewhat. There are many scientific claims that can be made and supported by the data without statistical testing. And the history of science is littered with ideas that were once accepted but are now rejected. We once thought phlogiston was released when we burned a candle in a bell jar. Great idea, and it seemed plausible because of its explanatory power. So many ideas are accepted simply because they have great predictive and explanatory power. But phlogiston died when we discovered oxygen.

But scientists still have to tread lightly here. String theory, for instance, has great explanatory power but where is the data?

During PhD examinations, the question I hear most frequently is: why didn't you do the experiment? I can understand why you didn't increase your sample size. You had confidence in your testing and your results. But recall that you extend this to all makers. Now all makers are branded as incompetent if one of their knives fails your test? For me, this goes beyond common sense and seems extreme. I just don't have a lot of confidence in this notion.

For me, "statistical work" is statistical testing. It's more than just presenting information and thinking speculatively about it. This is the foundation of modern science. Disproving hypotheses. In the past 50 years in the biological sciences (my main field), observational data have been marginalized and the emphasis is on experimentation and backed up by statistical testing (unfortunately, my opinion is that observaional data has been too marginalized but that's another story).

As the great philosopher of science Karl Popper would say, we don't prove things in science, we disprove things. That's why we accept things in terms of probability. We talk of things as being highly probable. And we use statistical analysis to back up our claims. Linus Pauling could speculate all he wanted about vitamin C curing the common cold. All his experience and knowledge told him it was so. Some people "believed" him because he was a Nobel laureate. But I'm one of those scientists (as are most of the scientists I know personally and have read and studied in my field) who are pig-headed and say: show me the data and the statistical analysis. When scientists move into the handwaving arena, that's great, because they can point the way to new ideas, but the data and grunt statistical analysis have to flow from there. Sure, we accept many ideas in science that have not been tested statistically. But we usually (and should) have less confidence in them. When we test our data statistically, we can state clearly and unambiguously what the confidence level is. That's the p value.

That should vary depending on what you want your Type I and Type II errors to be. Just because a lot of people ignore this and use 5% without consideration doesn't make it the best thing to do. Would you think that a 5% p value would be all right for a death sentence? Obviously not. I read last year a very nice discussion in a medical journal where the author debated the level at which results should be evaluated. I can't remember exactly what level he argued for, but I think it was 99+%, based on branding non-effects being much less serious than ignoring harmful ones. I wish I had noted that article, as it would have been useful as a reference when teaching introductory analysis; it clearly illustrates that many criteria are very subjective.

Yes, this is a great issue. But I don't think we are talking about life-and-death issues unless you are talking about ruining some knifemaker's life [which could be the result of a Type I error]. Then perhaps p < 0.01 or p < 0.001, which would make your position far less tenable and guard against the Type I error. I'd love to look up the critical value for p < 0.01 and degrees of freedom = 1. It surely must be astronomical.

Much of my research is small-sample work, and I usually work with n between 15 and 30 in an ANOVA design; p is often much less than 0.001 when results are significant (but I'm perfectly happy with p < 0.05, and in some instances would argue that p < 0.1 should be considered). But I know my confidence level because I can calculate it. I could speculate about it, and I might be right because I have experience in my field, but if I can do the required experiment with the proper n and don't, and just rely on my "hunches," I would never get my work published and I would never get funded.
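
As a minimal sketch of the kind of small-sample ANOVA Hoodoo describes (the group sizes, means, and spreads below are invented, not his data), the p value comes out of a calculation rather than a hunch:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(10.0, 2.0, size=20)     # simulated measurements, n = 20 per group
group_b = rng.normal(12.0, 2.0, size=20)
group_c = rng.normal(10.5, 2.0, size=20)

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")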

But I don't think publishing in a scientific journal and getting funded are at issue here. I've already stated that I find your results compelling. But I'm a bullheaded skeptic and I can't think of anything that I accept with the 100% confidence that you are espousing based on n=1 or n=2.

So I will again say that I think we will have to agree to disagree and let it go at that.

BTW, I grew up on a pig farm and I will attest that it is highly probable that all our pigs had 4 legs.
:)


------------------
Hoodoo

The low, hoarse purr of the whirling stone—the light-press’d blade,
Diffusing, dropping, sideways-darting, in tiny showers of gold,
Sparkles from the wheel.

Walt Whitman

 
Spark made a good point. I think MD is just squattin' around waiting for someone to fork up the 9K for a *generous* testing sample.

He obviously doesn't give a crap about his customers and he doesn't want to improve his products.

He IS running himself into a corner, if he's not there already, and I can see his diehard fans shoving each other aside as they run along behind him. What happens if MD stops real quick?
:D


------------------
You could put nacho cheese sauce on it...
 